Learning more about ICOs

In this blog post, I would like to introduce the project we did at the 5th Unhackathon organized by Data Science Hong Kong. Our team’s project was to look at Initial Coin Offering (ICO) data extracted from ICObench to determine which ICOs are scams. Given that we only had a few hours to work on it at the Unhackathon, we focused on data wrangling and visualization to learn more about ICOs. After the Unhackathon, I spent some time conducting a simple analysis of a few additional features. As the data was insufficient to determine which ICOs are scams, I played around with it to see whether there are any patterns in what makes an ICO profitable.

To summarize, I found that –

  • There are a lot of outliers in terms of return on investment. It is hard to predict which ICOs will be profitable, at least based on the data we got.
  • ICObench provides a rating for each ICO, which is calculated from ICObench’s algorithm and ratings from “Experts” (certain groups of ICObench users). These two metrics follow rather different rating standards, and it seems that more weight is given to ICObench’s algorithm in determining the overall rating.
  • Based on our data, returns on investment do not seem to be strongly related to ratings. This finding is also affected by the huge outliers.

Let’s dive in and take a closer look at ICOs based on our data. I’ll start by cleaning the data and then analyze it using visualizations.

1. Importing libraries and reading the data
import pandas as pd
from datetime import datetime
from math import log

import matplotlib.pyplot as plt
import matplotlib.pylab as pylab
pylab.rcParams['figure.figsize'] = 20, 10
%matplotlib inline

import seaborn as sns
sns.set(style="whitegrid", color_codes=True)

df = pd.read_json('icobench_ico_profile_jsonarray.json')

The data set includes information on 1675 ICOs. It is a snapshot taken on 3 Feb 2018, the day the data was extracted from ICObench, rather than a time series. Here are the features we will focus on in this blog post:

  • categories – the business category of the coin. A coin can be classified under multiple categories.
  • country – country of the ICO.
  • exchanges – the exchanges the coin is listed on and the return on investment (ROI) of the coin on each exchange. A coin can be listed on multiple exchanges and hence have multiple ROI values. Note that the ROI is the total-period return between the end date of the ICO and 3 Feb 2018, the day the data was extracted from ICObench.
  • dates – key dates of the ICO, such as its start and end dates.
  • rating – a combination of ICObench’s rating algorithm and ratings given by independent “Experts”. ICObench users can apply to be an ICObench Expert and ICObench will review and approve the application according to its own criteria.
  • ratingProfile – ratings given by ICObench’s rating algorithm
  • ratingProduct – ratings given by Experts on the product offered by the ICO
  • ratingTeam – ratings given by Experts on the ICO team
  • ratingVision – ratings given by Experts on vision of the ICO company
df.head()
_id about categories country dates exchanges finance id intro links ratingProfile ratingTeam ratingVision ratings registration restrictions tagline team teamIntro url
0 {u'$oid': u'5a72d64ecdb0be4760d44a2e'} VIBERATE TOKEN CROWDSALE\r\nV… [{u'id': 9, u'name': u'Entertainment'}] Slovenia {u'icoEnd': u'2017-09-05 11:04:42', u'preIcoEn… [{u'roi': u'259.26%', u'name': u'HitBTC', u'pr… {u'raised': 10714285, u'tokentype': u'', u'bon… 73 Viberate is a crowdsourced live music ecosyste… {u'reddit': u'', u'medium': u'https://medium.c… 4.8 4.2 4 [{u'profile': 0, u'product': 5, u'name': u'Chr… unknown [] Token for the live music industry [{u'group': u'Advisors and ambassadors', u'nam… https://icobench.com/ico/viberate
1 {u'$oid': u'5a72e3afcdb0be4760d44eb0'} Changing the Face of Crowdfunding</str… [{u'id': 17, u'name': u'Platform'}, {u'id': 7,… Gibraltar {u'icoEnd': u'2018-02-19 00:00:00', u'preIcoEn… [] {u'raised': 0, u'tokentype': u'ERC20', u'bonus… 1692 We’re helping fulfil crowdfunding’s promise of… {u'reddit': u'https://www.reddit.com/r/TheAcor… 5 4.5 4.1 [{u'profile': 0, u'product': 3, u'name': u'Tar… kyc and whitelist [{u'country': u'USA'}, {u'country': u'China'}] Let’s make crowdfunding free [{u'group': u'', u'name': u'Moritz Kurtz', u'l… https://icobench.com/ico/acorn-collective
2 {u'$oid': u'5a72e3afcdb0be4760d44eb2'} NapoleonX is the first 100% algorithmic crypto… [{u'id': 17, u'name': u'Platform'}] France {u'icoEnd': u'2018-02-28 00:00:00', u'preIcoEn… [] {u'raised': 0, u'tokentype': u'ERC20', u'bonus… 54 NaPoleonX is the first 100% algorithmic crypto… {u'reddit': u'https://www.reddit.com/user/Proj… 5 4 3.6 [{u'profile': 0, u'product': 2, u'name': u'Joa… unknown [] The first 100% algorithmic crypto asset manager [{u'group': u'Advisors', u'name': u'Jérome de … https://icobench.com/ico/napoleon-x
3 {u'$oid': u'5a72e779cdb0be4760d459a3'} True Flip is a blockchain lottery platform. We… [{u'id': 6, u'name': u'Casino & Gambling'}] Costa Rica {u'icoEnd': u'2017-07-29 11:00:00', u'preIcoEn… [{u'roi': u'-47.03%', u'name': u'EtherDelta', … {u'raised': 8244292, u'tokentype': u'', u'bonu… 1 The Token Sale will last for 30 days (or until… {u'reddit': u'http://reddit.com/r/TrueFlip', u… 4.1 2.8 2.8 [{u'profile': 0, u'product': 3, u'name': u'Joa… unknown [] Wordwide Blockchain Lottery [{u'group': u'Advisory', u'name': u'Eric Benz'… TrueFlip advisors and management team cosis… https://icobench.com/ico/true-flip
4 {u'$oid': u'5a72e779cdb0be4760d459a5'} SunContract’s business model joins together th… [{u'id': 2, u'name': u'Energy'}, {u'id': 17, u… Slovenia {u'icoEnd': u'2017-08-01 19:00:00', u'preIcoEn… [{u'roi': u'1,435.26%', u'name': u'HitBTC', u'… {u'raised': 2000000, u'tokentype': u'', u'bonu… 2 SunContract is an energy trading platform that… {u'reddit': u'https://www.reddit.com/r/suncont… 4.5 3.1 2.6 [{u'profile': 0, u'product': 4, u'name': u'Chr… unknown [] Decentralized energy market platform [{u'group': u'Advisors', u'name': u'Jonathan G… https://icobench.com/ico/suncontract

5 rows × 26 columns

As you can see, some of the features we are interested in, namely categories, dates and exchanges, are in dictionaries / lists of dictionaries. Let’s extract these features one-by-one.
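For a nested feature such as “dates”, here is a minimal sketch of how the ICO end date could be pulled out and parsed in one pass. It assumes each entry is a dictionary holding an 'icoEnd' string in the format shown in the sample rows; the toy DataFrame and the column name ico_end are made up for illustration and are not part of the original notebook.

```python
import pandas as pd

# toy stand-in for the real data; only the 'dates' column matters here
toy = pd.DataFrame({
    'dates': [
        {'icoEnd': '2017-09-05 11:04:42'},
        {'icoEnd': '2018-02-19 00:00:00'},
        {},                            # end date missing
        {'icoEnd': 'not a date'},      # malformed value
    ]
})

# pull out the raw string (None when the key is absent) ...
raw_end = toy['dates'].apply(lambda d: d.get('icoEnd'))
# ... and parse it; anything that cannot be parsed becomes NaT instead of raising
toy['ico_end'] = pd.to_datetime(raw_end, format='%Y-%m-%d %H:%M:%S', errors='coerce')
```

Rows with a missing or malformed end date end up as NaT, which makes them easy to filter out later.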

2. Extract Categories

Here’s an example of the format of the ‘categories’ feature:

df.loc[1]['categories']
[{u'id': 17, u'name': u'Platform'},
 {u'id': 7, u'name': u'Business services'},
 {u'id': 19, u'name': u'Investment'}]
Since there are lots of different categories and some only have very few ICOs under them, let’s consolidate the top 10 categories with the most ICOs according to ICObench and the rest will be classified as “Others”.
# create a list of top 10 categories
cat_list = ['Platform', 'Cryptocurrency', 'Business services', 'Investment', 'Software', 'Entertainment', 
            'Internet', 'Banking', 'Infrastructure', 'Communication']
# extract categories as a list
def get_categories(column):
    # start from an empty list so ICOs without any category still return a value
    coin_categories = []
    for item in column:
        if item['name'] in cat_list:
            coin_categories.append(item['name'])
        else:
            coin_categories.append('Others')
    return coin_categories

df['coin_categories'] = df['categories'].apply(get_categories)
df['coin_categories'].head()
Voila! The categories of each ICO are extracted as a list under the new feature “coin_categories”.
0                              [Entertainment]
1    [Platform, Business services, Investment]
2                                   [Platform]
3                                     [Others]
4                           [Others, Platform]
Name: coin_categories, dtype: object
Then, we convert coin_categories into dummy variables for mathematical analysis.
# add the new category, "Others", into the list of categories.
cat_list.append('Others')
# Convert coin_categories to dummy variables
def dummy_categories(column, category):
    # 1 if the ICO belongs to the category, 0 otherwise (including ICOs with no category)
    return int(category in column['coin_categories'])

for cat in cat_list:
    df[cat] = df.apply(lambda x: dummy_categories(x, cat), axis = 1)

Now we have extracted the ICO categories and converted them into dummy variables using one-hot encoding. Here’s what our data looks like now:

df.iloc[0:3, 27:39]
Platform Cryptocurrency Business services Investment Software Entertainment Internet Banking Infrastructure Communication Others
0 0 0 0 0 0 1 0 0 0 0 0
1 1 0 1 1 0 0 0 0 0 0 0
2 1 0 0 0 0 0 0 0 0 0 0
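As an aside, the same one-hot encoding can probably be done without a row-by-row apply. The sketch below is an alternative I did not use in the notebook; it assumes a pandas version that has Series.explode, and the toy DataFrame is made up for illustration.

```python
import pandas as pd

toy = pd.DataFrame({
    'coin_categories': [
        ['Entertainment'],
        ['Platform', 'Business services', 'Investment'],
        ['Others', 'Platform'],
        [],                       # an ICO with no category at all
    ]
})

# one row per (ICO, category) pair; an empty list becomes a single NaN row
exploded = toy['coin_categories'].explode()
# one-hot encode each pair (NaN is encoded as all zeros), then collapse back to one row per ICO
dummies = pd.get_dummies(exploded).groupby(level=0).max().astype(int)
```

The result has one column per category that appears in the data, so a category with no ICOs would be missing rather than all zeros.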
3. Extract Countries

First, let’s look at the unique values under “country”.

df['country'].unique()
array([u'Slovenia', u'Gibraltar', u'France', u'Costa Rica',
       u'South Africa', u'USA', u'China', u'Netherlands', u'Canada', u'UK',
       u'Switzerland', u'Singapore', u'Cayman Islands', u'Belgium',
       u'Malta', u'Poland', u'Russia', u'Panama', u'Israel', u'Italy',
       u'Latvia', u'Japan', u'Germany', u'Bulgaria', u'', u'Hong Kong',
       u'Estonia', u'Argentina', u'Marshall Islands',
       u'United Arab Emirates', u'Belize', u'Georgia', u'Brazil',
       u'Ukraine', u'Taiwan', u'India', u'Sweden', u'Malaysia',
       u'Seychelles', u'Australia', u'Austria', u'Lithuania', u'Finland',
       u'Nigeria', u'Kyrgyzstan', u'Spain', u'Philippines',
       u'Virgin Islands', u'Cyprus', u'Isle of Man', u'Kazakhstan',
       u'Czech Republic', u'Romania', u'Chile', u'Armenia', u'Thailand',
       u'Luxembourg', u'Turkey', u'South Korea', u'Cambodia', u'Belarus',
       u'Serbia', u'British Virgin Islands', u'Moldova', u'Monaco',
       u'Mexico', u'Colombia', u'Greece', u'Denmark', u'Indonesia',
       u'Dominican Republic', u'Ireland', u'Slovakia', u'Kenya',
       u'SIngapore', u'Sierra Leone', u'Vietnam', u'Liechtenstein',
       u'Portugal', u'Guinea-Bissau', u'Dubai', u'Mauritius', u'Norway',
       u'Afghanistan', u'Croatia', u'Saint Lucia', u'Andorra', u'Jersey',
       u'Vanuatu', u'Turks and Caicos Islands', u'Guatemala',
       u'New Zealand', u'UNITED STATE', u'Worldwide', u'London',
       u'Hungary', u'United States', u'Curacao', u'Slovenija',
       u'UK, Poland', u'Tanzania', u'Pakistan', u'The Netherlands',
       u'United Kingdom', u'Switzerland,'], dtype=object)
It appears that different names are used for the same country, for example –
  • USA, UNITED STATE and United States
  • Slovenia and Slovenija
  • UK, United Kingdom and London
  • Netherlands and The Netherlands
  • Switzerland and Switzerland, (an extra comma in the end)
  • Singapore and SIngapore

Besides, note that there’s a value “UK, Poland”, which names two countries. It will be converted to “Worldwide”, one of the existing values under this feature. Some ICOs also have an empty string for “country”.

df = df.replace(to_replace={'country': {'UNITED STATE': 'USA', 'United States': 'USA', 'Slovenija': 'Slovenia',
                                       'United Kingdom': 'UK', 'London': 'UK', 'The Netherlands': 'Netherlands', 
                                        'Switzerland,': 'Switzerland', 'SIngapore': 'Singapore', 'UK, Poland': 'Worldwide'}})
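Variants that differ only in case, surrounding whitespace or a trailing comma could also be caught before the lookup. The following is a minimal sketch under that assumption; the aliases table is hypothetical and only covers a few of the names above.

```python
import pandas as pd

countries = pd.Series(['USA', 'UNITED STATE', 'Switzerland,', 'SIngapore', ' Singapore', ''])

# hypothetical alias table, keyed by the lower-cased, trimmed spelling
aliases = {'united state': 'USA', 'united states': 'USA', 'usa': 'USA',
           'singapore': 'Singapore', 'switzerland': 'Switzerland'}

# trim whitespace and stray commas, lower-case, then look up the canonical name;
# names without an alias keep their trimmed original spelling
trimmed = countries.str.strip(' ,')
cleaned = trimmed.str.lower().map(aliases).fillna(trimmed)
```

Genuinely different spellings such as “Slovenija” or “London” would still need an explicit entry, as in the replace call above.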
Since there are lots of different countries, let’s also consolidate them by only extracting the countries with the most ICOs and classifying the rest as “Other Countries”.
country_count = df.groupby('country').count()[['id']]

country_count.sort_values(by = ['id'], ascending = False).head(15)
              id
country
USA          290
Russia       214
             154
UK           137
Singapore    100
Switzerland   65
Estonia       54
Hong Kong     40
Canada        37
Germany       35
Slovenia      26
Australia     25
Gibraltar     23
Netherlands   19
Ukraine       18
# extract the countries with at least 100 ICOs
major_country = ['USA', 'Russia', 'UK', 'Singapore']
# classify the rest as "Other Countries"
def group_countries(column):
    if len(column) > 0:
        if column not in major_country:
            country_group = 'Other Countries'
        else:
            country_group = column
    else:
        country_group = "N/A"
    return country_group

df['country'] = df['country'].apply(group_countries)
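The list of major countries above was typed in by hand after reading the counts. A sketch of how the same grouping might be derived from a threshold instead is shown below; min_icos and the toy Series are illustrative assumptions, not values from the notebook.

```python
import pandas as pd

country = pd.Series(['USA', 'USA', 'USA', 'Russia', 'Russia', 'Malta', ''])
min_icos = 2   # illustrative threshold; the post effectively uses 100

# number of ICOs recorded for each row's country
counts = country.map(country.value_counts())
# keep the country name only when it has enough ICOs
grouped = country.where(counts >= min_icos, 'Other Countries')
# a blank country stays a separate group, whatever its count
grouped = grouped.where(country != '', 'N/A')
```

The last step matters on the real data, where the blank country has 154 ICOs and would otherwise pass the threshold as a "country" of its own.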
3. Calculate Average ROI & Annualized Average ROI

As some of the coins are listed on multiple exchanges, an average ROI is calculated for each coin. Besides, since the ROI period differs from coin to coin (a different icoEnd date means a different duration between the icoEnd date and the data extraction date), the ROI is annualized for a fair comparison.
(**Note that the following functions were written by my team member at the Unhackathon. I’ve only made a few textual amendments to them for this blog post.)

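# calculate the average ROI
def calc_roi_avg(column):
    # coins that are not listed on any exchange get no ROI (None)
    if len(column) > 0:
        roi_total = 0
        for item in column: 
            if item['roi']:
                # e.g. '1,435.26%' -> 14.3526
                roi_total = roi_total + float(item['roi'].strip('%').replace(",", "")) / 100
        # note: exchanges with an empty ROI still count in the denominator
        roi_avg = roi_total / len(column)
        return roi_avg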
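# annualize the average ROI
def calc_ret_anul(column):
    end_date = datetime.strptime('2018-02-03 00:00:00', '%Y-%m-%d %H:%M:%S')
    # default to no return so that ICOs without an 'icoEnd' date do not raise an error
    return_anul = None
    if 'icoEnd' in column['dates']:
        try:
            # convert to datetime format
            ico_end_date = datetime.strptime(column['dates']['icoEnd'], '%Y-%m-%d %H:%M:%S')
            date_diff_days = (end_date - ico_end_date).days
            # only annualize if the ICO ended before the extraction date (ico_end_date < end_date)
            if date_diff_days > 0:
                # 365.0 forces float division, so the exponent is not truncated on Python 2
                return_anul = pow((1 + float(column['roi_avg'])), 365.0 / date_diff_days) - 1
        except (TypeError, ValueError, OverflowError):
            # missing ROI, unparseable end date or a return too large to represent
            return_anul = None
    return return_anul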
df['roi_avg'] = df['exchanges'].apply(calc_roi_avg)
df['roi_avg_annu'] = df.apply(calc_ret_anul, axis=1) 
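To make the annualization concrete, here is a small worked sketch of the formula used above, (1 + ROI)^(365 / days) - 1. The helper names parse_roi and annualize are mine and are not part of the notebook; the ROI string is the SunContract value from the sample rows.

```python
def parse_roi(text):
    """Turn a string such as '1,435.26%' into a fraction (14.3526)."""
    return float(text.strip('%').replace(',', '')) / 100

def annualize(total_return, days_held):
    """Compound a total-period return into a 365-day return."""
    return (1 + total_return) ** (365.0 / days_held) - 1

roi = parse_roi('1,435.26%')
# held for exactly one year: the annualized return equals the total return
one_year = annualize(roi, 365)
# the same return earned in half a year compounds to (1 + roi)**2 - 1
half_year = annualize(roi, 182.5)
```

This also hints at where the extreme values come from: a large gain over a short listing period is compounded many times over when it is stretched to a full year.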
4. Exploratory Data Analysis

After we have extracted the features, we can now explore the data to learn more about ICOs! Let’s see if it gives us insights on investing in profitable ICOs.

# extract annualized average ROI, country, ratings and categories
# .copy() makes df2 independent of df, so adding columns to it later does not trigger a SettingWithCopyWarning
df2 = df[['id', 'name','roi_avg_annu','country', 'rating', 'ratingProduct', 'ratingProfile', 'ratingTeam', 'ratingVision',
          'Platform', 'Cryptocurrency', 'Business services', 'Investment', 'Software', 'Entertainment','Internet',
          'Banking', 'Infrastructure', 'Communication', 'Others']].copy()
df2.describe(include = 'all')
id name roi_avg_annu country rating ratingProduct ratingProfile ratingTeam ratingVision Platform Cryptocurrency Business services Investment Software Entertainment Internet Banking Infrastructure Communication Others
count 1675.000000 1675 3.510000e+02 1675 1675.000000 1675.000000 1675.000000 1675.000000 1675.000000 1675.000000 1675.000000 1675.000000 1675.000000 1675.000000 1675.000000 1675.000000 1675.000000 1675.000000 1675.000000 1675.000000
unique NaN 1671 NaN 6 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
top NaN Hero NaN Other Countries NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
freq NaN 3 NaN 780 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
mean 885.047761 NaN 2.029392e+19 NaN 3.066209 1.796060 3.199821 1.936657 1.965313 0.508060 0.349851 0.236418 0.167164 0.128955 0.109851 0.091940 0.085373 0.068657 0.058507 0.358806
std 519.703534 NaN 3.802065e+20 NaN 0.902210 1.701323 1.067392 1.814269 1.827942 0.500084 0.477065 0.425009 0.373234 0.335250 0.312797 0.289028 0.279520 0.252945 0.234771 0.479793
min 1.000000 NaN -1.000000e+00 NaN 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
25% 433.500000 NaN 0.000000e+00 NaN 2.400000 0.000000 2.400000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
50% 879.000000 NaN 0.000000e+00 NaN 3.300000 2.000000 3.300000 2.000000 2.200000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
75% 1335.500000 NaN 8.960773e-01 NaN 3.800000 3.300000 4.100000 3.700000 3.700000 1.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 1.000000
max 1845.000000 NaN 7.123166e+21 NaN 4.900000 5.000000 5.000000 5.000000 5.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000
Observations:
  1. There are lots of missing values for annualized average ROI (roi_avg_annu). After further examining the data, I found that the values are missing due to the following reasons:
    • ICO has not ended yet
    • ICO end date is unknown
    • Although the ICO has ended, the coin was not listed on any exchanges. (Could this be a flag that this ICO is a scam?)
  2. The distribution of annualized average ROI is extremely right-skewed. At least half of the values are zero or negative (min: -100%, 50th percentile: 0%), while the top performers have significantly higher returns: the 75th percentile reaches 90%, and the top performer’s return is 7*10^23%! (WHAT??!!) We’ll take a look at these top performers later in this blog post.
  3. The rating standards of ICObench’s algorithm and the “Experts” seem to be rather different. The mean of ratingProfile, i.e. the rating calculated by the algorithm (3.2), is higher than the means of the ratings given by the Experts (1.8-2.0). “rating”, the combined metric from the algorithm and the Experts, has a mean (3.1) close to that of ratingProfile. It appears that the rating assigned by the algorithm carries much more weight than the Expert ratings.
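One way to check the weighting in observation 3 more directly would be to fit the overall rating as a weighted sum of the algorithm and Expert ratings. The sketch below runs on synthetic data with made-up weights (0.8 and 0.2), since ICObench’s actual formula is not in the data set; it only shows that a least-squares fit can recover such weights if the relationship is roughly linear.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
profile = rng.uniform(0, 5, n)   # stand-in for ratingProfile
expert = rng.uniform(0, 5, n)    # stand-in for the mean of the Expert ratings

# synthetic overall rating built from made-up weights, plus a little noise
overall = 0.8 * profile + 0.2 * expert + rng.normal(0, 0.05, n)

# least-squares fit of overall ~ w_profile * profile + w_expert * expert
X = np.column_stack([profile, expert])
weights, *_ = np.linalg.lstsq(X, overall, rcond=None)
w_profile, w_expert = weights
```

On the real data, X would hold ratingProfile and an average of ratingProduct, ratingTeam and ratingVision, and the fitted weights would only be a rough estimate.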
4.1 Outliers – Which are the top performers?
df2.sort_values(by = ['roi_avg_annu'], ascending = False).head(10)
id name roi_avg_annu country rating ratingProduct ratingProfile ratingTeam ratingVision Platform Cryptocurrency Business services Investment Software Entertainment Internet Banking Infrastructure Communication Others
1010 1062 Emiratecoin 7.123166e+21 Other Countries 0.8 0.0 0.8 0.0 0.0 1 1 1 1 1 0 1 0 0 0 0
248 256 Circles 4.763307e+08 N/A 3.4 3.0 3.5 3.0 3.0 1 0 0 0 0 1 0 0 0 0 1
664 692 Bank Of Memories 3.283118e+08 Russia 3.3 2.0 4.0 2.5 2.5 0 1 0 0 0 0 0 0 0 1 0
681 710 AppCoins 4.810249e+07 Singapore 3.9 3.3 5.0 3.9 3.8 1 0 1 0 1 0 0 0 0 0 0
513 531 NEO 1.799668e+07 Other Countries 4.2 5.0 3.6 5.0 5.0 1 1 0 0 0 0 0 0 0 0 0
864 905 tokens.net 1.289177e+05 UK 3.1 1.9 3.4 3.6 3.1 1 1 1 0 0 0 0 0 0 0 0
1393 1486 Nebulas 1.126542e+05 USA 4.6 0.0 4.6 0.0 0.0 1 1 0 0 0 0 0 0 0 0 0
573 592 Genesis Vision 3.373078e+04 Russia 3.8 3.2 4.9 3.5 3.5 1 1 1 1 0 0 0 1 0 0 0
287 295 Cindicator 8.086074e+03 Russia 4.0 3.5 4.8 3.7 3.6 0 1 0 1 0 0 0 0 0 0 0
211 216 Neblio 5.947142e+03 USA 3.3 0.0 3.3 0.0 0.0 0 1 1 0 0 0 0 0 0 0 0

Interesting. The top performer, Emiratecoin, in fact received very low ratings. Looks like we shouldn’t solely rely on ICObench ratings to decide our ICO investments.
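With outliers this large, a rank-based correlation is probably a fairer way to relate ratings to returns than the usual Pearson coefficient. Here is a minimal sketch using six rounded rows from the table above; it computes Spearman’s coefficient as Pearson on ranks, and six rows are far too few to draw any conclusion from.

```python
import pandas as pd

# rating and annualized ROI of the top six performers, rounded from the table above
toy = pd.DataFrame({
    'rating':       [0.8, 3.4, 3.3, 3.9, 4.2, 3.1],
    'roi_avg_annu': [7.1e21, 4.8e8, 3.3e8, 4.8e7, 1.8e7, 1.3e5],
})

# Pearson is dominated by the single huge value ...
pearson = toy['rating'].corr(toy['roi_avg_annu'])
# ... whereas Spearman is Pearson on ranks, so an outlier counts only by its position
spearman = toy['rating'].rank().corr(toy['roi_avg_annu'].rank())
```

Running the same two lines on the full df2, after dropping missing ROI values, would give a more meaningful comparison.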

4.2 ICOs by country
sns.countplot(x="country", data=df2, order = df2['country'].value_counts().index);
 ICO1

The USA has the most ICOs, followed by Russia. In fact, the geographical distribution of ICOs is rather dispersed: almost half of them (780 out of 1675) are from countries that have fewer than 100 ICOs each. A truly international investment opportunity!
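The shares behind this chart can be worked out from the counts reported earlier in this post. A small sketch, with the counts typed in by hand rather than read from df2:

```python
import pandas as pd

# ICO counts per country group, as reported earlier in this post
counts = pd.Series({'Other Countries': 780, 'USA': 290, 'Russia': 214,
                    'N/A': 154, 'UK': 137, 'Singapore': 100})

# percentage of all ICOs in each group
share = (counts / counts.sum() * 100).round(1)
```

On the real data, df2['country'].value_counts(normalize=True) should give the same proportions directly.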

4.3 ICOs by categories
As we have converted the categories into dummy variables, summing each column gives the number of ICOs that belong to each category (since ICOs belonging to a category were encoded as “1”). This graph shows that over 50% of the ICOs are platforms.
cat_sum = df2[['Platform', 'Cryptocurrency', 'Business services', 'Investment', 'Software', 'Entertainment','Internet', 
    'Banking', 'Infrastructure', 'Communication', 'Others']].sum()

plt.subplots(figsize = (20, 8))
sns.barplot(x=cat_sum.index, y=cat_sum.values);
 ICO2
4.3.1 ICO ratings by categories
# obtain the mean by categories for each type of ratings
category_rating = []
product_rating = []
profile_rating = []
team_rating = []
vision_rating = []
for cat in cat_list:
    # keep only the ICOs that belong to this category (dummy variable == 1)
    in_cat = df2[df2[cat] == 1]
    category_rating.append(in_cat['rating'].mean())
    product_rating.append(in_cat['ratingProduct'].mean())
    profile_rating.append(in_cat['ratingProfile'].mean())
    team_rating.append(in_cat['ratingTeam'].mean())
    vision_rating.append(in_cat['ratingVision'].mean())


# plot bar graphs
fig, ((ax1), (ax2), (ax3), (ax4), (ax5)) = plt.subplots(5, 1, figsize = [15,13])
plt.tight_layout(pad=2, h_pad=None, w_pad=None, rect=None)

sns.barplot(x=cat_list, y=category_rating, ax = ax1).set_title(
    "Rating by categories (Combination of ICOBench Algo and Expert)");
sns.barplot(x=cat_list, y=product_rating, ax = ax2).set_title("Rating by Experts on Product");
sns.barplot(x=cat_list, y=team_rating, ax = ax3).set_title("Rating by Experts on the Team");
sns.barplot(x=cat_list, y=vision_rating, ax = ax4).set_title("Rating by Experts on the Vision");
sns.barplot(x=cat_list, y=profile_rating, ax = ax5).set_title("Rating by ICOBench's Algorithm");
ICO3

Previously, we noticed that ICObench’s algorithm gave higher ratings than the Experts in general. From these graphs, we observe that the ratings given by ICObench’s algorithm vary less across categories than those given by the Experts. Looking at the “Investment” category, one of the lowest-rated categories, we observe that the Experts have given it much lower ratings than other categories on product, team and vision alike, while ICObench’s algorithm only gave it a slightly lower rating than the other categories.

It would be interesting to find out what is causing this difference. Is it because Experts, or people in general, are more skeptical about certain types of ICO and hence gave those a lower rating? Or does ICObench’s algorithm need to be fine-tuned in order to better distinguish different categories when assigning a rating? These graphs also show that the combined overall rating is affected more by ICObench’s algorithm. If the Experts’ ratings are in fact more accurate in predicting an ICO’s ROI, higher weights should be assigned to them when calculating the combined overall rating.
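The “less variation across categories” observation could be put into numbers by comparing the spread of the per-category means. A minimal sketch, using made-up means for four categories purely for illustration:

```python
import pandas as pd

# made-up mean ratings per category, for illustration only
means = pd.DataFrame({
    'ratingProfile': [3.3, 3.2, 3.1, 3.2],
    'ratingProduct': [2.1, 1.9, 1.2, 2.0],
}, index=['Platform', 'Software', 'Investment', 'Internet'])

# standard deviation of the category means, one value per rating type
spread = means.std()
# gap between the best and worst rated category, one value per rating type
value_range = means.max() - means.min()
```

With the real lists computed above (profile_rating, product_rating and so on), a noticeably smaller spread for the algorithm’s rating would back up what the graphs suggest.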

4.3.2 ICO ROI by categories
Last but not least, let’s take a look at the ROIs of ICOs.
plt.hist(df2['roi_avg_annu'].dropna(), bins = 10)
plt.show()
 ICO4
Whoa, as we observed from the .describe() output, the annualized average ROI is extremely right-skewed. The log function can be used to smooth it out a little bit. Then, we will plot a bar graph to study the ROI of different categories.
min_roi = df2['roi_avg_annu'].min()
# apply lambda function to convert all roi_avg_annu to positive values for applying log function
df2['log_roi'] = df2['roi_avg_annu'].apply(lambda x: log(x - min_roi + 1))
plt.hist(df2['log_roi'].dropna())
plt.show()
ICO5
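For reference, here is a small sketch of what the shifted log above does to a handful of values, next to a signed log as a possible alternative. The sample values are loosely based on the statistics in this post, and the signed variant is my own suggestion rather than something used in the notebook.

```python
import numpy as np

# annualized ROI as fractions: a total loss, break-even, +90%, and two outliers
roi = np.array([-1.0, 0.0, 0.9, 4.8e8, 7.1e21])

# shift so that the minimum maps to log(1) = 0, as in the code above
shifted_log = np.log(roi - roi.min() + 1)

# alternative: signed log1p keeps zero at zero and losses negative
signed_log = np.sign(roi) * np.log1p(np.abs(roi))
```

Both transforms preserve the ordering of the ICOs; the signed version has the advantage that it does not depend on the minimum of the data set.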
# plotting a bar graph of the mean log-transformed annualized average ROI of different categories
category_mean_roi_log = []

for cat in cat_list:
    # mean log ROI of the ICOs that belong to this category (dummy variable == 1)
    category_mean_roi_log.append(df2.loc[df2[cat] == 1, 'log_roi'].mean())
plt.subplots(figsize = (15,6))
sns.barplot(x = cat_list, y = category_mean_roi_log);
ICO6

It’s interesting to note that Internet, one of the categories that received the lowest ratings from the Experts, actually had the highest annualized average ROI. Seems like we can’t solely rely on ratings after all.

5. Conclusion

Predicting ICO returns and, even more so, ICO scams is complicated and requires a better understanding of each ICO’s business model, the overall ICO landscape and market sentiment. We couldn’t really figure out how to distinguish profitable ICOs from scams within the few hours at the Unhackathon using the limited amount of data we had. Here’s what we could do if we had more time:

  • look into other features available in the data set, for example soft and hard cap of the ICO, amount of funding raised, etc
  • obtain time series data of ICOs to study the patterns of price and returns
  • scrape data from ICO news websites, the founders’ LinkedIn profiles and the ICO’s white paper to learn more about the business model and the team instead of solely relying on ratings
