You may have seen my previous post that tried to predict bitcoin and ethereum prices with deep learning. To summarise, there was a lot of hype but it wasn’t very useful in practice (I’m referring to the model, of course). To improve the model, we have two options: carefully design a more intricate and sophisticated model (i.e. throw shit tons more layers in there) or identify more informative data sources that can be fed into the model. While it’s tempting to focus on the former, the garbage-in-garbage-out principle remains.

With that in mind, I created a new Python package called cryptory. Not to be confused with the obscure Go repo (damn you, mtamer) or that bitcoin scam (you try to come up with a crypto package name that isn’t associated with some scam), it integrates various packages and protocols so that you can get historical crypto (just daily… for now) and wider economic/social data in one place. Rather than making more crypto-based jokes, I should probably just explain the package.

As always, the full code for this post can be found on my GitHub account.

Installation

cryptory is available on PyPI and GitHub, so installing it is as easy as running pip install cryptory in your command line/shell.

It relies on pandas, numpy, BeautifulSoup and pytrends but, if necessary, these packages should be automatically installed alongside cryptory.

The next step is to load the package into the working environment. Specifically, we’ll import the Cryptory class.

# import package
from cryptory import Cryptory

Assuming that returned no errors, you’re now ready to start pulling some data. But before we do that, it’s worth mentioning that you can retrieve information about each method by running the help function.

help(Cryptory)
Help on class Cryptory in module cryptory.cryptory:

class Cryptory
 |  Methods defined here:
 |  
 |  __init__(self, from_date, to_date=None, ascending=False, fillgaps=True, timeout=10.0)
 |      Initialise cryptory class
 ...

We’ll now create our own cryptory object, which we’ll call my_cryptory. You need to define the start date of the data you want to retrieve, and there are also some optional arguments. For example, you can set the end date, which otherwise defaults to the current date (see help(Cryptory.__init__) for more information).

# initialise object
my_cryptory = Cryptory(from_date="2017-01-01")

Cryptocurrency Prices

We’ll start by getting some historical bitcoin prices (starting from 1st Jan 2017). cryptory has a few options for this type of data, which I will now demonstrate.

# get prices from coinmarketcap
my_cryptory.extract_coinmarketcap("bitcoin")
date open high low close volume market cap
0 2018-02-10 8720.08 9122.55 8295.47 8621.90 7780960000 146981000000
1 2018-02-09 8271.84 8736.98 7884.71 8736.98 6784820000 139412000000
2 2018-02-08 7637.86 8558.77 7637.86 8265.59 9346750000 128714000000
... ... ... ... ... ... ... ...
403 2017-01-03 1021.60 1044.08 1021.60 1043.84 185168000 16426600000
404 2017-01-02 998.62 1031.39 996.70 1021.75 222185000 16055100000
405 2017-01-01 963.66 1003.08 958.70 998.33 147775000 15491200000

# get prices from bitinfocharts
my_cryptory.extract_bitinfocharts("btc")
date btc_price
0 2018-02-10 8691.000
1 2018-02-09 8300.000
2 2018-02-08 8256.000
... ... ...
403 2017-01-03 1017.000
404 2017-01-02 1010.000
405 2017-01-01 970.988

Those cells illustrate how to pull bitcoin prices from coinmarketcap and bitinfocharts. The discrepancy in prices returned by each can be explained by their different approaches to calculating daily prices (e.g. bitinfocharts represents the average price across that day). For that reason, I wouldn’t recommend combining different price sources.
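To put a number on that discrepancy, here’s a minimal sketch that lines the two sources up by date. The values are copied from the three most recent rows of the outputs above; the frame names (cmc, bic) and the rel_diff column are just illustrative.

```python
import pandas as pd

# Values copied from the two outputs above (three most recent days).
cmc = pd.DataFrame({
    "date": pd.to_datetime(["2018-02-10", "2018-02-09", "2018-02-08"]),
    "close": [8621.90, 8736.98, 8265.59],
})
bic = pd.DataFrame({
    "date": pd.to_datetime(["2018-02-10", "2018-02-09", "2018-02-08"]),
    "btc_price": [8691.0, 8300.0, 8256.0],
})

# Line the two sources up by date and express the gap relative to coinmarketcap.
both = cmc.merge(bic, on="date", how="inner")
both["rel_diff"] = (both["btc_price"] - both["close"]) / both["close"]

# Largest absolute gap across the three days.
largest_gap = both["rel_diff"].abs().max()
print(both)
print(f"largest gap: {largest_gap:.1%}")
```

On these three days the gap peaks at roughly 5% (9th February), which is more than enough to distort a model if the sources were mixed.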

You can also pull non-price data with extract_bitinfocharts, e.g. transaction fees. See help(Cryptory.extract_bitinfocharts) for more information.

# average daily eth transaction fee
my_cryptory.extract_bitinfocharts("eth", metric='transactionfees')
date eth_transactionfees
0 2018-02-10 0.78300
1 2018-02-09 0.74000
2 2018-02-08 0.78300
... ... ...
403 2017-01-03 0.00773
404 2017-01-02 0.00580
405 2017-01-01 0.00537

You may have noticed that each method returns a pandas dataframe. In fact, all cryptory methods return a pandas dataframe. This is convenient, as it allows you to slice and dice the output using common pandas techniques. For example, we can easily merge two extract_bitinfocharts calls to combine daily bitcoin and ethereum prices.

my_cryptory.extract_bitinfocharts("btc").merge(
    my_cryptory.extract_bitinfocharts("eth"), on='date', how='inner')
date btc_price eth_price
0 2018-02-10 8691.000 871.238
1 2018-02-09 8300.000 832.564
2 2018-02-08 8256.000 814.922
... ... ... ...
403 2017-01-03 1017.000 8.811
404 2017-01-02 1010.000 8.182
405 2017-01-01 970.988 8.233
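
Chaining merge calls gets unwieldy beyond two frames. One way to combine an arbitrary list of them is functools.reduce, sketched below. The three frames are stand-ins built from values in the outputs above, so the snippet runs without any network calls; with the real package you would build the list from calls such as my_cryptory.extract_bitinfocharts("btc").

```python
from functools import reduce

import pandas as pd

dates = pd.to_datetime(["2018-02-10", "2018-02-09", "2018-02-08"])

# Stand-ins for three extract_bitinfocharts results (values from the outputs above).
frames = [
    pd.DataFrame({"date": dates, "btc_price": [8691.0, 8300.0, 8256.0]}),
    pd.DataFrame({"date": dates, "eth_price": [871.238, 832.564, 814.922]}),
    pd.DataFrame({"date": dates, "eth_transactionfees": [0.783, 0.740, 0.783]}),
]

# Fold the list into one frame, joining on the shared date column each time.
combined = reduce(
    lambda left, right: left.merge(right, on="date", how="inner"),
    frames,
)

# Index by date and sort oldest-first, which is usually handier for time series work.
combined = combined.set_index("date").sort_index()
print(combined)
```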

One further source of crypto prices is offered by extract_poloniex, which pulls data from the public poloniex API. For example, we can retrieve the BTC/ETH exchange rate.

# btc/eth price
my_cryptory.extract_poloniex(coin1="btc", coin2="eth")
date close open high low weightedAverage quoteVolume volume
0 2018-02-10 0.099700 0.100961 0.101308 0.098791 0.100194 2.160824e+04 2165.006520
1 2018-02-09 0.101173 0.098898 0.101603 0.098682 0.100488 2.393343e+04 2405.019824
2 2018-02-08 0.098896 0.099224 0.101196 0.096295 0.098194 2.250954e+04 2210.293015
... ... ... ... ... ... ... ... ...
403 2017-01-03 0.009280 0.008218 0.009750 0.008033 0.009084 1.376059e+06 12499.794908
404 2017-01-02 0.008220 0.008199 0.008434 0.007823 0.008101 6.372636e+05 5162.784640
405 2017-01-01 0.008200 0.008335 0.008931 0.008001 0.008471 7.046517e+05 5968.975870

We’re now in a position to perform some basic analysis of cryptocurrency prices.

Of course, that graph is meaningless. You can’t just compare the price for single units of each coin. You need to consider the total supply and the market cap. It’s like saying the dollar is undervalued compared to the Japanese Yen. But I probably shouldn’t worry. It’s not as if people are buying cryptos based on them being superficially cheap. More relevant here is the relative change in price since the start of 2017, which we can plot quite easily with a little pandas magic (pct_change).
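
As a rough sketch of that pandas step (not the exact notebook code): the frame below holds a handful of the bitinfocharts prices shown earlier, with column names following that output. cryptory returns the newest rows first by default, so the data needs sorting before any since-the-start comparison.

```python
import pandas as pd

# Excerpt of the bitinfocharts prices shown earlier (newest first, as returned).
prices = pd.DataFrame({
    "date": pd.to_datetime(["2018-02-10", "2017-01-03", "2017-01-02", "2017-01-01"]),
    "btc_price": [8691.0, 1017.0, 1010.0, 970.988],
    "eth_price": [871.238, 8.811, 8.182, 8.233],
})

# Sort oldest-first so that the first row really is the start of 2017.
prices = prices.sort_values("date").set_index("date")

# Relative change since the first day: price / first price - 1.
since_start = prices / prices.iloc[0] - 1

# Day-on-day change; only meaningful between consecutive days,
# so restrict it to the three contiguous January rows.
daily = prices.loc["2017-01-01":"2017-01-03"].pct_change()

print(since_start.iloc[-1])  # btc: about 7.95, i.e. up roughly 795%
print(daily)
```

pct_change gives the step-by-step change, while dividing by the first row gives the cumulative change that makes coins with very different unit prices comparable.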

Those coins are provided on bitinfocharts and they tend to represent older legacy coins. For example, the coin from this list that performed best over 2017 was Reddcoin. It started 2017 with a market cap of less than $1m, but finished it with a value of around $250m, reaching a peak of over $750m in early Jan 2018. You’ll notice that each coin shows the same general behaviour: a sustained rise between March and June, followed by another spike in December and a noticeable sell-off in Jan 2018.

With a little help from pandas, we can produce a crypto price correlation plot (use the dropdown menu to switch between Pearson and Spearman correlation).
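
The correlation matrices behind such a plot come straight from DataFrame.corr. Here’s a minimal sketch using six days of btc and eth prices copied from the merged output above. It’s only an excerpt, so the coefficients say nothing about the full 2017-2018 series.

```python
import pandas as pd

# Six days of prices copied from the merged output above.
prices = pd.DataFrame({
    "btc_price": [8691.0, 8300.0, 8256.0, 1017.0, 1010.0, 970.988],
    "eth_price": [871.238, 832.564, 814.922, 8.811, 8.182, 8.233],
})

pearson = prices.corr(method="pearson")    # linear association
spearman = prices.corr(method="spearman")  # association between ranks

# Spearman is simply Pearson applied to the ranked values.
spearman_by_hand = prices.rank().corr(method="pearson")

print(pearson)
print(spearman)
```

Each result is a square frame with the coins as both rows and columns, which is the shape a heatmap function expects.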

There’s nothing too surprising (or novel) here. It’s well known that cryptos are heavily correlated- they tend to spike and crash collectively. There are a few reasons for this. Most importantly, the vast majority of coins can only be exchanged with the few big coins (e.g. btc and eth). As they are priced relative to these big coins, a change in btc or eth will also change the value of those smaller coins. Secondly, it’s not like the stock market. Ethereum and Bitcoin are not as different as, say, Facebook and General Motors. While stock prices are linked to hitting financial targets (i.e. quarterly earnings reports) and wider macroeconomic factors, most cryptos (maybe all) are currently powered by hope and aspirations (well, hype and speculation) around blockchain technology. That’s not to say coins can’t occasionally buck the market, e.g. ripple (xrp) in early December. However, overperformance is often followed by underperformance relative to the market (e.g. ripple in January 2018).

I’ll admit nothing I’ve presented so far is particularly groundbreaking. You could get similar data from the Quandl API (aside: I intend to integrate Quandl API calls into cryptory). The real benefit of cryptory comes when you want to combine crypto prices with other data sources.

Reddit Metrics

If you’re familiar with cryptos, you’re very likely to be aware of their associated reddit pages. It’s where crypto investors come to discuss the merits of different blockchain implementations, dissect the day’s main talking points and post amusing GIFs- okay, it’s mostly just GIFs. With cryptory you can combine reddit metrics (total number of subscribers, new subscribers, rank- literally scraped from the redditmetrics website) and other crypto data.

Let’s take a look at iota and eos, two coins that emerged in June 2017 and experienced strong growth towards the end of 2017. Their corresponding subreddits are r/iota and r/eos, respectively.

my_cryptory.extract_reddit_metrics("iota", "subscriber-growth")
date subscriber_growth
0 2018-02-10 150
1 2018-02-09 161
2 2018-02-08 127
... ... ...
404 2017-01-03 0
405 2017-01-02 0
406 2017-01-01 0

Now we can investigate the relationship between price and subreddit growth.

Visually speaking, there’s clearly some correlation between price and subreddit member growth (the y-axis was normalised using the conventional min-max scaling). While the Spearman rank correlation is similarly high for both coins, the Pearson correlation coefficient is significantly stronger for iota, highlighting the importance of not relying on one single measure. At the time of writing, iota and eos both had a market cap of about $5bn (11th and 9th overall), though the iota subreddit had over three times as many subscribers as the eos subreddit (105k and 30k, respectively). While this doesn’t establish whether the relationship between price and reddit is predictive or reactive, it does suggest that reddit metrics could be useful model features for some coins.
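
For reference, min-max scaling and the Pearson/Spearman difference both fit in a few lines. In this sketch, min_max is a hypothetical helper (not part of cryptory), the subscriber growth values come from the r/iota output above, and the toy frame uses made-up numbers purely to show how the two coefficients can disagree.

```python
import pandas as pd


def min_max(series: pd.Series) -> pd.Series:
    """Rescale a series to the [0, 1] range."""
    return (series - series.min()) / (series.max() - series.min())


# Daily r/iota subscriber growth, copied from the output above.
growth = pd.Series([150, 161, 127, 0, 0, 0], name="subscriber_growth")
scaled = min_max(growth)

# Made-up numbers (not market data): y always rises with x, but not linearly.
toy = pd.DataFrame({"x": [1, 2, 3, 4, 5]})
toy["y"] = toy["x"] ** 3

pearson = toy.corr(method="pearson").loc["x", "y"]
spearman = toy.corr(method="spearman").loc["x", "y"]

print(scaled.tolist())
print(f"pearson={pearson:.3f}, spearman={spearman:.3f}")
```

Spearman only asks whether the two series move in the same direction, so it comes out at exactly 1 for the toy data, whereas Pearson is pulled below 1 because the relationship isn’t a straight line. A gap between the two, as seen for eos, hints at that kind of non-linearity.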

You’ll notice an almost simultaneous spike in subscribers to the iota and eos subreddits in late November and early December. This was part of a wider crypto trend, where most coins experienced unprecedented gains. Leading the charge was Bitcoin, which tripled in price between November 15th and December 15th. As the most well known crypto to nocoiners, Bitcoin (and the wider blockchain industry) received considerable mainstream attention during this bull run. Presumably, this attracted quite a lot of new crypto investors (i.e. gamblers), which propelled the price even higher. Well, what’s the first thing you’re gonna do after reading an article about this fancy futuristic blockchain that’s making people rich? You’d google bitcoin, ethereum and obviously bitconnect.

With cryptory, you can easily combine conventional crypto metrics with Google Trends data. You just need to decide the terms you want to search. It’s basically a small wrapper on top of the pytrends package. If you’ve used Google Trends before, you’ll be aware that you can only retrieve daily scores for periods of up to 90 days. The get_google_trends method stitches together overlapping searches, so that you can pull daily scores going back years. It’s probably best to illustrate it with a few examples.

my_cryptory.get_google_trends(kw_list=['bitcoin'])
date bitcoin
0 2018-02-09 22.000000
1 2018-02-08 25.000000
2 2018-02-07 30.000000
... ... ...
402 2017-01-03 3.974689
403 2017-01-02 4.377918
404 2017-01-01 2.707397
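
To make the stitching idea concrete, here’s a minimal sketch of one way to join two overlapping windows. It is not cryptory’s actual implementation: stitch_windows is a hypothetical helper and the scores are made up. Google Trends scales each query so that its own peak is 100, so two windows only become comparable after one is rescaled onto the other using the dates they share.

```python
import pandas as pd


def stitch_windows(older: pd.Series, newer: pd.Series) -> pd.Series:
    """Rescale `older` onto the scale of `newer` via their shared dates, then join."""
    overlap = older.index.intersection(newer.index)
    if len(overlap) == 0:
        raise ValueError("windows must overlap to be stitched")
    older_mean = older.loc[overlap].mean()
    if older_mean == 0:
        raise ValueError("cannot rescale: older window is all zeros on the overlap")
    # Ratio of mean scores over the shared dates puts both windows on one scale.
    ratio = newer.loc[overlap].mean() / older_mean
    rescaled = older.drop(overlap) * ratio
    return pd.concat([rescaled, newer]).sort_index()


days = pd.date_range("2017-01-01", periods=6, freq="D")

# Made-up scores: each window peaks at 100 on its own scale.
older = pd.Series([12.5, 25.0, 50.0, 100.0], index=days[:4])
newer = pd.Series([20.0, 40.0, 80.0, 100.0], index=days[2:])

stitched = stitch_windows(older, newer)
print(stitched)
```

The fractional scores for early 2017 in the output above (e.g. 3.974689), sitting next to whole-number recent scores, would be consistent with some rescaling of this kind, though the exact method cryptory uses may differ.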

Now we can investigate the relationship between crypto price and google search popularity.

As before, it’s visually obvious and statistically clear that there’s a strong correlation between google searches and coin prices. Again, this is a well known observation (here, here and here). What’s not so apparent is whether google search drives or follows the price. That chicken and egg question will be addressed in my next deep learning post.

A few words on Verge (xvg): eccentric (i.e. crazy) crypto visionary John McAfee recommended (i.e. shilled) the unheralded Verge to his twitter followers (i.e. fools), which triggered a huge surge in its price. As is usually the case with pump and dumps, the pump (from which McAfee himself potentially profited) was followed by the dump. The sorry story is retold in both the price and google search popularity. Unlike bitcoin and ethereum though, you’d need to consider in your analysis that verge is also a common search term for popular online technology news site The Verge (tron would be a similar case).

Anyway, back to cryptory, you can supply more than one keyword at a time, allowing you to visualise the relative popularity of different terms. Let’s go back to the early days and compare the historical popularity of Kim Kardashian and Bitcoin since 2013.

According to Google Trends, bitcoin became a more popular search term in June 2017 (a sure sign of a bubble if ever there was one- just realised this isn’t a unique insight either). That said, Bitcoin has never reached the heights of Kim Kardashian on the 13th November 2014 (obviously, the day Kim Kardashian broke the internet). The graph shows daily values, but you’ll notice that it quite closely matches what you’d get for the same weekly search on the Google Trends website.

While social metrics like reddit and google popularity can be powerful tools to study cryptocurrency prices, you may also want to incorporate data related to finance and the wider global economy.

Stock Market Prices

With their market caps and closing prices, cryptocurrencies somewhat resemble traditional company stocks. Of course, the major difference is that you couldn’t possibly pay for a lambo by investing in the stock market. Still, looking at the stock market may provide clues as to how the general economy is performing, or even how specific industries are responding to the blockchain revolution.

cryptory includes a get_stock_prices method, which scrapes yahoo finance and returns historical daily data. Just note that you’ll need to find the relevant company/index code on the yahoo finance website.

# %5EDJI = Dow Jones
my_cryptory.get_stock_prices("%5EDJI")
date adjclose close high low open volume
0 2018-02-10 24190.900391 24190.900391 24382.140625 23360.289062 23992.669922 735030000.0
1 2018-02-09 24190.900391 24190.900391 24382.140625 23360.289062 23992.669922 735030000.0
2 2018-02-08 23860.460938 23860.460938 24903.679688 23849.230469 24902.300781 657500000.0
... ... ... ... ... ... ... ...
403 2017-01-03 19881.759766 19881.759766 19938.529297 19775.929688 19872.859375 339180000.0
404 2017-01-02 NaN NaN NaN NaN NaN NaN
405 2017-01-01 NaN NaN NaN NaN NaN NaN

You may notice the previous closing prices are carried over on days the stock market is closed (e.g. weekends). You can choose to turn off this feature when you initialise your cryptory class (see the fillgaps argument in help(Cryptory.__init__)).
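
Carrying the last close forward is plain pandas forward-filling. The sketch below reproduces the behaviour with three Dow Jones closes from the output above. fill_gaps is a hypothetical helper, not a cryptory function, and cryptory’s own gap filling may differ in detail.

```python
import pandas as pd

# Three trading-day closes copied from the Dow Jones output above.
dow = pd.DataFrame(
    {"close": [19881.759766, 23860.460938, 24190.900391]},
    index=pd.to_datetime(["2017-01-03", "2018-02-08", "2018-02-09"]),
)


def fill_gaps(df: pd.DataFrame, start: str, end: str) -> pd.DataFrame:
    """Reindex to every calendar day and carry the last known row forward."""
    days = pd.date_range(start, end, freq="D")
    return df.reindex(days).ffill()


# Saturday 10th February inherits Friday's close.
weekend = fill_gaps(dow, "2018-02-08", "2018-02-10")

# Nothing precedes 1st and 2nd January, so those rows stay NaN.
new_year = fill_gaps(dow, "2017-01-01", "2017-01-03")

print(weekend)
print(new_year)
```

This also explains the NaN rows at the bottom of the table: forward-filling has nothing to carry into the first days of the requested range.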

With a little help from pandas, we can visualise the performance of bitcoin relative to some specific stocks and indices.

This graph shows the return you would have received if you had invested on January 3rd. As Bitcoin went up the most (>10x returns), it was objectively the best investment. While the inclusion of some names is hopefully intuitive enough, AMD and NVIDIA (and Intel to some extent) are special cases, as these companies produce the graphics cards that underpin the hugely energy intensive (i.e. wasteful) process of crypto mining. Kodak (not to be confused with the pre 2012 bankruptcy Kodak) made the list, as they announced their intention in early Jan 2018 to create their own “photo-centric cryptocurrency” (yes, that’s what caused that blip).

As before, with a little bit of pandas work, you can create a bitcoin stock market correlation plot.

The highest correlation recorded (0.75) is between Google and Nasdaq, which is not surprising, as the former is a large component of the latter. As for Bitcoin, it was most correlated with Google (0.12), but its relationship with the stock market was generally quite weak.

Commodity Prices

While Bitcoin was originally envisioned as an alternative system of payments, high transaction fees and rising value have discouraged its use as a legitimate currency. This has meant that Bitcoin and its successors have morphed into an alternative store of value- a sort of easily lost internet gold. So, it may be interesting to investigate the relationship between Bitcoin and the more traditional stores of value.

cryptory includes a get_metal_prices method that retrieves historical daily prices of various precious metals.

my_cryptory.get_metal_prices()
date gold_am gold_pm silver platinum_am platinum_pm palladium_am palladium_pm
0 2018-02-10 1316.05 1314.10 16.345 972.0 969.0 970.0 969.0
1 2018-02-09 1316.05 1314.10 16.345 972.0 969.0 970.0 969.0
2 2018-02-08 1311.05 1315.45 16.345 974.0 975.0 990.0 985.0
... ... ... ... ... ... ... ... ...
403 2017-01-03 1148.65 1151.00 15.950 906.0 929.0 684.0 706.0
404 2017-01-02 NaN NaN NaN NaN NaN NaN NaN
405 2017-01-01 NaN NaN NaN NaN NaN NaN NaN

Again, we can easily plot the change in commodity prices over 2017 and 2018.

Look at silly old gold appreciating slowly over 2017 and 2018, thus representing a stable store of wealth. As before, we can plot a price correlation matrix.

Unsurprisingly, the various precious metals exhibit significant correlation, while bitcoin value appears completely unconnected. I suppose negative correlation could have provided evidence that people are moving away from traditional stores of value, but there’s little evidence to support this theory.

Foreign Exchange Rates

One of the motivations behind Bitcoin was to create a currency that wasn’t controlled by any central authority. There could be no quantitative easing, the policy by which the US central bank (the Federal Reserve) essentially printed trillions of new dollars to prop up the faltering economy after the 2007 financial crisis, devaluing the dollar in the process. As such, there may be a relationship between the USD exchange rate (which would be devalued by such policies) and money moving into cryptocurrencies.

cryptory includes a get_exchange_rates method that retrieves historical daily exchange rate between particular currency pairs.

my_cryptory.get_exchange_rates(from_currency="USD", to_currency="EUR")
date exch_rate
0 2018-02-10 1.2273
1 2018-02-09 1.2273
2 2018-02-08 1.2252
... ... ...
403 2017-01-03 1.0385
404 2017-01-02 NaN
405 2017-01-01 NaN

As you can see, the USD has lost ground to the Euro over the last year. We can easily add a few more USD exchange rates (spoiler alert: the USD has depreciated relative to most major currencies). As the results are similar to the precious metals, that code can be found in the Jupyter notebook.

Oil Prices

Oil prices are strongly affected by the strength of the global economy (e.g. demand in China) and geopolitical instability (e.g. Middle East, Venezuela). Of course, there’s other factors at play (shale, moves towards renewables, etc.), but you might want to have oil prices in your crypto price model in order to include these forces.

cryptory includes a get_oil_prices method that retrieves historical daily oil (London Brent Crude) prices.

my_cryptory.get_oil_prices()
date oil_price
0 2018-02-10 64.18
1 2018-02-09 64.18
2 2018-02-08 64.18
... ... ...
403 2017-01-03 52.36
404 2017-01-02 NaN
405 2017-01-01 NaN

As you can see, oil is up a little over 20% since the start of 2017 (from $52.36 on January 3rd 2017 to $64.18). Of course, you can plot the price over a longer time period.

Future

So what’s the future of cryptos? Moon, obviously! As for the future of cryptory, it already includes numerous tools that could improve price models (particularly, reddit and google trend metrics). But it’s certainly lacking features that would take it to the moon:

  • twitter statistics (specifically John McAfee’s!!!)
  • media analysis (number of mainstream articles, sentiment, etc.- example)
  • more Asian-centric data sources (Japan and South Korea are said to account for 40% and 20% of global bitcoin volume, respectively)
  • more financial/crypto data (integrate Quandl API)

In my next post, I’ll use cryptory to (hopefully) improve the previous LSTM crypto price prediction model. While you wait for that, you can perform your own cryptocurrency analysis with the accompanying Jupyter notebook. Thanks for reading!
