You may have seen my previous post that tried to predict bitcoin and ethereum prices with deep learning. To summarise, there was a lot of hype but it wasn’t very useful in practice (I’m referring to the model, of course). To improve the model, we have two options: carefully design an intricately more sophisticated model (i.e. throw shit tons more layers in there) or identify more informative data sources that can be fed into the model. While it’s tempting to focus on the former, the garbage-in-garbage-out principle remains.
With that in mind, I created a new Python package called cryptory. Not to be confused with the obscure Go repo (damn you, mtamer) or that bitcoin scam (you try to come up with a crypto package name that isn’t associated with some scam), it integrates various packages and protocols so that you can get historical crypto (just daily… for now) and wider economic/social data in one place. Rather than making more crypto based jokes, I should probably just explain the package.
As always, the full code for this post can be found on my GitHub account.
It relies on pandas, numpy, BeautifulSoup and pytrends, but, if necessary, these packages should be automatically installed alongside cryptory.
The next step is to load the package into the working environment. Specifically, we’ll import the Cryptory class.
# import package
from cryptory import Cryptory
Assuming that returned no errors, you’re now ready to start pulling some data. But before we do that, it’s worth mentioning that you can retrieve information about each method by running the help function.
Help on class Cryptory in module cryptory.cryptory:

class Cryptory
 |  Methods defined here:
 |
 |  __init__(self, from_date, to_date=None, ascending=False, fillgaps=True, timeout=10.0)
 |      Initialise cryptory class
 ...
We’ll now create our own cryptory object, which we’ll call my_cryptory. You need to define the start date of the data you want to retrieve, while there are also some optional arguments. For example, you can set the end date, otherwise it defaults to the current date (see help(Cryptory.__init__) for more information).
# initialise object
my_cryptory = Cryptory(from_date="2017-01-01")
We’ll start by getting some historical bitcoin prices (starting from 1st Jan 2017).
cryptory has a few options for this type of data, which I will now demonstrate.
# get prices from coinmarketcap
my_cryptory.extract_coinmarketcap("bitcoin")
# get prices from bitinfocharts
my_cryptory.extract_bitinfocharts("btc")
Those cells illustrate how to pull bitcoin prices from coinmarketcap and bitinfocharts. The discrepancy in prices returned by each can be explained by their different approaches to calculating daily prices (e.g. bitinfocharts represents the average price across that day). For that reason, I wouldn’t recommend combining different price sources.
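To make that discrepancy concrete, here’s a minimal sketch with made-up intraday prices. It doesn’t touch either website. It just shows how a "last price of the day" and an "average price of the day" can disagree for the very same day; the numbers and variable names are purely illustrative.

```python
import pandas as pd

# made-up intraday prices for a single day (not real market data)
intraday = pd.Series(
    [100.0, 120.0, 110.0, 130.0],
    index=pd.to_datetime([
        "2017-01-01 00:00", "2017-01-01 08:00",
        "2017-01-01 16:00", "2017-01-01 23:59",
    ]),
)

# one source might report the last traded price of the day...
daily_close = intraday.resample("D").last()
# ...while another reports the average price across the day
daily_mean = intraday.resample("D").mean()

print(daily_close.iloc[0], daily_mean.iloc[0])
```

Same day, same coin, two different "daily prices", which is why mixing sources in one model is asking for trouble.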
You can also pull non-price data with extract_bitinfocharts, e.g. transaction fees. See help(Cryptory.extract_bitinfocharts) for more information.
# average daily eth transaction fee
my_cryptory.extract_bitinfocharts("eth", metric='transactionfees')
You may have noticed that each method returns a pandas dataframe. In fact, all cryptory methods return a pandas dataframe. This is convenient, as it allows you to slice and dice the output using common pandas techniques. For example, we can easily merge two extract_bitinfocharts calls to combine daily bitcoin and ethereum prices.
my_cryptory.extract_bitinfocharts("btc").merge(
    my_cryptory.extract_bitinfocharts("eth"), on='date', how='inner')
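If you want to see what that merge does without hitting the network, here’s a small sketch using two hand-made dataframes. The column names (btc_price, eth_price) and values are my own placeholders and may not match exactly what cryptory returns.

```python
import pandas as pd

# toy stand-ins for two extract_bitinfocharts calls (made-up values)
btc = pd.DataFrame({
    "date": pd.to_datetime(["2017-01-03", "2017-01-02", "2017-01-01"]),
    "btc_price": [1020.0, 1010.0, 1000.0],
})
eth = pd.DataFrame({
    "date": pd.to_datetime(["2017-01-03", "2017-01-02"]),
    "eth_price": [9.0, 8.5],
})

# an inner join keeps only the dates present in both dataframes
merged = btc.merge(eth, on="date", how="inner")
print(merged)
```

Swapping how='inner' for how='left' would instead keep every bitcoin date and leave missing ethereum prices as NaN.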
One further source of crypto prices is offered by extract_poloniex, which pulls data from the public poloniex API. For example, we can retrieve the BTC/ETH exchange rate.
# btc/eth price
my_cryptory.extract_poloniex(coin1="btc", coin2="eth")
We’re now in a position to perform some basic analysis of cryptocurrency prices.
Of course, that graph is meaningless. You can’t just compare the price for single units of each coin. You need to consider the total supply and the market cap. It’s like saying the dollar is undervalued compared to the Japanese Yen. But I probably shouldn’t worry. It’s not as if people are buying cryptos based on them being superficially cheap. More relevant here is the relative change in price since the start of 2017, which we can plot quite easily with a little pandas magic (pct_change).
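As a rough sketch of that pandas magic (with made-up prices rather than real cryptory output), pct_change gives the day-on-day change, and compounding those changes gives the cumulative change since the first day. The coin names and numbers below are placeholders.

```python
import pandas as pd

# made-up daily prices for two coins with very different unit prices
prices = pd.DataFrame(
    {"btc": [1000.0, 1100.0, 1210.0], "doge": [0.0002, 0.0003, 0.0006]},
    index=pd.to_datetime(["2017-01-01", "2017-01-02", "2017-01-03"]),
)

# daily percentage change; the first row is NaN as there is no previous day
daily_change = prices.pct_change()

# compound the daily changes to get the cumulative change since the first day
relative_change = (1 + daily_change.fillna(0)).cumprod() - 1
print(relative_change)
```

On this scale the unit price no longer matters: a coin worth fractions of a cent can be compared directly with one worth thousands of dollars.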
Those coins are provided on bitinfocharts and they tend to represent older legacy coins. For example, the coin from this list that performed best over 2017 was Reddcoin. It started 2017 with a market cap of less than $1m, but finished it with a value of around $250m, reaching a peak of over $750m in early Jan 2018. You’ll notice that each coin shows the same general behaviour- a sustained rise between March and June, followed by another spike in December and a noticeable sell-off in Jan 2018.
There’s nothing too surprising (or novel) here. It’s well known that cryptos are heavily correlated- they tend to spike and crash collectively. There are a few reasons for this. Most importantly, the vast majority of coins can only be exchanged with the few big coins (e.g. btc and eth). As they are priced relative to these big coins, a change in btc or eth will also change the value of those smaller coins. Secondly, it’s not like the stock market. Ethereum and Bitcoin are not as different as, say, Facebook and General Motors. While stock prices are linked to hitting financial targets (i.e. quarterly earnings reports) and wider macroeconomic factors, most cryptos (maybe all) are currently powered by hope and aspirations (well, hype and speculation) around blockchain technology. That’s not to say coins can’t occasionally buck the market e.g. ripple (xrp) in early December. However, overperformance is often followed by market underperformance (e.g. ripple in January 2018).
I’ll admit nothing I’ve presented so far is particularly groundbreaking. You could get similar data from the Quandl API (aside: I intend to integrate Quandl API calls into cryptory). The real benefit of cryptory comes when you want to combine crypto prices with other data sources.
If you’re familiar with cryptos, you’re very likely to be aware of their associated reddit pages. It’s where crypto investors come to discuss the merits of different blockchain implementations, dissect the day’s main talking points and post amusing gifs- okay, it’s mostly just GIFs. With cryptory you can combine reddit metrics (total number of subscribers, new subscribers, rank- literally scraped from the redditmetrics website) and other crypto data.
Now we can investigate the relationship between price and subreddit growth.
Visually speaking, there’s clearly some correlation between price and subreddit member growth (the y-axis was normalised using the conventional min-max scaling). While the Spearman rank correlation is similarly high for both coins, the Pearson correlation coefficient is significantly stronger for iota, highlighting the importance of not relying on one single measure. At the time of writing, iota and eos both had a marketcap of about $5bn (11th and 9th overall), though the number of subscribers to the iota subreddit was over 3 times more than the eos subreddit (105k and 30k, respectively). While this doesn’t establish whether the relationship between price and reddit is predictive or reactive, it does suggest that reddit metrics could be useful model features for some coins.
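For anyone unsure why the two correlation measures can disagree, here’s a minimal sketch with invented numbers (not the iota or eos data). It applies min-max scaling and then computes Pearson directly and Spearman as the Pearson correlation of the ranks, which is the standard definition. The min_max_scale helper is my own, not part of cryptory.

```python
import pandas as pd

def min_max_scale(series):
    """Rescale a series so its minimum maps to 0 and its maximum to 1."""
    return (series - series.min()) / (series.max() - series.min())

# made-up data: subscribers rise steadily while the price doubles each day
df = pd.DataFrame({
    "price": [1.0, 2.0, 4.0, 8.0, 16.0, 32.0],
    "subscribers": [100, 110, 120, 130, 140, 150],
})

# both columns now live on a common 0-1 scale, handy for plotting together
scaled = df.apply(min_max_scale)

# Pearson measures linear association
pearson = df["price"].corr(df["subscribers"])
# Spearman is Pearson applied to the ranks, so it only cares about ordering
spearman = df["price"].rank().corr(df["subscribers"].rank())

print(round(pearson, 3), round(spearman, 3))
```

Both series always move in the same direction, so the rank correlation is perfect, but the relationship isn’t a straight line, so Pearson comes out lower. Note that min-max scaling doesn’t change either coefficient; it only helps the plot.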
You’ll notice an almost simultaneous spike in subscribers to the iota and eos subreddits in late November and early December. This was part of a wider crypto trend, where most coins experienced unprecedented gains. Leading the charge was Bitcoin, which tripled in price between November 15th and December 15th. As the most well known crypto to nocoiners, Bitcoin (and the wider blockchain industry) received considerable mainstream attention during this bull run. Presumably, this attracted quite a lot of new crypto investors (i.e. gamblers), which propelled the price even higher. Well, what’s the first thing you’re gonna do after reading an article about this fancy futuristic blockchain that’s making people rich? You’d google bitcoin, ethereum and obviously bitconnect.
With cryptory, you can easily combine conventional crypto metrics with Google Trends data. You just need to decide the terms you want to search. It’s basically a small wrapper on top of the pytrends package. If you’ve used Google Trends before, you’ll be aware that you can only retrieve daily scores for periods of at most 90 days. The get_google_trends method stitches together overlapping searches, so that you can pull daily scores going back years. It’s probably best to illustrate it with a few examples.
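To give a feel for what stitching overlapping searches involves, here’s a simplified sketch on made-up data. I’m not claiming this is exactly how get_google_trends does it; the stitch_windows function and its ratio-of-means rescaling are my own illustrative assumptions. The idea is that each window is scored relative to its own peak, so the overlap is used to put consecutive windows on a common scale.

```python
import pandas as pd

def stitch_windows(windows):
    """Stitch overlapping series that were each scaled independently.

    Each later window is rescaled so that its mean over the overlap
    matches the already-stitched series, then its new dates are appended.
    """
    stitched = windows[0].astype(float)
    for window in windows[1:]:
        overlap = stitched.index.intersection(window.index)
        if len(overlap) == 0:
            raise ValueError("consecutive windows must overlap")
        # ratio of means over the shared dates puts both on the same scale
        # (assumes the new window is not all zeros over the overlap)
        factor = stitched.loc[overlap].mean() / window.loc[overlap].mean()
        rescaled = window * factor
        new_dates = rescaled.index.difference(stitched.index)
        stitched = pd.concat([stitched, rescaled.loc[new_dates]])
    # mimic Google Trends by rescaling so the overall peak is 100
    return 100 * stitched / stitched.max()

# made-up "true" interest over 6 days, queried as two overlapping windows
dates = pd.date_range("2017-01-01", periods=6, freq="D")
true_interest = pd.Series([10.0, 20.0, 40.0, 80.0, 160.0, 200.0], index=dates)

first = true_interest.iloc[:4]
second = true_interest.iloc[2:]
# each window is scored relative to its own maximum
first = 100 * first / first.max()
second = 100 * second / second.max()

stitched = stitch_windows([first, second])
print(stitched)
```

In this toy case the stitched series recovers the underlying trend exactly. Real Google Trends scores are rounded and noisy, so a real stitch will only ever be approximate.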
Now we can investigate the relationship between crypto price and google search popularity.
As before, it’s visually obvious and statistically clear that there’s a strong correlation between google searches and coin prices. Again, this is a well known observation (here, here and here). What’s not so apparent is whether google search drives or follows the price. That chicken and egg question will be addressed in my next deep learning post.
A few words on Verge (xvg): eccentric (i.e. crazy) crypto visionary John McAfee recommended (i.e. shilled) the unheralded Verge to his twitter followers (i.e. fools), which triggered a huge surge in its price. As is usually the case with pump and dumps, the pump (from which McAfee himself potentially profited) was followed by the dump. The sorry story is retold in both the price and google search popularity. Unlike bitcoin and ethereum though, you’d need to consider in your analysis that verge is also a common search term for popular online technology news site The Verge (tron would be a similar case).
Anyway, back to cryptory: you can supply more than one keyword at a time, allowing you to visualise the relative popularity of different terms. Let’s go back to the early days and compare the historical popularity of Kim Kardashian and Bitcoin since 2013.
According to Google Trends, bitcoin became a more popular search term than Kim Kardashian in June 2017 (a sure sign of a bubble if ever there was one- just realised this isn’t a unique insight either). That said, Bitcoin has never reached the heights of Kim Kardashian on the 13th November 2014 (obviously, the day Kim Kardashian broke the internet). The graph shows daily values, but you’ll notice that it quite closely matches what you’d get for the same weekly search on the Google Trends website.
While social metrics like reddit and google popularity can be powerful tools to study cryptocurrency prices, you may also want to incorporate data related to finance and the wider global economy.
Stock Market Prices
With their market caps and closing prices, cryptocurrencies somewhat resemble traditional company stocks. Of course, the major difference is that you couldn’t possibly pay for a lambo by investing in the stock market. Still, looking at the stock market may provide clues as to how the general economy is performing, or even how specific industries are responding to the blockchain revolution.
cryptory includes a get_stock_prices method, which scrapes yahoo finance and returns historical daily data. Just note that you’ll need to find the relevant company/index code on the yahoo finance website.
# %5EDJI = Dow Jones
my_cryptory.get_stock_prices("%5EDJI")
You may notice the previous closing prices are carried over on days the stock market is closed (e.g. weekends). You can choose to turn off this feature when you initialise your cryptory class (see help(Cryptory.__init__)).
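If the idea of carrying prices over is unclear, this is roughly what it looks like in plain pandas. It’s a sketch on made-up closing prices, not cryptory’s internal code: the series is reindexed onto every calendar day and the last known close is forward-filled.

```python
import pandas as pd

# made-up closing prices for a Friday and the following Monday
closes = pd.Series(
    [100.0, 103.0],
    index=pd.to_datetime(["2017-01-06", "2017-01-09"]),
)

# reindex onto every calendar day, then carry the last close forward
all_days = pd.date_range("2017-01-06", "2017-01-09", freq="D")
filled = closes.reindex(all_days).ffill()
print(filled)
```

Filling the gaps this way makes it easy to line stocks up against cryptos, which trade every day of the week.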
With a little help from pandas, we can visualise the performance of bitcoin relative to some specific stocks and indices.
This graph shows the return you would have received if you had invested on January 3rd. As Bitcoin went up the most (>10x returns), it was objectively the best investment. While the inclusion of some names is hopefully intuitive enough, AMD and NVIDIA (and Intel to some extent) are special cases, as these companies produce the graphics cards that underpin the hugely energy intensive (i.e. wasteful) process of crypto mining. Kodak (not to be confused with the pre 2012 bankruptcy Kodak) made the list, as they announced their intention in early Jan 2018 to create their own “photo-centric cryptocurrency” (yes, that’s what caused that blip).
As before, with a little bit of pandas work, you can create a bitcoin stock market correlation plot.
The highest correlation recorded (0.75) is between Google and Nasdaq, which is not surprising, as the former is a large component of the latter. As for Bitcoin, it was most correlated with Google (0.12), but its relationship with the stock market was generally quite weak.
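For reference, a correlation matrix like that takes only a line or two of pandas. The sketch below uses randomly generated prices (the stock_a, market_index and bitcoin columns are invented, with market_index deliberately built to move with stock_a). I’m correlating daily percentage changes here, which is a common choice, but that’s my assumption; raw prices could be swapped in.

```python
import numpy as np
import pandas as pd

# made-up daily returns; "market_index" is built to move with "stock_a"
rng = np.random.default_rng(0)
dates = pd.date_range("2017-01-01", periods=250, freq="D")
stock_a = rng.normal(0, 0.01, size=250)
returns = pd.DataFrame({
    "stock_a": stock_a,
    "market_index": 0.5 * stock_a + rng.normal(0, 0.005, size=250),
    "bitcoin": rng.normal(0, 0.04, size=250),
}, index=dates)

# turn the returns into price series, as you would have after merging sources
prices = 100 * (1 + returns).cumprod()

# correlate daily percentage changes rather than raw prices
corr_matrix = prices.pct_change().dropna().corr(method="pearson")
print(corr_matrix.round(2))
```

The stock and the index it feeds into come out strongly correlated, while the independently generated bitcoin column sits near zero against both.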
While Bitcoin was originally envisioned as an alternative system of payments, high transaction fees and rising value have discouraged its use as a legitimate currency. This has meant that Bitcoin and its successors have morphed into an alternative store of value- a sort of easily lost internet gold. So, it may be interesting to investigate the relationship between Bitcoin and the more traditional stores of value.
cryptory includes a get_metal_prices method that retrieves historical daily prices of various precious metals.
Again, we can easily plot the change in commodity prices over 2017 and 2018.
Look at silly old gold appreciating slowly over 2017 and 2018, thus representing a stable store of wealth. As before, we can plot a price correlation matrix.
Unsurprisingly, the various precious metals exhibit significant correlation, while bitcoin value appears completely unconnected. I suppose negative correlation could have provided evidence that people are moving away from traditional stores of value, but there’s little evidence to support this theory.
Foreign Exchange Rates
One of the motivations behind Bitcoin was to create a currency that wasn’t controlled by any central authority. There could be no quantitative easing, the policy by which the US Federal Reserve devalued the dollar by essentially printing trillions of new dollars to prop up the faltering economy after the 2007-08 financial crisis. As such, there may be a relationship between the USD exchange rate (which would be devalued by such policies) and money moving into cryptocurrencies.
cryptory includes a get_exchange_rates method that retrieves historical daily exchange rates between particular currency pairs.
As you can see, the USD has lost ground to the Euro over the last year. We can easily add a few more USD exchange rates (spoiler alert: the USD has depreciated relative to most major currencies). As the results are similar to the precious metals, that code can be found in the Jupyter notebook.
Oil prices are strongly affected by the strength of the global economy (e.g. demand in China) and geopolitical instability (e.g. Middle East, Venezuela). Of course, there are other factors at play (shale, moves towards renewables, etc.), but you might want to have oil prices in your crypto price model in order to include these forces.
cryptory includes a get_oil_prices method that retrieves historical daily oil (London Brent Crude) prices.
As you can see, oil is up about 20% since the start of 2017. Of course, you can plot the price over a longer time period.
So what’s the future of cryptos? Moon, obviously! As for the future of cryptory, it already includes numerous tools that could improve price models (particularly, reddit and google trend metrics). But it’s certainly lacking features that would take it to the moon:
- twitter statistics (specifically John McAfee’s!!!)
- media analysis (number of mainstream articles, sentiment, etc.- example)
- more Asian-centric data sources (Japan and South Korea are said to account for 40% and 20% of global bitcoin volume, respectively)
- more financial/crypto data (integrate Quandl api)
In my next post, I’ll use cryptory to (hopefully) improve the previous LSTM crypto price prediction model. While you wait for that, you can perform your own cryptocurrency analysis with the accompanying Jupyter notebook. Thanks for reading!