This article gives a brief introduction to pairs trading: the concept, the basic math, the strategy algorithm, trading robot development, evaluation of backtesting and forward tests, and a discussion of future work. As a practical example, the robot trades cryptocurrencies.
Pairs trading is a market-neutral trading strategy that employs a long position with a short position in a pair of highly co-moved assets.
The strategy’s profit is derived from the difference in price change between the two instruments, rather than from the direction each moves. Therefore, a profit can be realized if the long position goes up more than the short, or the short position goes down more than the long (in a perfect situation, the long position rises and the short position falls, but that’s not a requirement for making a profit). It’s possible for pairs traders to profit during a variety of market conditions, including periods when the market goes up, down or sideways — and during periods of either low or high volatility.
Source: Investopedia
In quantitative trading, we usually work with non-stationary time series. People often call two assets correlated when they co-move, but the term is mathematically incorrect in this context. Pearson’s correlation is well defined only for stationary variables: its formula uses expected values and standard deviations, and in non-stationary processes these quantities change over time.
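For reference, this is the textbook definition of Pearson’s correlation coefficient (assumed here to be the formula the paragraph refers to). Both the means $\mu_X, \mu_Y$ and the standard deviations $\sigma_X, \sigma_Y$ are treated as constants, which is exactly what fails for a non-stationary process:

```latex
\rho_{X,Y} = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y}
           = \frac{\mathbb{E}\left[(X - \mu_X)(Y - \mu_Y)\right]}{\sigma_X \sigma_Y}
```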
For such processes, we can define cointegration instead. Several non-stationary time series are cointegrated if some linear combination of them is stationary. An accessible explanation can be found in this video.
This picture shows two processes (X and Y) and their spread. This is an example of correlation without cointegration.
The next example is the reverse: cointegration without correlation.
You can find out how to build these processes using Python here.
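As a rough illustration of the two cases, here is a minimal sketch using only the standard library. It is not the code behind the link: the drifts, noise levels, seed and the square-wave construction are all assumptions chosen to make the effect visible. The second case is a common toy illustration, with "cointegrated" used in the loose sense that the spread stays bounded around a constant mean.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def cumulative(steps):
    """Turn a sequence of increments into a path (running sum)."""
    total, path = 0.0, []
    for step in steps:
        total += step
        path.append(total)
    return path


rng = random.Random(42)  # fixed seed (assumption) so the run is repeatable
n = 800

# Case 1: correlation without cointegration.
# Both series trend upwards, but with different drifts, so the spread diverges.
x1 = cumulative(rng.gauss(1.0, 1.0) for _ in range(n))
y1 = cumulative(rng.gauss(2.0, 1.0) for _ in range(n))
spread1 = [y - x for x, y in zip(x1, y1)]

# Case 2: bounded spread without correlation.
# x2 is noise around 20, y2 is a square wave around the same level.
x2 = [20.0 + rng.gauss(0.0, 1.0) for _ in range(n)]
y2 = [30.0 if (i // 100) % 2 == 0 else 10.0 for i in range(n)]
spread2 = [y - x for x, y in zip(x2, y2)]

print("case 1 correlation:", pearson(x1, y1), "final spread:", spread1[-1])
print("case 2 correlation:", pearson(x2, y2),
      "largest |spread|:", max(abs(s) for s in spread2))
```

In the first case the correlation is high even though the spread keeps growing, so a pairs trade would never converge. In the second case the correlation is close to zero while the spread keeps returning to the same level.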
Before moving on to the next chapter, we need to know how to detect cointegration.
The three main methods for testing for cointegration are:
1. Engle–Granger two-step method
If $x_t$ and $y_t$ are non-stationary and cointegrated, then a linear combination of them must be stationary. In other words:
$y_t - \beta x_t = u_t$, where $u_t$ is stationary.
If we knew $u_t$, we could just test it for stationarity with something like a Dickey–Fuller or Phillips–Perron test and be done. But because we don’t know $u_t$, we must estimate it first, generally by using ordinary least squares, and then run our stationarity test on the estimated $u_t$ series.
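The two steps can be sketched in plain Python as follows. This is a simplified illustration, not a production test: the regression includes an intercept (an assumption; the equation above has none), the Dickey–Fuller regression uses no lagged differences, and the 5% critical value of about -3.34 is the approximate asymptotic value for two variables with a constant. In practice a library routine such as `coint` from statsmodels is a more robust choice. The data is synthetic and all names are illustrative.

```python
import math
import random


def ols(y, x):
    """Step 1: fit y = alpha + beta * x by ordinary least squares."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    return my - beta * mx, beta


def dickey_fuller_stat(u):
    """Step 2: t-statistic of gamma in  du_t = gamma * u_{t-1} + e_t."""
    lagged = u[:-1]
    diffs = [u[t] - u[t - 1] for t in range(1, len(u))]
    s_ll = sum(l * l for l in lagged)
    gamma = sum(l * d for l, d in zip(lagged, diffs)) / s_ll
    resid = [d - gamma * l for l, d in zip(lagged, diffs)]
    sigma2 = sum(e * e for e in resid) / (len(diffs) - 1)
    return gamma / math.sqrt(sigma2 / s_ll)


def engle_granger(y, x):
    """Return the estimated beta and the test statistic on the residuals."""
    alpha, beta = ols(y, x)
    u_hat = [yi - alpha - beta * xi for xi, yi in zip(x, y)]
    return beta, dickey_fuller_stat(u_hat)


# Synthetic cointegrated pair: x is a random walk, y = 2 * x + stationary noise.
rng = random.Random(0)
x, level = [], 0.0
for _ in range(500):
    level += rng.gauss(0.0, 1.0)
    x.append(level)
y = [2.0 * xi + rng.gauss(0.0, 1.0) for xi in x]

CRITICAL_5PCT = -3.34  # approximate; see the caveats above
beta, stat = engle_granger(y, x)
print("beta:", beta, "statistic:", stat, "cointegrated:", stat < CRITICAL_5PCT)
```

A statistic below the critical value rejects the null hypothesis of no cointegration.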
2. Johansen test
The Johansen test is a test for cointegration that allows for more than one cointegrating relationship, unlike the Engle–Granger method, but this test is subject to asymptotic properties, i.e. large samples. If the sample size is too small then the results will not be reliable and one should use Auto Regressive Distributed Lags (ARDL).
3. Phillips–Ouliaris cointegration test
Peter C. B. Phillips and Sam Ouliaris (1990) show that residual-based unit root tests applied to the estimated cointegrating residuals do not have the usual Dickey–Fuller distributions under the null hypothesis of no cointegration. Because of the spurious regression phenomenon under the null hypothesis, these tests have asymptotic distributions that depend on (1) the number of deterministic trend terms and (2) the number of variables with which cointegration is being tested. These distributions are known as Phillips–Ouliaris distributions, and critical values have been tabulated. In finite samples, a superior alternative to the use of these asymptotic critical values is to generate critical values from simulations.
Source: Wikipedia
Let’s code some analysis for this problem. First, download the data from Bitfinex for several cryptocurrencies (from 2018-01-01 to 2018-05-31). Next, plot the performance of the cryptocurrencies. Finally, run the cointegration test for every pair of assets.
The performance of cryptocurrencies is
The null hypothesis is that there is no cointegration; the alternative hypothesis is that there is a cointegrating relationship. If the p-value is small, below a chosen significance level, we can reject the hypothesis that there is no cointegrating relationship.
We can conclude that some of these pairs are cointegrated and could be selected for further research.
There is no single approach in pairs trading to calculating the spread and trading it. Some approaches use linear regression and treat the residuals as the spread. We will use the following algorithm.
The algorithmic strategy contains these steps:
1. Identify the cointegrated pairs by one of the methods described above (e.g. Engle–Granger). This step should be performed periodically to get a pair (or several pairs) that will be used in the next steps.
2. Get the price history of length N for each asset in the pair (e.g. A and B) and calculate the returns of each asset
3. Calculate the difference between returns
4. Calculate the z-score. The z-score is the number of standard deviations by which a data point differs from the mean.
This picture illustrates z-score
5. Check the entry rule:
Open a long position in A (50% of capital) and a short position in B (50% of capital) if this condition is true
Open a short position in A and a long position in B if this condition is true
6. Check the exit rule:
Close all positions if this condition is true
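Steps 2 to 6 can be sketched as a single decision function. This is only an illustration under stated assumptions: the exact entry and exit conditions are not reproduced in the text, so the threshold directions below (go long the asset that has underperformed) are a guess at a typical rule. The parameter names `z_signal_in` and `min_spread` follow the ones mentioned later in the article, while `z_signal_out`, the default values and the function names are hypothetical.

```python
import math


def simple_returns(prices):
    """Step 2: period-over-period returns of one asset."""
    return [prices[i] / prices[i - 1] - 1.0 for i in range(1, len(prices))]


def latest_z_score(values):
    """Step 4: z-score of the most recent value within its window."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    std = math.sqrt(variance)
    return 0.0 if std == 0.0 else (values[-1] - mean) / std


def pair_signal(prices_a, prices_b, in_position,
                z_signal_in=3.0, z_signal_out=0.5, min_spread=0.0):
    """Return 'long_a_short_b', 'short_a_long_b', 'close' or 'hold'."""
    returns_a = simple_returns(prices_a)
    returns_b = simple_returns(prices_b)
    # Step 3: the spread is the difference between the two return series.
    spread = [ra - rb for ra, rb in zip(returns_a, returns_b)]
    z = latest_z_score(spread)

    if in_position:
        # Step 6 (assumed rule): exit once the spread is back near its mean.
        return "close" if abs(z) < z_signal_out else "hold"

    # Step 5 (assumed rule): enter only on a large, statistically unusual gap.
    if abs(spread[-1]) > min_spread:
        if z < -z_signal_in:
            return "long_a_short_b"   # A underperformed B
        if z > z_signal_in:
            return "short_a_long_b"   # A outperformed B
    return "hold"


# Toy prices: A and B move together until A drops 10% on the last bar.
prices_a = [100, 101, 100, 101, 100, 101, 100, 90]
prices_b = [50, 50.5, 50, 50.5, 50, 50.5, 50, 50.5]
print(pair_signal(prices_a, prices_b, in_position=False,
                  z_signal_in=2.0, min_spread=0.035))
```

Note that with a window of n points the z-score of a single outlier cannot exceed (n - 1) / sqrt(n), so a high entry threshold only makes sense with a sufficiently long history.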
Let’s code this algorithm using the Catalyst framework. I gave a quick introduction to Catalyst in my previous article, where you can find information about the initialize, handle_data, analyze, and run_algorithm functions.
A standard approach is to use a train/test split, but in our case we also have a cointegration test period. These periods should not overlap. Therefore, we have:
Cointegration test period: 5 months (from 2018-01-01 to 2018-05-31)
Backtesting period: 4 months (from 2018-06-01 to 2018-09-30)
Forwarding period: 2 months (from 2018-10-01 to 2018-11-30)
First of all, we should validate the algorithm. Let’s run this script on the XMR/USD and NEO/USD pair with commission costs disabled and the slippage model turned off.
As we can see, the algorithm’s return curve is very good. It behaves as expected for an idealized, cost-free run (a very high Sortino ratio and a return of 164% over 4 months). The console outputs the performance:
Total return: 1.6415993234216582
Sortino coef: 30.971434947620118
Max drawdown: -0.05125165292172551
Let’s set up the commission costs and the slippage model.
The performance is poor, and the equity (red line) decreases steadily. This usually happens when a strategy generates many signals with a low average profit per trade. The console outputs the performance:
Total return: -0.9160713719222552
Sortino coef: -11.718587056499238
Max drawdown: -0.914893278444377
We should try to reduce the number of trade signals and make sure the potential profit of each deal is high. I suggest increasing the min_spread value to 0.035, so that the spread is several times higher than the round-trip transaction costs. The z_signal_in value should also be higher, e.g. corresponding to a 99.99% interval. The timeframe could be changed to a larger value (e.g. hourly), while the analysis period stays the same (3 days).
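To make these numbers concrete, here is a small sketch of how the z-score threshold for a 99.99% interval and the window length in hourly bars could be derived. It assumes the interval is two-sided and that the spread is roughly normally distributed; the variable names other than `z_signal_in` are illustrative.

```python
from statistics import NormalDist

# Two-sided 99.99% interval of a standard normal distribution (assumption).
confidence = 0.9999
z_signal_in = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)

# A 3-day analysis period expressed in hourly bars.
analysis_days = 3
bars_per_day = 24
history_length = analysis_days * bars_per_day

print("z_signal_in:", round(z_signal_in, 2))
print("history length in hourly bars:", history_length)
```

Under these assumptions the entry threshold is a little below 4 standard deviations, so entries become rare events.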
This set of parameters achieves our goal. The number of signals is low (the yellow line represents the leverage used), and the algorithm shows a positive performance over 4 months:
Total return: 0.0946758967277288
Sortino coef: 8.399998343300492
Max drawdown: -0.028181546269574607
This step gives a more realistic picture of the developed algorithm. Let’s run the strategy on out-of-sample data (the last 2 months).
The performance is still good, metrics are close to backtesting values:
Total return: 0.040754467244888515
Sortino coef: 8.205062447014148
Max drawdown: -0.010029904921808908
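For readers who want to reproduce these three metrics from an equity curve, a minimal sketch follows. It is not the framework's own implementation: the annualization factor of 365 periods per year, the zero required return and the downside-deviation definition are assumptions, and the example equity values are made up.

```python
import math


def total_return(equity):
    """Relative change between the first and last equity values."""
    return equity[-1] / equity[0] - 1.0


def max_drawdown(equity):
    """Largest peak-to-trough decline, as a negative fraction."""
    peak, worst = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        worst = min(worst, value / peak - 1.0)
    return worst


def sortino_ratio(returns, required_return=0.0, periods_per_year=365):
    """Mean excess return over downside deviation, annualized."""
    excess = [r - required_return for r in returns]
    downside = [min(e, 0.0) for e in excess]
    downside_dev = math.sqrt(sum(d * d for d in downside) / len(returns))
    if downside_dev == 0.0:
        return math.inf  # no losing periods: the ratio is unbounded
    mean_excess = sum(excess) / len(excess)
    return mean_excess / downside_dev * math.sqrt(periods_per_year)


# Made-up equity curve, for illustration only.
equity = [100.0, 110.0, 99.0, 120.0]
returns = [equity[i] / equity[i - 1] - 1.0 for i in range(1, len(equity))]

print("Total return:", total_return(equity))
print("Sortino coef:", sortino_ratio(returns))
print("Max drawdown:", max_drawdown(equity))
```

Unlike the Sharpe ratio, the Sortino ratio penalizes only downside volatility, which is why it suits a strategy with occasional large gains.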
We can compare the results by Sortino ratio value.
An equity chart of the strategy is
Possible directions for further research:
- Carry out many experiments with different assets to build a reliable portfolio and tune the money management between them. This will yield more significant statistics, because the number of transactions will be greater.
- Experiment with cross currency pairs to reduce transaction costs (e.g. XMR/NEO instead of XMR/USD and NEO/USD).
- Apply the sequence of cointegration test, backtesting and forwarding to each pair in the portfolio to get more reliable performance in production mode. Parameters for tuning: length of history, p-value threshold, and the algorithm’s parameters.
- Create rules that stop the algorithm when the co-movement property is broken. If this is not anticipated, the result can be a disaster.
To summarize:
- We described the approach and created an algorithmic trading strategy.
- The algorithm shows a positive result in both the backtesting and forwarding tests; various performance metrics and charts were demonstrated.
- We suggested ways to improve this research.
- The source code is available on GitHub.