Statistical Arbitrage: A Cointegration Approach to Pairs Trading
Correlation is Not Cointegration: The Achilles’ Heel of Naive Pairs Trading
The simplest form of pairs trading involves finding two stocks whose prices have historically moved together and then betting that this relationship will continue in the future. The trader calculates the correlation between the two stocks and, if it is high, treats them as a pair. When the spread between the two stocks widens, the trader shorts the outperforming stock and buys the underperforming stock. When the spread narrows, the trader closes the position for a profit. This is a simple and intuitive strategy, but it is also deeply flawed. The problem is that correlation is a poor measure of a long-term relationship: it describes how closely the two stocks co-move over the sample period, not whether their price levels are tied together. Two stocks can be highly correlated for a period of time while their prices drift steadily apart, and there is no guarantee that the correlation itself will persist. If the underlying economic relationship between the two companies changes, the correlation can break down and the pairs trade will fail.
A more robust approach to pairs trading is to use the concept of cointegration. Cointegration is a statistical property of two or more time series which indicates that they have a long-run equilibrium relationship. Even though the individual series may be non-stationary (i.e., they have a random walk component), a linear combination of them is stationary. This stationary linear combination is the ‘spread’. As long as the cointegrating relationship holds, deviations of the spread from its long-run equilibrium are temporary and tend to revert. This is a much more reliable basis for a pairs trade than simple correlation.
Testing for Cointegration: The Engle-Granger Two-Step Method
So how do we test for cointegration? The most common method is the Engle-Granger two-step method:
- Test for Stationarity: The first step is to confirm that the individual stock price series are non-stationary. This is done using a unit root test, such as the Augmented Dickey-Fuller (ADF) test. The null hypothesis of the ADF test is that the series has a unit root (i.e., it is non-stationary). Rejecting the null hypothesis indicates that the series is stationary; failing to reject it is consistent with a unit root, which is the outcome we need for both price series before moving on.
- Regress and Test the Residuals: If both stock price series are found to be non-stationary, the next step is to run a linear regression of one stock’s price on the other’s. The equation is:
Y_t = beta * X_t + e_t
where Y_t and X_t are the prices of the two stocks at time t, beta is the hedge ratio, and e_t is the residual. The residual represents the spread between the two stocks. The final step is to test whether the residual series is stationary using the ADF test. If the residual series is stationary, then the two stocks are cointegrated. Because the residuals come from an estimated regression, the test statistic should be compared against Engle-Granger critical values, which are more negative than the standard ADF ones.
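The two steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a production test: the data are simulated, the helper names (ols_slope, df_tstat) are made up for this example, and the unit root check is a simple Dickey-Fuller regression with no constant and no lagged differences rather than a full ADF test. In practice a statistics library's ADF or cointegration routine, with properly tabulated critical values, would normally be used instead.

```python
import math
import random


def ols_slope(x, y):
    """Slope of the no-intercept regression y = beta * x + e, as in the text."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def df_tstat(series):
    """t-statistic of rho in  delta_t = rho * s_{t-1} + u_t.

    Simplified Dickey-Fuller regression (no constant, no lagged differences).
    Strongly negative values are evidence against a unit root.
    """
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series[:-1], series[1:])]
    rho = ols_slope(lagged, diffs)
    resid = [d - rho * l for d, l in zip(diffs, lagged)]
    sigma2 = sum(r * r for r in resid) / (len(resid) - 1)
    se = math.sqrt(sigma2 / sum(l * l for l in lagged))
    return rho / se


# Simulated data (an assumption for illustration): X is a random walk,
# Y is 1.5 * X plus stationary AR(1) noise, so the pair is cointegrated.
random.seed(0)
n = 1000
x = [100.0]
noise = [0.0]
for _ in range(n - 1):
    x.append(x[-1] + random.gauss(0, 1))
    noise.append(0.5 * noise[-1] + random.gauss(0, 1))
y = [1.5 * xi + ei for xi, ei in zip(x, noise)]

# Step 1: unit root check on a price series (the same check applies to Y).
t_x = df_tstat(x)

# Step 2: regress Y on X, then test the residuals for stationarity.
beta = ols_slope(x, y)
residuals = [yi - beta * xi for xi, yi in zip(x, y)]
t_resid = df_tstat(residuals)

print(f"estimated hedge ratio beta: {beta:.4f}")
print(f"DF t-stat, price series X:  {t_x:.2f}")
print(f"DF t-stat, residuals:       {t_resid:.2f}")
```

The statistic for the residuals should be far more negative than the one for the price series. Whether it is negative enough to reject the unit root null has to be judged against the appropriate tabulated critical values, which this sketch does not include.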
Building a Cointegrated Pairs Trade
Once a cointegrated pair has been identified, the next step is to build the trade. The trade is constructed by going long one stock and short the other, with the hedge ratio equal to the beta from the cointegration regression: beta units of X are held against each unit of Y. The spread is then calculated as:
Spread_t = Y_t - beta * X_t
The trader will then monitor the spread and look for opportunities to enter a trade. A common strategy is to enter a trade when the spread deviates from its mean by a certain number of standard deviations. For example, the trader might go long the spread (buy Y and sell X) when the spread is two standard deviations below its mean, and go short the spread (sell Y and buy X) when the spread is two standard deviations above its mean. The trade is closed when the spread reverts to its mean.
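The entry and exit rule described above can be sketched as a small state machine. This is an illustrative sketch only: the function name zscore_positions, the convention of +1 for long the spread and -1 for short the spread, and the sample numbers are assumptions made for this example. The mean and standard deviation are passed in explicitly on the assumption that they were estimated on an earlier window, so the rule does not peek at future data.

```python
def zscore_positions(spread, mean, std, entry_z=2.0):
    """Return one position per observation.

    +1 = long the spread (buy Y, sell X), -1 = short the spread, 0 = flat.
    Enter when the spread is entry_z standard deviations from its mean,
    exit when it has reverted to the mean.
    """
    position = 0
    positions = []
    for s in spread:
        z = (s - mean) / std
        if position == 0:
            if z <= -entry_z:
                position = 1      # spread unusually low: go long the spread
            elif z >= entry_z:
                position = -1     # spread unusually high: go short the spread
        elif position == 1 and z >= 0:
            position = 0          # long trade: spread is back at its mean
        elif position == -1 and z <= 0:
            position = 0          # short trade: spread is back at its mean
        positions.append(position)
    return positions


# Hypothetical spread values, already centred on 0 with unit standard deviation.
spread = [0.0, -1.0, -2.5, -1.0, 0.2, 1.0, 2.1, 1.0, -0.1, 0.0]
positions = zscore_positions(spread, mean=0.0, std=1.0)
print(positions)  # [0, 0, 1, 1, 0, 0, -1, -1, 0, 0]
```

The position is held while the spread is between the entry threshold and the mean, which is why the long opened at -2.5 stays open at -1.0 and closes only once the spread crosses back above zero.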
The Risks of Cointegration-Based Pairs Trading
While a cointegration-based approach to pairs trading is more robust than a correlation-based approach, it is not without its risks. The main risks are:
- The Relationship Breaks Down: The cointegrating relationship between two stocks is not guaranteed to last forever. There could be a structural break in the relationship due to a merger, a new technology, or a change in the competitive landscape. If the relationship breaks down, the spread may not revert to its mean and the trade will lose money.
- The Half-Life of the Spread: The half-life is the expected time it takes for a deviation of the spread from its mean to shrink by half, so it measures how quickly the spread reverts. A shorter half-life is generally better, as it means that the trade should pay off more quickly and the trader will be exposed to less risk. The half-life can be estimated by fitting a first-order autoregression to the spread, that is, to the residuals of the cointegration regression. If the half-life is too long, the trade may not be worth taking.
- Execution Risk: Pairs trading requires the simultaneous execution of two trades. This can be difficult to do, especially in a fast-moving market. There is a risk that the trader will get a bad fill on one or both legs of the trade, which will eat into the profits.
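One common way to put a number on the half-life mentioned above is to fit a first-order autoregression to the spread, s_t = c + phi * s_{t-1} + u_t, and convert the estimated phi into a half-life of ln(0.5) / ln(phi) observations. The sketch below assumes this AR(1) model; the function name half_life, the handling of the edge cases, and the simulated spread are choices made for this example.

```python
import math
import random


def half_life(spread):
    """Half-life in observations, from an AR(1) fit s_t = c + phi * s_{t-1} + u_t."""
    lagged = spread[:-1]
    current = spread[1:]
    mean_l = sum(lagged) / len(lagged)
    mean_c = sum(current) / len(current)
    cov = sum((l - mean_l) * (c - mean_c) for l, c in zip(lagged, current))
    var = sum((l - mean_l) ** 2 for l in lagged)
    phi = cov / var               # OLS slope of s_t on s_{t-1}
    if phi >= 1.0:
        return math.inf           # no mean reversion detected under this model
    if phi <= 0.0:
        return 0.0                # deviation is gone (or overshoots) within one step
    return math.log(0.5) / math.log(phi)


# A deviation that halves every step has a half-life of exactly one observation.
exact = half_life([8.0, 4.0, 2.0, 1.0, 0.5])
print(exact)  # 1.0

# Simulated spread with phi = 0.9 (an assumption for illustration);
# the theoretical half-life is ln(0.5) / ln(0.9), about 6.6 observations.
random.seed(1)
simulated = [0.0]
for _ in range(5000):
    simulated.append(0.9 * simulated[-1] + random.gauss(0, 1))
estimated = half_life(simulated)
print(f"estimated half-life: {estimated:.2f} observations")
```

The estimate is in units of the sampling interval, so a half-life of about 6.6 on daily data means roughly six to seven trading days. It is only as reliable as the AR(1) assumption and the sample it was fitted on.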
Pairs trading is a classic statistical arbitrage strategy that has been used by hedge funds for decades. While it is not a risk-free strategy, a rigorous, cointegration-based approach can provide a solid foundation for building a profitable and scalable trading strategy.
