Correlation Arbitrage: Statistical Pairs Trading with Cointegration

From TradingHabits, the trading encyclopedia · 5 min read · March 1, 2026

Correlation arbitrage capitalizes on the mean-reverting nature of cointegrated asset pairs. This strategy identifies two securities moving together over time. A temporary divergence creates a trading opportunity. The expectation is that the spread between these assets will revert to its historical mean. This strategy suits liquid markets, often equities or ETFs.

Identifying Cointegrated Pairs

Identifying robust cointegrated pairs requires rigorous statistical analysis. Do not confuse correlation with cointegration. Correlation measures the strength of the linear relationship between two variables. Cointegration implies a long-term equilibrium relationship: each price series may wander on its own, but some linear combination of the two (the spread) is stationary. Test for this with the Engle-Granger two-step procedure: regress one asset on the other, then apply the Augmented Dickey-Fuller (ADF) test to the residual spread. A p-value below 0.05 rejects the unit-root hypothesis, which supports stationarity of the spread and therefore cointegration. Because the spread comes from an estimated regression, use critical values tabulated for cointegration tests rather than the standard ADF tables. Select highly liquid assets from the same sector or industry. Daily data provides sufficient historical context. As a preliminary screen, look for pairs with a correlation coefficient above 0.8 over a 252-day lookback period. Correlation is only a filter, not the sole criterion.
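The screening steps can be sketched in plain Python. This is a simplified illustration, not a production test: it uses a Dickey-Fuller regression with no lagged differences (so it is not fully "augmented"), applies the correlation filter to price levels for brevity, and compares the statistic with an approximate 5% Engle-Granger critical value of -3.34 for two series. The function names and the synthetic data are assumptions made for the example. In practice, a library routine such as statsmodels.tsa.stattools.coint handles lag selection and p-values.

```python
import math
import random

EG_CRIT_5PCT = -3.34  # approximate asymptotic 5% value, two series, with intercept


def ols(x, y):
    """Least-squares fit of y = alpha + beta * x; returns (alpha, beta)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    beta = sxy / sxx
    return my - beta * mx, beta


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def engle_granger_tstat(x, y):
    """Step 1: regress y on x. Step 2: Dickey-Fuller t-statistic on the residuals."""
    alpha, beta = ols(x, y)
    resid = [b - alpha - beta * a for a, b in zip(x, y)]
    lagged = resid[:-1]
    delta = [resid[i + 1] - resid[i] for i in range(len(lagged))]
    # Regression without intercept: delta_t = gamma * resid_{t-1} + u_t
    sll = sum(e * e for e in lagged)
    gamma = sum(d * e for d, e in zip(delta, lagged)) / sll
    sse = sum((d - gamma * e) ** 2 for d, e in zip(delta, lagged))
    se = math.sqrt(sse / (len(delta) - 1) / sll)
    return gamma / se, beta


# Synthetic pair (illustrative only): x is a random walk, y tracks 1.1 * x plus noise.
rng = random.Random(0)
x = [50.0]
for _ in range(499):
    x.append(x[-1] + rng.gauss(0, 1))
y = [2.0 + 1.1 * a + rng.gauss(0, 0.5) for a in x]

corr = pearson(x, y)
t_stat, beta = engle_granger_tstat(x, y)
passes = corr > 0.8 and t_stat < EG_CRIT_5PCT
print(f"corr={corr:.3f} beta={beta:.3f} t={t_stat:.2f} cointegrated={passes}")
```

A more negative t-statistic is stronger evidence against a unit root in the residuals. The correlation check runs first only because it is cheap; the residual test is what decides whether the pair qualifies.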

Constructing the Spread

Form the spread as a linear combination of the two assets. The common method regresses one asset (Y) on the other (X). The regression equation is Y = α + βX + ε, where β is the hedge ratio and ε is the residual. The residual series, ε = Y - α - βX, is the spread. This spread should be stationary. If it is not, the pair is not cointegrated. Standardize the spread for easier interpretation: calculate its mean and standard deviation over a lookback period, typically 60-120 days, and express each observation as a z-score, z = (spread - mean) / standard deviation. This establishes the historical range for entry and exit signals.
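A minimal sketch of the standardization step, assuming the spread has already been computed as a list of residuals. The function name rolling_zscore is hypothetical, and the choice to exclude the current observation from its own window (to avoid look-ahead in a backtest) is an assumption rather than something the text prescribes.

```python
import statistics


def rolling_zscore(spread, lookback=60):
    """z-score of each point against the preceding `lookback` observations.

    Entries without a full window (or with zero dispersion) are None.
    """
    z = [None] * len(spread)
    for t in range(lookback, len(spread)):
        window = spread[t - lookback:t]   # excludes spread[t] itself
        mu = statistics.fmean(window)
        sd = statistics.stdev(window)     # sample standard deviation
        if sd > 0:
            z[t] = (spread[t] - mu) / sd
    return z


# Tiny illustration with a 4-observation lookback.
demo = rolling_zscore([1.0, 3.0, 1.0, 3.0, 4.0], lookback=4)
print(demo)
```

With daily data, a lookback of 60 to 120 matches the range given above. A shorter window adapts faster but makes the thresholds noisier.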

Entry Rules

Entry signals trigger when the spread deviates significantly from its mean. Use standard deviations as thresholds. A common entry rule: short the overperforming asset and go long the underperforming asset when the spread moves 2 standard deviations away from its mean. For example, if the spread (Y - βX) rises 2 standard deviations above its mean, short Y and go long X, sized at β units of X per unit of Y so that the position matches the spread. If the spread falls 2 standard deviations below its mean, go long Y and short X in the same ratio. The trade assumes the spread will revert to its mean. Confirm the deviation with other technical indicators, such as Bollinger Bands applied to the spread. Volume confirmation adds conviction. Look for increasing volume on the divergence day.
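The entry rule maps directly onto a small function. This is a sketch under stated assumptions: the input is the spread's z-score, beta is a positive hedge ratio from the regression, and the name entry_legs is hypothetical. Indicator and volume confirmation are left out.

```python
from typing import Optional, Tuple


def entry_legs(z: float, beta: float, entry_z: float = 2.0) -> Optional[Tuple[float, float]]:
    """Return (units of Y, units of X) for a new trade, or None if there is no signal.

    Positive numbers are long positions, negative numbers are short positions.
    """
    if z >= entry_z:
        return (-1.0, beta)    # spread is rich: short Y, long beta units of X
    if z <= -entry_z:
        return (1.0, -beta)    # spread is cheap: long Y, short beta units of X
    return None


print(entry_legs(2.3, beta=1.1))
print(entry_legs(-2.0, beta=1.1))
print(entry_legs(1.2, beta=1.1))
```

Returning the legs per unit of spread keeps the signal separate from position sizing, which can then scale both legs by the same factor.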

Exit Rules

Exit when the spread reverts to its mean. Close both positions when the spread crosses back within 0.5 standard deviations of its historical mean. This captures the bulk of the mean reversion move. Alternatively, set a profit target. For instance, exit when the spread recovers 50% of its initial deviation. A time-based exit also applies. If the spread does not revert within a predetermined period (e.g., 10 trading days), close the positions. This prevents capital from being tied up in non-performing trades.
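The three exit conditions can be checked in order of priority. The sketch below assumes the trade is tracked by the z-score at entry and the number of trading days held. The function name, the returned labels, and the decision to treat a spread that overshoots through the mean as reverted are assumptions for illustration.

```python
from typing import Optional


def exit_reason(z: float, entry_z: float, days_held: int,
                exit_band: float = 0.5, retrace: Optional[float] = None,
                max_days: int = 10) -> Optional[str]:
    """Return why the trade should be closed, or None to keep holding."""
    side = 1.0 if entry_z > 0 else -1.0
    dist = z * side  # distance from the mean, measured on the side where the trade opened
    if dist <= exit_band:
        return "mean_reversion"
    if retrace is not None and dist <= abs(entry_z) * (1.0 - retrace):
        return "profit_target"
    if days_held >= max_days:
        return "time_stop"
    return None


print(exit_reason(0.4, entry_z=2.1, days_held=3))                # mean_reversion
print(exit_reason(1.0, entry_z=2.1, days_held=3, retrace=0.5))   # profit_target
print(exit_reason(1.5, entry_z=2.1, days_held=10))               # time_stop
```

With retrace=0.5 and an entry near 2 standard deviations, the profit target sits about 1 standard deviation from the mean, so it fires earlier than the 0.5 band.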

Risk Management

Position sizing is paramount. Allocate a fixed percentage of capital per trade, typically 0.5% to 1%. Use a stop-loss based on the spread's behavior. A common stop-loss level is 3 standard deviations from the mean. If the spread continues to diverge beyond this point, the cointegration relationship may have broken down. Cut losses swiftly.

Monitor the correlation and cointegration of the pair regularly. Re-evaluate the relationship every month. A decaying correlation or non-stationary spread invalidates the strategy.

Diversify across multiple pairs to reduce idiosyncratic risk. Do not over-concentrate on a single sector. Limit the number of open pairs trades to manage overall portfolio risk. Ensure sufficient liquidity in both assets to execute trades without significant slippage. Monitor news events affecting either asset or their respective sector. Unexpected news can break cointegration. Adjust positions or exit if fundamental changes occur.
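One way to connect the sizing rule to the stop-loss is sketched below. It rests on an interpretation the text does not spell out: the 0.5% to 1% figure is treated as capital at risk, meaning the loss taken if the spread moves from the entry threshold to the stop. The function names are hypothetical, and gaps through the stop, slippage, and costs are ignored.

```python
def spread_units(capital: float, risk_fraction: float, spread_sd: float,
                 entry_z: float = 2.0, stop_z: float = 3.0) -> float:
    """Units of the spread (1 unit of Y against beta units of X) to trade.

    Sized so that a move from entry_z to stop_z loses about
    capital * risk_fraction.
    """
    loss_per_unit = (stop_z - entry_z) * spread_sd
    return capital * risk_fraction / loss_per_unit


def stop_hit(z: float, stop_z: float = 3.0) -> bool:
    """True once the spread is at least stop_z standard deviations from its mean."""
    return abs(z) >= stop_z


# Illustrative numbers: 100,000 of capital, 1% at risk, spread SD of 0.80.
units = spread_units(100_000, 0.01, spread_sd=0.80)
print(round(units))        # about 1250 units of the spread
print(stop_hit(-3.2))
```

If the percentage is instead meant as notional capital allocated per trade, the calculation changes to dividing the allocated amount by the cost of one unit of the pair.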

Practical Application

Consider a pair of energy-sector ETFs, XLE and OIH. Suppose that over a 252-day period their correlation is 0.85, a linear regression of OIH on XLE yields a beta of 1.1, and the residual series (OIH - 1.1 * XLE, net of the regression intercept) passes the ADF test with a p-value of 0.01. This supports cointegration. The 60-day mean of the spread is 0.05, with a standard deviation of 0.80. If the spread rises to 1.65 (+2 standard deviations), short OIH and go long XLE at 1.1 units of XLE per unit of OIH. If it falls to -1.55 (-2 standard deviations), go long OIH and short XLE in the same ratio. Exit when the spread returns to the range of -0.35 to 0.45 (within 0.5 standard deviations of the mean). Set a stop-loss at 2.45 or -2.35 (3 standard deviations).

This systematic approach allows for repeatable execution. Automation helps manage multiple pairs efficiently. Backtest the strategy rigorously on historical data before live deployment. Optimize lookback periods and standard deviation thresholds for specific market conditions. Account for transaction costs and slippage in backtesting. Real-world trading involves these frictions.
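The threshold arithmetic in the XLE/OIH example can be reproduced with a few lines. The helper name spread_levels is hypothetical, and the mean of 0.05 and standard deviation of 0.80 are the illustrative figures from the example, not measured values.

```python
def spread_levels(mean: float, sd: float,
                  entry_z: float = 2.0, exit_z: float = 0.5, stop_z: float = 3.0) -> dict:
    """Lower and upper spread levels for entry, exit, and stop-loss."""
    def band(k: float) -> tuple:
        return (round(mean - k * sd, 2), round(mean + k * sd, 2))

    return {"entry": band(entry_z), "exit": band(exit_z), "stop": band(stop_z)}


levels = spread_levels(mean=0.05, sd=0.80)
for name, (low, high) in levels.items():
    print(f"{name}: {low} to {high}")
```

Recomputing the levels from the rolling mean and standard deviation each day keeps the thresholds aligned with the current lookback window.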