From Theory to Practice: A Practical Implementation of a Genetic Algorithm for Trading in Python
We have explored the theory behind genetic algorithms and their application to trading strategy optimization. Now, it is time to roll up our sleeves and get our hands dirty with some code. In this article, we will provide a step-by-step guide to implementing a simple genetic algorithm for trading in Python. We will use this GA to optimize a moving average crossover strategy for trading Apple stock (AAPL). By the end of this tutorial, you will have a working GA that you can use as a starting point for your own quantitative trading projects.
The Basic Components of a Genetic Algorithm
As a quick refresher, a genetic algorithm has four basic components:
- Population Initialization: We start by creating a population of random individuals. In our case, an individual will be a trading strategy, represented by a set of parameters.
- Selection: We evaluate the fitness of each individual in the population and select the fittest individuals to be the “parents” of the next generation.
- Crossover: We combine the genes of two parents to create a new child. This is the primary way that the GA explores the search space.
- Mutation: We make a small, random change to a gene. This introduces new genetic material into the population and helps to prevent premature convergence.
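To make the four steps concrete before any market data is involved, here is a minimal sketch on a toy problem: finding the integer between 0 and 20 that maximizes a made-up objective peaking at 7. The names toy_fitness and run_toy_ga, the midpoint crossover, and the parameter defaults are illustrative assumptions, not part of the trading implementation that follows.

```python
import random

def toy_fitness(x):
    # Hypothetical objective for illustration: highest at x = 7
    return -(x - 7) ** 2

def run_toy_ga(pop_size=20, generations=30, mutation_rate=0.2, seed=0):
    rng = random.Random(seed)
    # 1. Population initialization: random integers in [0, 20]
    population = [rng.randint(0, 20) for _ in range(pop_size)]
    best = max(population, key=toy_fitness)
    for _ in range(generations):
        # 2. Selection: the fittest of 3 random individuals becomes a parent
        parents = [max(rng.sample(population, 3), key=toy_fitness)
                   for _ in range(pop_size)]
        # 3. Crossover: each child is the integer midpoint of two neighboring parents
        children = [(parents[i] + parents[i - 1]) // 2 for i in range(pop_size)]
        # 4. Mutation: occasionally nudge a child by +/-1, clamped to [0, 20]
        children = [
            min(20, max(0, c + rng.choice([-1, 1]))) if rng.random() < mutation_rate else c
            for c in children
        ]
        population = children
        # Remember the best individual seen in any generation
        best = max(population + [best], key=toy_fitness)
    return best

print(run_toy_ga())
```

The trading version below follows the same loop, with a pair of moving average lengths in place of the single integer and a backtest in place of toy_fitness.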
The Trading Strategy: A Simple Moving Average Crossover
For this tutorial, we will be optimizing a simple moving average (SMA) crossover strategy. This strategy has two parameters:
- short_ma: The length of the short-term moving average.
- long_ma: The length of the long-term moving average.
The trading rules are as follows:
- Buy Signal: When the short-term moving average crosses above the long-term moving average.
- Sell Signal: When the short-term moving average crosses below the long-term moving average.
Our goal is to use a genetic algorithm to find the optimal values for short_ma and long_ma.
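The buy and sell rules can be illustrated on a handful of prices. The sketch below uses made-up closing prices and deliberately short windows (2 and 4) so that exactly one buy and one sell crossover occur; the variable names are assumptions for this example only.

```python
import pandas as pd

# Hypothetical closing prices, chosen so that one buy and one sell crossover occur
close = pd.Series([10, 9, 8, 9, 11, 13, 12, 9, 7, 6], dtype=float)

short_ma = close.rolling(window=2).mean()
long_ma = close.rolling(window=4).mean()

# 1 while the short MA is above the long MA, else 0.
# Rows where either average is still NaN (the warm-up period) compare as False.
signal = (short_ma > long_ma).astype(int)

# +1 marks a buy (short MA crosses above), -1 marks a sell (short MA crosses below)
crossings = signal.diff().fillna(0).astype(int)

buy_days = crossings[crossings == 1].index.tolist()
sell_days = crossings[crossings == -1].index.tolist()
print("Buy on day(s):", buy_days)
print("Sell on day(s):", sell_days)
```

With these prices the short average first exceeds the long average on day 4 and drops back below it on day 7, so the strategy would hold the stock from day 4 to day 7.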
The Python Implementation
Now, let's get to the code. We will be using the pandas library for data manipulation and the numpy library for numerical operations. We will also use the yfinance library to download historical stock data.
First, let's install the necessary libraries:
pip install pandas numpy yfinance
Now, let's write the Python code for our genetic algorithm. We will break it down into the four main components.
1. Population Initialization
We will start by creating a function to initialize our population. Each individual in the population will be a dictionary with two keys: short_ma and long_ma.
import numpy as np
def initialize_population(population_size, search_space):
    # Each individual is a dict of strategy parameters drawn uniformly from the search space.
    # np.random.randint excludes the upper bound, so [5, 50] yields values from 5 to 49.
    population = []
    for _ in range(population_size):
        individual = {
            'short_ma': np.random.randint(search_space['short_ma'][0], search_space['short_ma'][1]),
            'long_ma': np.random.randint(search_space['long_ma'][0], search_space['long_ma'][1]),
        }
        population.append(individual)
    return population
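One thing to be aware of: with the search space used later in this article ([5, 50] for short_ma and [20, 200] for long_ma), the two ranges overlap, so a random individual can end up with short_ma greater than or equal to long_ma. A possible variant, sketched here under the hypothetical name initialize_valid_population, simply redraws such pairs. It uses NumPy's Generator API with an optional seeded generator, which is an assumption of this sketch rather than something the original function does.

```python
import numpy as np

def initialize_valid_population(population_size, search_space, rng=None):
    """Like initialize_population, but every individual satisfies short_ma < long_ma."""
    if rng is None:
        rng = np.random.default_rng()
    short_lo, short_hi = search_space['short_ma']
    long_lo, long_hi = search_space['long_ma']
    population = []
    while len(population) < population_size:
        # Generator.integers excludes the upper bound, like np.random.randint
        short_ma = int(rng.integers(short_lo, short_hi))
        long_ma = int(rng.integers(long_lo, long_hi))
        if short_ma < long_ma:  # reject degenerate pairs and draw again
            population.append({'short_ma': short_ma, 'long_ma': long_ma})
    return population

search_space = {'short_ma': [5, 50], 'long_ma': [20, 200]}
population = initialize_valid_population(10, search_space, rng=np.random.default_rng(42))
print(population[:3])
```

Passing a seeded generator makes a run reproducible, which helps when comparing GA settings.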
2. Fitness Evaluation
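Next, we need a function to evaluate the fitness of each individual. For this, we will need to backtest the moving average crossover strategy with the given parameters. We will use the annualized Sharpe ratio as our fitness metric: the mean daily return divided by the standard deviation of daily returns, scaled by the square root of 252 trading days, with the risk-free rate taken as zero.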
import pandas as pd
import yfinance as yf
def evaluate_fitness(individual, data):
    # Some yfinance versions return 'Close' as a one-column DataFrame; squeeze() makes it a Series
    close = data['Close'].squeeze()
    short_ma = close.rolling(window=individual['short_ma']).mean()
    long_ma = close.rolling(window=individual['long_ma']).mean()
    # Signal is 1.0 while the short MA is above the long MA, else 0.0.
    # During the warm-up period the averages are NaN and the comparison is False.
    signals = pd.DataFrame(index=data.index)
    signals['signal'] = np.where(short_ma > long_ma, 1.0, 0.0)
    signals['positions'] = signals['signal'].diff()  # +1 on a buy crossover, -1 on a sell crossover
    # Backtest: hold 100 shares whenever the signal is on
    initial_capital = 100000.0
    positions = pd.DataFrame(index=signals.index)
    positions['AAPL'] = 100 * signals['signal']
    pos_diff = positions.diff()  # shares bought (+) or sold (-) each day
    portfolio = pd.DataFrame(index=signals.index)
    portfolio['holdings'] = positions.multiply(close, axis=0).sum(axis=1)
    portfolio['cash'] = initial_capital - pos_diff.multiply(close, axis=0).sum(axis=1).cumsum()
    portfolio['total'] = portfolio['cash'] + portfolio['holdings']
    portfolio['returns'] = portfolio['total'].pct_change()
    # Annualized Sharpe ratio (risk-free rate assumed to be zero)
    std = portfolio['returns'].std()
    if not std > 0:
        # The strategy never traded (or std is NaN): rank it last instead of dividing by zero
        return -np.inf
    sharpe_ratio = np.sqrt(252) * (portfolio['returns'].mean() / std)
    return sharpe_ratio
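The Sharpe ratio calculation at the end of the fitness function can be looked at in isolation. The following is a small standard-library sketch, assuming a zero risk-free rate and 252 trading days per year; the helper name annualized_sharpe and the choice to return negative infinity for a flat return series are assumptions of this example.

```python
import math
import statistics

def annualized_sharpe(daily_returns, periods_per_year=252):
    """Annualized Sharpe ratio with the risk-free rate assumed to be zero."""
    if len(daily_returns) < 2:
        return float('-inf')  # not enough data to estimate volatility
    # Sample standard deviation (n - 1 in the denominator), the same default as pandas' .std()
    std = statistics.stdev(daily_returns)
    if std == 0:
        # Flat equity curve: rank it last instead of dividing by zero
        return float('-inf')
    return math.sqrt(periods_per_year) * statistics.mean(daily_returns) / std

# Two hypothetical daily returns of 1% and 3%
print(annualized_sharpe([0.01, 0.03]))
# A strategy that never trades has constant (zero) returns
print(annualized_sharpe([0.0, 0.0, 0.0]))
```

For the two-day example the mean is 0.02 and the sample standard deviation is 0.01 times the square root of 2, so the ratio works out to the square root of 504, roughly 22.45. A value this high is an artifact of the tiny sample, which is a reminder that Sharpe ratios computed from few observations are unreliable.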
3. Selection
For our selection function, we will use a simple tournament selection. In a tournament selection, we randomly select a few individuals from the population and then choose the fittest of those individuals to be a parent.
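def selection(population, fitnesses, tournament_size):
    # Run one tournament per slot so the parent pool has the same size as the population
    selected = []
    for _ in range(len(population)):
        # Draw tournament entrants at random (with replacement)
        tournament_indices = np.random.randint(0, len(population), tournament_size)
        tournament_fitnesses = [fitnesses[i] for i in tournament_indices]
        # The entrant with the highest fitness wins and becomes a parent
        winner_index = tournament_indices[np.argmax(tournament_fitnesses)]
        selected.append(population[winner_index])
    return selected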
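The selection function above draws tournament entrants with np.random.randint, which samples with replacement, so the same individual can appear more than once in a single tournament. A possible variant, sketched here under the hypothetical name tournament_select, draws distinct entrants instead. The population and fitness values are made up for illustration.

```python
import numpy as np

def tournament_select(population, fitnesses, tournament_size, rng):
    """Tournament selection in which each tournament draws distinct entrants."""
    selected = []
    for _ in range(len(population)):
        # replace=False: an individual cannot face itself in a tournament
        entrants = rng.choice(len(population), size=tournament_size, replace=False)
        winner = max(entrants, key=lambda i: fitnesses[i])
        selected.append(population[winner])
    return selected

# Six hypothetical individuals with made-up Sharpe ratios
population = [{'short_ma': s, 'long_ma': 100} for s in (5, 10, 15, 20, 25, 30)]
fitnesses = [0.1, 0.9, -0.3, 0.5, 1.2, -0.8]

parents = tournament_select(population, fitnesses, tournament_size=3,
                            rng=np.random.default_rng(0))
print([p['short_ma'] for p in parents])
```

With three distinct entrants per tournament, the winner always beats at least two others, so the two weakest individuals (short_ma of 15 and 30 here) can never become parents. Larger tournaments increase this selection pressure, while smaller ones preserve more diversity.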
4. Crossover and Mutation
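Finally, we need our crossover and mutation functions. For crossover, we will use uniform crossover: the child inherits each parameter from one of the two parents, chosen with equal probability. For mutation, each parameter is independently replaced, with probability mutation_rate, by a new random value drawn from the search space.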
def crossover(parent1, parent2):
    # Uniform crossover: each gene comes from either parent with probability 0.5
    child = {}
    child['short_ma'] = parent1['short_ma'] if np.random.rand() < 0.5 else parent2['short_ma']
    child['long_ma'] = parent1['long_ma'] if np.random.rand() < 0.5 else parent2['long_ma']
    return child
def mutation(individual, search_space, mutation_rate):
    # Each gene is resampled from its full range with probability mutation_rate.
    # The individual is modified in place and also returned.
    if np.random.rand() < mutation_rate:
        individual['short_ma'] = np.random.randint(search_space['short_ma'][0], search_space['short_ma'][1])
    if np.random.rand() < mutation_rate:
        individual['long_ma'] = np.random.randint(search_space['long_ma'][0], search_space['long_ma'][1])
    return individual
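The mutation function above replaces a gene with a completely new random value, which can throw away a nearly optimal parameter. One alternative, shown here only as a sketch, is a small-step ("creep") mutation that shifts a gene by a few units and clamps it to the search space. The function name creep_mutation and the max_step parameter are hypothetical and not part of the implementation above.

```python
import numpy as np

def creep_mutation(individual, search_space, mutation_rate, max_step=5, rng=None):
    """Return a mutated copy: each gene may shift by up to +/-max_step, clamped to its range."""
    if rng is None:
        rng = np.random.default_rng()
    mutated = dict(individual)  # copy so the original individual is left untouched
    for gene, (low, high) in search_space.items():
        if rng.random() < mutation_rate:
            step = int(rng.integers(-max_step, max_step + 1))
            # Treat the upper bound as exclusive, matching np.random.randint in the article
            mutated[gene] = int(np.clip(mutated[gene] + step, low, high - 1))
    return mutated

search_space = {'short_ma': [5, 50], 'long_ma': [20, 200]}
parent = {'short_ma': 48, 'long_ma': 21}
child = creep_mutation(parent, search_space, mutation_rate=1.0,
                       rng=np.random.default_rng(1))
print(parent, child)
```

Small steps favor fine-tuning around good solutions, while full resampling is better at escaping poor regions of the search space, so the two are sometimes combined.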
Putting It All Together
Now, we can put all of these components together to create our genetic algorithm.
# Download data
data = yf.download('AAPL', start='2020-01-01', end='2023-01-01')
# GA parameters
population_size = 50  # must be even, because parents are paired off below
search_space = {'short_ma': [5, 50], 'long_ma': [20, 200]}
num_generations = 20
tournament_size = 5
mutation_rate = 0.1
# Initialize population
population = initialize_population(population_size, search_space)
# Track the best individual seen so far, since the population is replaced every generation
best_individual, best_fitness = None, -np.inf
# Main GA loop
for generation in range(num_generations):
    # Evaluate fitness
    fitnesses = [evaluate_fitness(ind, data) for ind in population]
    # Record this generation's best individual before the population is replaced
    best_index = int(np.argmax(fitnesses))
    if best_individual is None or fitnesses[best_index] > best_fitness:
        best_individual = dict(population[best_index])
        best_fitness = fitnesses[best_index]
    # Selection
    selected = selection(population, fitnesses, tournament_size)
    # Crossover and mutation
    next_population = []
    for i in range(0, len(selected), 2):
        parent1 = selected[i]
        parent2 = selected[i+1]
        child1 = crossover(parent1, parent2)
        child2 = crossover(parent2, parent1)
        child1 = mutation(child1, search_space, mutation_rate)
        child2 = mutation(child2, search_space, mutation_rate)
        next_population.extend([child1, child2])
    population = next_population
    print(f"Generation {generation+1}: Best Fitness = {np.max(fitnesses)}")
# Report the best individual found in any generation
print(f"\nBest Individual: {best_individual} (Sharpe Ratio = {best_fitness:.3f})")
Conclusion
This has been a brief but practical introduction to implementing a genetic algorithm for trading in Python. We have built a working GA that can optimize a simple moving average crossover strategy. This is just the beginning, of course. The GA could be improved in many ways, such as adding more complex trading rules, using a more sophisticated fitness function, or implementing more advanced genetic operators. Still, this tutorial should give you a solid foundation to build upon. The world of quantitative trading is challenging but rewarding, and genetic algorithms are an effective tool to have in your arsenal.
