In this guide, we're going to discuss how to use Python for portfolio optimization.
The following guide is based on notes from this course on Python for Finance and Algorithmic Trading and is organized as follows:
- Sharpe Ratio
- Portfolio Allocation
- Portfolio Statistics
- Portfolio Optimization
In our previous articles on Python for Finance, we've focused on analyzing individual stocks, but we will now shift our focus to the more realistic scenario of managing a portfolio of assets.
1. Sharpe Ratio
Developed by Nobel Laureate William F. Sharpe, the Sharpe Ratio is a measure for calculating risk-adjusted return and has been the industry standard for such calculations.
The Sharpe Ratio quantifies the average return earned in excess of the risk-free rate per unit of volatility, or total risk.
The formula for the Sharpe ratio is provided below:
$$Sharpe = \frac{R_p - R_f}{\sigma_p}$$
where:
- $R_p$ = portfolio return
- $R_f$ = risk-free rate
- $\sigma_p$ = standard deviation of the portfolio's excess return
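As a quick worked example of the formula, the sketch below plugs in made-up annual figures. The numbers and variable names are illustrative assumptions, not values from the data used later in this guide.

```python
# Worked example of the Sharpe ratio formula with hypothetical annual figures
portfolio_return = 0.10   # R_p: assumed annual portfolio return
risk_free_rate = 0.02     # R_f: assumed annual risk-free rate
volatility = 0.16         # sigma_p: assumed annual standard deviation

sharpe = (portfolio_return - risk_free_rate) / volatility
print(round(sharpe, 2))   # 0.5
```

In other words, this hypothetical portfolio earns half a unit of excess return for every unit of risk taken.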
Let's look at how we can use Python for portfolio allocation with the Sharpe ratio.
We'll import Pandas and Quandl, and will grab the adjusted close column for FB, AMZN, AAPL, and IBM for 2018.
import pandas as pd
import quandl as q
Let's now get the cumulative return for 2018, which is also known as normalizing a price.
We're going to create a new column in each stock dataframe called Normed Return:
for stock_df in (fb, amzn, aapl, ibm):
    stock_df['Normed Return'] = stock_df['Adj. Close'] / stock_df.iloc[0]['Adj. Close']
To get the normalized return we take the adjusted close column and divide it by the initial price in the period.
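To make the normalization step concrete, here is a minimal sketch on a tiny, made-up price series. The prices and dates are hypothetical; only the column names mirror the ones used in this guide.

```python
import pandas as pd

# Hypothetical adjusted close prices for one stock (not real market data)
prices = pd.DataFrame(
    {"Adj. Close": [100.0, 102.0, 99.0, 105.0]},
    index=pd.date_range("2018-01-02", periods=4, freq="B"),
)

# Divide every price by the first price so the series starts at 1.0
prices["Normed Return"] = prices["Adj. Close"] / prices["Adj. Close"].iloc[0]
print(prices["Normed Return"].tolist())  # [1.0, 1.02, 0.99, 1.05]
```

A value of 1.05 on the last day means the position is worth 5% more than it was at the start of the period.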
Here's what the normalized returns for FB look like:
2. Portfolio Allocation
Let's now implement a simple portfolio allocation. In this example we're only going to go long and will allocate:
- 20% in FB
- 40% in AMZN
- 30% in AAPL
- 10% in IBM
To do this we're going to use a for loop:
- We zip together the previous tuple of stock dataframes
- We pass in a list of the allocation percentages
- Using tuple unpacking we create an Allocation column for our stock_df equal to the Normed Return times the allocation
for stock_df, allo in zip((fb, amzn, aapl, ibm), [.2, .4, .3, .1]):
    stock_df['Allocation'] = stock_df['Normed Return'] * allo
We now get a better idea of what our returns are portfolio-wise.
Let's look at the value of our position in each stock, assuming we had an initial portfolio value of $1 million.
# value of each position
for stock_df in (fb, amzn, aapl, ibm):
    stock_df['Position Value'] = stock_df['Allocation'] * 1000000
Let's now create a portfolio DataFrame that has all of our position values for the stocks.
To do this we're going to:
- Create a list of all our position values
- Concatenate them and set axis=1
- Set the column names
- Add a total portfolio value column
# create list of all position values
all_pos_vals = [fb['Position Value'], amzn['Position Value'], aapl['Position Value'], ibm['Position Value']]
# concatenate the list of position values
portfolio_val = pd.concat(all_pos_vals, axis=1)
# set the column names
portfolio_val.columns = ['FB', 'AMZN', 'AAPL', 'IBM']
# add a total portfolio column
portfolio_val['Total'] = portfolio_val.sum(axis=1)
Now we can see day-by-day how our positions and portfolio value are changing.
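The allocation, position value, and total steps can also be sketched end to end on synthetic data. The two tickers, their normalized returns, and the 60/40 split below are assumptions chosen purely for illustration.

```python
import pandas as pd

# Hypothetical normalized returns for two stocks (each series starts at 1.0)
normed = pd.DataFrame(
    {"A": [1.00, 1.10, 1.20], "B": [1.00, 0.95, 1.05]},
    index=pd.date_range("2018-01-02", periods=3, freq="B"),
)
allocations = {"A": 0.6, "B": 0.4}   # assumed weights, summing to 1
initial_value = 1_000_000            # starting portfolio value in dollars

# Scale each column by its allocation and the initial value in one step
position_values = normed * pd.Series(allocations) * initial_value
position_values["Total"] = position_values.sum(axis=1)
print(position_values.round(2))
```

Multiplying the DataFrame by a Series aligns the Series index with the column names, which replaces the explicit loop over dataframes when all stocks already live in one table.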
Let's now plot out our portfolio - this will show us what the portfolio would have made for the year:
# plot our portfolio
import matplotlib.pyplot as plt
%matplotlib inline
portfolio_val['Total'].plot(figsize=(10,8))
We can see we would have made ~60k or ~6% for the year.
Let's look at how each position performed by dropping the Total column:
portfolio_val.drop('Total',axis=1).plot(figsize=(10,8))
3. Portfolio Statistics
Let's now look at a few statistics of our portfolio, in particular:
- Daily returns
- Average daily return
- Standard deviation
We're then going to use these statistics to calculate our portfolio's Sharpe ratio.
First, let's calculate our daily return:
# Daily Return
portfolio_val['Daily Return'] = portfolio_val['Total'].pct_change(1)
Now let's get our average daily return and standard deviation:
# average daily return
portfolio_val['Daily Return'].mean()
# standard deviation
portfolio_val['Daily Return'].std()
Let's plot a histogram of our daily returns:
# plot histogram of daily returns
portfolio_val['Daily Return'].plot(kind='hist', bins=50, figsize=(4,5))
Let's also calculate the total portfolio return, which is 6.3%:
# cumulative portfolio return
cum_return = 100 * (portfolio_val['Total'].iloc[-1] / portfolio_val['Total'].iloc[0] - 1)
Sharpe Ratio
As discussed, the Sharpe Ratio is a measure of risk-adjusted returns.
The Sharpe Ratio is the mean of (portfolio return - risk-free rate) divided by the standard deviation.
To keep things simple, we're going to say that the risk-free rate is 0%.
sharpe_ratio = portfolio_val['Daily Return'].mean() / portfolio_val['Daily Return'].std()
In this case, we see the Sharpe Ratio of our Daily Return is 0.078.
Keep in mind this ratio is generally intended to be a yearly measurement, so we're going to multiply it by the square root of 252 (the approximate number of trading days in a year) to get the annualized Sharpe ratio.
ASR = (252**0.5) * sharpe_ratio
We see the annualized Sharpe Ratio is 1.24.
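The daily return, Sharpe ratio, and annualization steps can be wrapped into one helper. The sketch below is an illustration only: the function name `annualized_sharpe`, its parameters, and the simulated portfolio values are assumptions rather than part of the original notebook.

```python
import numpy as np
import pandas as pd

def annualized_sharpe(values, risk_free_daily=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from a series of daily portfolio values."""
    daily_returns = values.pct_change().dropna()
    excess = daily_returns - risk_free_daily
    return np.sqrt(periods_per_year) * excess.mean() / excess.std()

# Hypothetical portfolio values generated from random daily returns
rng = np.random.default_rng(0)
simulated_returns = rng.normal(loc=0.0005, scale=0.01, size=252)
values = pd.Series(1_000_000 * np.cumprod(1 + simulated_returns))

print(annualized_sharpe(values))
```

With `risk_free_daily=0.0` this reduces to the same calculation as above: the mean daily return divided by its standard deviation, scaled by the square root of 252.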
So what is a good Sharpe Ratio?
Generally, a Sharpe Ratio above 1 is considered acceptable to investors (of course depending on risk-tolerance), a ratio of 2 is very good, and a ratio above 3 is considered to be excellent.
We're now going to look at how we can use the Sharpe Ratio to allocate our portfolio in a more optimal way.
4. Portfolio Optimization
So how can we optimize our portfolio's allocation?
One thing we could do is just check a bunch of random allocations and see which one has the best Sharpe Ratio.
This process of randomly guessing is known as a Monte Carlo Simulation.
What we're going to do is randomly assign a weight to each stock in our portfolio, and then calculate the mean daily return and standard deviation of return.
This allows us to calculate the Sharpe Ratio for many randomly selected allocations.
We're then going to plot the allocations on a chart that displays the return vs. the volatility, colored by the Sharpe Ratio.
What we're looking for is which random allocation has the best Sharpe Ratio.
One thing to note is that guessing and checking is not the most efficient way to optimize a portfolio—instead, we can use math to determine the optimal Sharpe Ratio for a given portfolio.
This is known as an optimization algorithm.
To understand optimization algorithms, we first need to understand the concept of minimization.
Minimization is a similar concept to optimization. Say we have a simple equation $y = x^2$: the idea is to figure out what value of $x$ will minimize $y$, in this example $x = 0$.
This idea of a minimizer will allow us to build an optimizer.
In our case we're trying to find a portfolio that maximizes the Sharpe Ratio, so we can create an optimizer that attempts to minimize the negative Sharpe Ratio.
In particular, we're going to use SciPy's built-in optimization algorithms to calculate the optimal weight for portfolio allocation, optimized for the Sharpe Ratio.
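To illustrate the idea on something simpler than a portfolio, the sketch below minimizes $y = x^2$ and then maximizes a second function by minimizing its negative. Both toy functions (`parabola` and `hill`) are made up for this example.

```python
from scipy.optimize import minimize

def parabola(x):
    # y = x^2, minimized at x = 0
    return x[0] ** 2

def hill(x):
    # y = -(x - 3)^2 + 5, maximized at x = 3
    return -(x[0] - 3) ** 2 + 5

# Minimizing the parabola directly
min_result = minimize(parabola, x0=[2.0])

# Maximizing the hill by minimizing its negative
max_result = minimize(lambda x: -hill(x), x0=[0.0])

print(min_result.x[0], max_result.x[0])
```

The first result should be very close to 0 and the second very close to 3, which is the same trick we'll apply to the Sharpe Ratio.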
Let's now code out portfolio optimization, first with a Monte Carlo simulation and then with an optimization algorithm.
First, let's read in all of our stocks from Quandl again, and then concatenate them together and rename the columns:
# get just the Adj Close column of FB, AMZN, AAPL, IBM
fb = q.get('WIKI/FB.11', start_date=start, end_date=end)
amzn = q.get('WIKI/AMZN.11', start_date=start, end_date=end)
aapl = q.get('WIKI/AAPL.11', start_date=start, end_date=end)
ibm = q.get('WIKI/IBM.11', start_date=start, end_date=end)
# concatenate them and rename the columns
stocks = pd.concat([fb,amzn,aapl,ibm], axis=1)
stocks.columns = ['fb','amzn','aapl','ibm']
Portfolio Optimization: Monte Carlo Simulation
In order to simulate thousands of possible allocations for our Monte Carlo simulation we'll be using a few statistics, one of which is the mean daily return:
# arithmetic mean daily return
stocks.pct_change(1).mean()
For the rest of this article, we're going to switch to using logarithmic returns instead of arithmetic returns.
The daily return arithmetically would be:
# arithmetic daily return
stocks.pct_change(1).head()
Let's look at how we'd get the logarithmic daily return:
# log daily return
import numpy as np
log_return = np.log(stocks/stocks.shift(1))
From these, we can see how close the arithmetic and log returns are, but logarithmic returns are a bit more convenient for some analysis techniques.
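One reason log returns are convenient is that they add up over time. The sketch below, which uses a short hypothetical price series, compares the two return types and checks that the sum of daily log returns equals the log of the total growth over the period.

```python
import numpy as np
import pandas as pd

# Hypothetical price series (not real market data)
prices = pd.Series([100.0, 101.0, 99.5, 102.0])

arith_return = prices.pct_change(1)               # (P_t / P_{t-1}) - 1
log_return = np.log(prices / prices.shift(1))     # ln(P_t / P_{t-1})

# For small daily moves the two are nearly identical
print(pd.DataFrame({"arithmetic": arith_return, "log": log_return}))

# Log returns add up over time: their sum equals the log of the total growth
print(log_return.sum(), np.log(prices.iloc[-1] / prices.iloc[0]))
```

Arithmetic returns do not have this additive property; they have to be compounded by multiplying (1 + return) terms together.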
Before we run thousands of random allocations, let's do a single random allocation. To do this we're going to:
- Set our weights to a random NumPy array
- Rebalance the weights so they add up to one
- Calculate the expected portfolio return
- Calculate the expected portfolio volatility
- Calculate the Sharpe Ratio
print(stocks.columns)
weights = np.array(np.random.random(4))
print('Random Weights:')
print(weights)
print('Rebalance')
weights = weights/np.sum(weights)
print(weights)
# expected return
print('Expected Portfolio Return')
exp_ret = np.sum((log_return.mean()*weights)*252)
print(exp_ret)
# expected volatility
print('Expected Volatility')
exp_vol = np.sqrt(np.dot(weights.T,np.dot(log_return.cov()*252, weights)))
print(exp_vol)
# Sharpe Ratio
print('Sharpe Ratio')
SR = exp_ret/exp_vol
print(SR)
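For reference, the three quantities computed in the code above can be written out as formulas. The symbols $w$ (the vector of weights), $\mu$ (the vector of mean daily log returns), and $\Sigma$ (the covariance matrix of daily log returns) are notation introduced here for illustration, and the risk-free rate is taken to be 0 as before:

$$R_p = 252 \sum_i w_i \mu_i$$

$$\sigma_p = \sqrt{w^T (252 \, \Sigma) \, w}$$

$$Sharpe = \frac{R_p}{\sigma_p}$$

The factor of 252 annualizes both the mean return and the covariance, so the resulting Sharpe Ratio is already on a yearly basis.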
Now let's take the above process and repeat it thousands of times.
To do this we're going to:
- Get rid of the print statements
- Set the number of portfolios to simulate - in this case num_ports = 5000
- Create an array all_weights to hold all the weights so we can save them
- Create an array ret_arr to hold all the returns
- Create an array vol_arr to hold all the volatility measurements
- Create an array sharpe_arr of the Sharpe Ratios we calculate
- Put the remaining code in a for loop
num_ports = 5000
all_weights = np.zeros((num_ports, len(stocks.columns)))
ret_arr = np.zeros(num_ports)
vol_arr = np.zeros(num_ports)
sharpe_arr = np.zeros(num_ports)
for ind in range(num_ports):
    # weights
    weights = np.array(np.random.random(4))
    weights = weights/np.sum(weights)
    # save the weights
    all_weights[ind,:] = weights
    # expected return
    ret_arr[ind] = np.sum((log_return.mean()*weights)*252)
    # expected volatility
    vol_arr[ind] = np.sqrt(np.dot(weights.T,np.dot(log_return.cov()*252, weights)))
    # Sharpe Ratio
    sharpe_arr[ind] = ret_arr[ind]/vol_arr[ind]
Let's now look at the maximum Sharpe Ratio we got:
We then get the location of the maximum Sharpe Ratio and look up the allocation at that index. This shows us the optimal allocation out of the 5000 random allocations:
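A minimal, self-contained sketch of looking up the maximum Sharpe Ratio and its allocation is shown below. It uses a vectorized variant of the simulation on hypothetical inputs: the mean returns, covariance matrix, seed, and number of portfolios are all assumptions, so the printed values will not match the results quoted in this guide.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical annualized mean returns and covariance for three assets
mean_returns = np.array([0.08, 0.12, 0.10])
cov_matrix = np.array([
    [0.040, 0.006, 0.004],
    [0.006, 0.090, 0.010],
    [0.004, 0.010, 0.060],
])

num_ports = 1000
all_weights = rng.random((num_ports, 3))
all_weights /= all_weights.sum(axis=1, keepdims=True)   # each row sums to 1

ret_arr = all_weights @ mean_returns
# w^T C w for every row of weights at once
vol_arr = np.sqrt(np.einsum("ij,jk,ik->i", all_weights, cov_matrix, all_weights))
sharpe_arr = ret_arr / vol_arr

best_idx = sharpe_arr.argmax()          # location of the maximum Sharpe Ratio
print(sharpe_arr.max())                 # best Sharpe Ratio found
print(all_weights[best_idx, :])         # allocation at that index
```

`argmax()` returns the position of the largest value, so indexing the weights array with it recovers the allocation that produced the best Sharpe Ratio.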
Let's now plot out the data. We're going to use Matplotlib's scatter functionality, passing in the volatility array and the return array and coloring the points by the Sharpe Ratio:
# plot the data
plt.figure(figsize=(12,8))
plt.scatter(vol_arr,ret_arr,c=sharpe_arr,cmap='plasma')
plt.colorbar(label='Sharpe Ratio')
plt.xlabel('Volatility')
plt.ylabel('Return')
Let's now put a red dot at the location of the maximum Sharpe Ratio.
To do this we're first going to get the maximum Sharpe Ratio return and the maximum Sharpe Ratio volatility at the optimal allocation index:
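# index of the maximum Sharpe Ratio (4988 in the original run; it changes with every random draw)
max_sr_idx = sharpe_arr.argmax()
max_sr_ret = ret_arr[max_sr_idx]
max_sr_vol = vol_arr[max_sr_idx]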
Next, we're going to plot this point on top of the scatter plot:
# plot the data
plt.figure(figsize=(12,8))
plt.scatter(vol_arr,ret_arr,c=sharpe_arr,cmap='plasma')
plt.colorbar(label='Sharpe Ratio')
plt.xlabel('Volatility')
plt.ylabel('Return')
# add a red dot for max_sr_vol & max_sr_ret
plt.scatter(max_sr_vol, max_sr_ret, c='red', s=50, edgecolors='black')
Portfolio Optimization: Optimization Algorithm
Let's now move on from random allocations to a mathematical optimization algorithm.
All of the heavy lifting for this optimization will be done with SciPy, so we just have to do a few things to set up the optimization function.
Let's start with a simple function that takes in weights and returns back an array consisting of returns, volatility, and the Sharpe Ratio.
- We define the function as get_ret_vol_sr and pass in weights
- We make sure that weights are a NumPy array
- We calculate the return, volatility, and Sharpe Ratio
- Return an array of return, volatility, and the Sharpe Ratio
def get_ret_vol_sr(weights):
    weights = np.array(weights)
    # annualized expected return and volatility from daily log returns
    ret = np.sum(log_return.mean() * weights) * 252
    vol = np.sqrt(np.dot(weights.T, np.dot(log_return.cov() * 252, weights)))
    sr = ret / vol
    return np.array([ret, vol, sr])
We're then going to import the minimize optimization algorithm from scipy.optimize:
from scipy.optimize import minimize
To use this function we need to create a few helper functions.
First, we're going to define neg_sharpe, which takes in weights and returns the element at index 2 of the array returned by get_ret_vol_sr (the Sharpe Ratio).
Recall that we want to minimize the negative Sharpe Ratio so we're going to multiply it by -1.
# minimize negative Sharpe Ratio
def neg_sharpe(weights):
    return get_ret_vol_sr(weights)[2] * -1
We're then going to define a function with constraints, as we can help our optimization with constraints—if we have constraints there are fewer things to check.
One of the constraints is called check_sum(). Recall that our allocations need to add up to one. This function returns 0 if the sum of the weights is 1; otherwise it returns how far the sum is from 1.
# check allocation sums to 1
def check_sum(weights):
    return np.sum(weights) - 1
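As a quick sanity check of this constraint function, the sketch below calls it on one valid and one invalid set of weights. Both weight lists are made-up examples.

```python
import numpy as np

def check_sum(weights):
    # 0 when the weights sum to 1, otherwise the signed distance from 1
    return np.sum(weights) - 1

print(check_sum([0.25, 0.25, 0.25, 0.25]))   # satisfies the constraint
print(check_sum([0.5, 0.4, 0.3, 0.1]))       # violates it: the weights sum to 1.3
```

An equality constraint asks the optimizer to drive this return value to zero, which is exactly the condition that the weights sum to one.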
Since we only have one constraint we're going to create a variable called cons, which is a tuple with a dictionary inside of it.
The dictionary's first entry is 'type':'eq', which says this is an equality constraint. The second entry is 'fun':check_sum, where we pass in the function itself:
# create constraint variable (the trailing comma makes it a one-element tuple)
cons = ({'type':'eq','fun':check_sum},)
We're then going to create a bounds variable. This takes in 4 tuples, one per stock, each giving the lower and upper bounds for that portfolio allocation weight: 0 and 1.
# create weight boundaries
bounds = ((0,1),(0,1),(0,1),(0,1))
Finally, we need to create an initial guess to start with, and an even distribution is usually a reasonable starting point:
# initial guess
init_guess = [0.25, 0.25, 0.25, 0.25]
Let's now put all of these into the minimization function.
First, we call minimize and pass in what we're trying to minimize (the negative Sharpe Ratio) along with our initial guess. We set the minimization method to SLSQP, and we set our bounds and constraints:
opt_results = minimize(neg_sharpe, init_guess, method='SLSQP', bounds=bounds, constraints=cons)
The optimal weights are stored in the x array, so we call opt_results.x, and with get_ret_vol_sr(opt_results.x) we can see that the best result we can get is a Sharpe Ratio of 3.38.
Since the best Sharpe Ratio from the random allocations was 2.89, we can clearly see the value in optimization algorithms.
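The comparison between random search and the optimizer can be reproduced on synthetic inputs. The sketch below is an illustration under stated assumptions: the mean returns, covariance matrix, seed, and the helper name `sharpe` are all hypothetical, so the numbers it prints will differ from the 2.89 and 3.38 reported above.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical annualized mean returns and covariance for four assets
mean_returns = np.array([0.10, 0.15, 0.12, 0.05])
cov_matrix = np.array([
    [0.040, 0.006, 0.004, 0.002],
    [0.006, 0.090, 0.010, 0.003],
    [0.004, 0.010, 0.060, 0.002],
    [0.002, 0.003, 0.002, 0.030],
])

def sharpe(weights):
    # Sharpe Ratio with a risk-free rate of 0
    weights = np.asarray(weights)
    ret = weights @ mean_returns
    vol = np.sqrt(weights @ cov_matrix @ weights)
    return ret / vol

# Monte Carlo: best Sharpe Ratio among random allocations
rng = np.random.default_rng(0)
random_weights = rng.random((2000, 4))
random_weights /= random_weights.sum(axis=1, keepdims=True)
mc_best = max(sharpe(w) for w in random_weights)

# SLSQP: minimize the negative Sharpe Ratio subject to weights summing to 1
result = minimize(
    lambda w: -sharpe(w),
    x0=[0.25, 0.25, 0.25, 0.25],
    method="SLSQP",
    bounds=[(0, 1)] * 4,
    constraints={"type": "eq", "fun": lambda w: np.sum(w) - 1},
)

print("Monte Carlo best:", mc_best)
print("Optimizer best:  ", sharpe(result.x))
```

Because every random allocation is also a feasible point for the optimizer, the optimizer's Sharpe Ratio should be at least as good as the best random draw, up to solver tolerance.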
Summary: Portfolio Optimization with Python
In this Python for Finance guide, we shifted our focus from analyzing individual stocks to the more realistic scenario of managing a portfolio of assets. In particular, we discussed several key financial concepts, including:
- The Sharpe ratio
- Portfolio allocation
- Portfolio optimization
We also saw how to implement each of these concepts in Python.