Simulating correlated random walks with Copulas

The bedrock of modern financial engineering is the random walk. Simulating a single random walk is easy, but what if you want to generate a correlated set of random walks? For example, suppose we want to simulate a simple spread between two correlated assets like two cash equity securities. This can be accomplished by using a mathematical construct called a copula:

Sklar’s Theorem states that any multivariate joint distribution can be written in terms of univariate marginal distribution functions and a copula which describes the dependence structure between the variables.

Copulas are popular in high-dimensional statistical applications as they allow one to easily model and estimate the distribution of random vectors by estimating marginals and copulae separately. There are many parametric copula families available, which usually have parameters that control the strength of dependence.

Wikipedia
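
In symbols, one standard way to write Sklar's theorem for a d-dimensional joint distribution function F with marginal distribution functions F_1, ..., F_d is sketched below. The names F and F_i are notation chosen here for illustration and do not appear in the quoted passage:

```latex
F(x_1, x_2, \dots, x_d) = C\bigl(F_1(x_1), F_2(x_2), \dots, F_d(x_d)\bigr)
```

When the marginals are continuous, the copula C in this decomposition is unique.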

Today we will focus on the Archimedean family of copulae, specifically the Gumbel copula. The basic idea behind a copula is that each marginal can be transformed into a uniform random variable on [0, 1], leaving the copula to describe only the dependence between the variables. In general, a copula (C) takes the following form:

C(u_1,u_2,\dots,u_d)=\mathbb{P}[U_1\leq u_1,U_2\leq u_2,\dots,U_d\leq u_d] .

Thus, given two uniform random variables, the copula function gives the probability that both fall at or below chosen thresholds. Specifically, the Gumbel copula takes the bivariate form:

C_\theta(u,v)=\exp\!\left( -\left( (-\log(u))^\theta + (-\log(v))^\theta \right)^{1/\theta} \right), \qquad \theta \geq 1 .
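
To make the formula concrete, here is a minimal sketch that evaluates the bivariate Gumbel copula directly. The function name `gumbel_copula_cdf` is made up for this illustration and is not part of ambhas; it uses only the standard library.

```python
import math


def gumbel_copula_cdf(u, v, theta):
    """Bivariate Gumbel copula C(u, v) for u, v in (0, 1] and theta >= 1."""
    if theta < 1:
        raise ValueError("theta must be >= 1 for the Gumbel copula")
    if not (0 < u <= 1 and 0 < v <= 1):
        raise ValueError("u and v must lie in (0, 1]")
    # Sum of the transformed marginals, then the outer 1/theta power
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))


# theta = 1 reduces to independence: C(u, v) = u * v
independent = gumbel_copula_cdf(0.3, 0.7, 1.0)

# A larger theta means stronger positive dependence, pushing C(u, v) toward min(u, v)
dependent = gumbel_copula_cdf(0.3, 0.7, 5.0)
```

Two quick sanity checks follow from the formula itself: setting one argument to 1 returns the other argument, and theta = 1 gives the product of the two arguments.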

For example, suppose we want to simulate correlated random walks for YHOO and FB stocks. Note that the Gumbel copula equation has a single parameter, theta; u and v are the uniform-transformed marginals of YHOO and FB. We can use the Python library ambhas to estimate theta from the empirical relationship, which lets us simulate correlated changes in each stock, as plotted in the figure below:

Simulated correlated minute changes in YHOO and FB. Copulas provide the mathematical underpinning of this simulation method
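
One common way to tie theta to data is through Kendall's tau: for the Gumbel copula, tau = 1 - 1/theta, so theta = 1/(1 - tau). The sketch below illustrates that relationship on made-up numbers. It is an assumption that this resembles what a library does internally; the post does not say which estimator ambhas uses, and the helper names and toy data are hypothetical.

```python
def kendall_tau(xs, ys):
    """Kendall's tau-a via a direct O(n^2) pair count (fine for small samples)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    concordant = discordant = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


def gumbel_theta_from_tau(tau):
    """Invert tau = 1 - 1/theta; only non-negative dependence is representable."""
    if not (0 <= tau < 1):
        raise ValueError("the Gumbel copula requires 0 <= tau < 1")
    return 1.0 / (1.0 - tau)


# Hypothetical toy data standing in for paired log changes of two stocks
x_changes = [0.010, -0.020, 0.015, 0.000, -0.005, 0.030]
y_changes = [0.008, -0.010, 0.020, -0.004, 0.002, 0.025]

tau = kendall_tau(x_changes, y_changes)
theta = gumbel_theta_from_tau(tau)
```

Because the Gumbel copula only covers tau between 0 and 1, a pair of series with negative rank correlation cannot be fitted this way.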

To start the simulation process, we first need data for the stocks in question. We download intraday stock data from Google Finance (see the previous post for the code) and then estimate the copula from the empirical distributions:

import intraday # save google fin download code as intraday.py
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from ambhas.copula import Copula

base = "YHOO"
hedge = "FB"

interval = 60
window = 15

x = intraday.get_google_data(base, interval, window)
y = intraday.get_google_data(hedge, interval, window)

# merge data and convert to log differences
combo = pd.merge(x[['c']], y[['c']], right_index = True, left_index = True)
logs = np.log(combo)
logs = logs.diff().dropna()

gumbel = Copula(logs['c_x'], logs['c_y'], 'gumbel') # estimate the copula param

x_sim,y_sim = gumbel.generate_xy(4800) # this does the simulation
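
The generate_xy call above is a black box in this post. As a rough, dependency-free sketch of how pairs with Gumbel dependence can be drawn, the code below uses the Marshall-Olkin frailty construction with a positive stable variable generated by Kanter's representation. This is a swapped-in textbook technique, not necessarily what ambhas does, and the function names are hypothetical.

```python
import math
import random


def sample_positive_stable(alpha, rng):
    """Draw S > 0 with Laplace transform E[exp(-t*S)] = exp(-t**alpha), 0 < alpha <= 1."""
    if alpha == 1.0:
        return 1.0  # degenerate case: S is the constant 1
    u = math.pi * (1.0 - rng.random())  # uniform on (0, pi]
    w = rng.expovariate(1.0)            # standard exponential
    # Kanter's representation of a positive stable random variable
    return (math.sin(alpha * u) / math.sin(u) ** (1.0 / alpha)
            * (math.sin((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha))


def sample_gumbel_copula(n, theta, seed=None):
    """Return n pairs (u, v) with uniform marginals and Gumbel(theta) dependence."""
    if theta < 1:
        raise ValueError("theta must be >= 1 for the Gumbel copula")
    rng = random.Random(seed)
    alpha = 1.0 / theta
    pairs = []
    for _ in range(n):
        s = sample_positive_stable(alpha, rng)  # shared frailty for the pair
        e1 = rng.expovariate(1.0)
        e2 = rng.expovariate(1.0)
        # Apply the Gumbel generator psi(t) = exp(-t**(1/theta)) to e / s
        u = math.exp(-((e1 / s) ** alpha))
        v = math.exp(-((e2 / s) ** alpha))
        pairs.append((u, v))
    return pairs


uniform_pairs = sample_gumbel_copula(1000, theta=2.0, seed=42)
```

These pairs live on the unit square. Passing each coordinate through the inverse empirical distribution of the corresponding stock's log changes would then put them on the scale of the simulated changes.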

Now x_sim and y_sim are simulated log changes in YHOO and FB, respectively. Because log changes add, we can take their cumulative sum and apply the exponential function in numpy to convert them back into realistic-looking prices:

x_sum = x['c'].values[-1] * np.exp(x_sim.cumsum())
y_sum = y['c'].values[-1] * np.exp(y_sim.cumsum())

This uses the last price from the empirical data series as the starting price for each stock in our simulated random walks. Now we can plot the two and get a realistic looking graph:

Simulated correlated random walks from YHOO and FB
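
The conversion from log changes back to prices can be checked on a toy series. This sketch mirrors the last-price-times-exp-of-cumulative-sum step using only the standard library; the function name and the toy numbers are hypothetical.

```python
import math
from itertools import accumulate


def prices_from_log_changes(last_price, log_changes):
    """Turn per-step log changes into a price path anchored at last_price."""
    # accumulate() yields the running sum of log changes, like a cumulative sum
    return [last_price * math.exp(total) for total in accumulate(log_changes)]


# Hypothetical log changes: up by a factor of 1.1, back down by the same factor, then flat
toy_changes = [math.log(1.1), -math.log(1.1), 0.0]
toy_path = prices_from_log_changes(100.0, toy_changes)
```

Starting from 100, the path rises to about 110 and then returns to about 100, which shows why summing log changes before exponentiating keeps the simulated prices positive and multiplicative.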

The ambhas library has a number of dependencies that you should install before using the code above. These dependencies are described here. N.B.: we had to uninstall our current Python statistics module and re-install it using the instructions here. Otherwise you might get an error like “could not find cpdf in module” or some such. Making sure you have all dependencies installed should resolve this.

Categories: Quantitative Trading

4 replies »

  1. Hi,

    Another way to simulate correlated random variables is to use the Cholesky or Eigenvector decomposition. This may be simpler (both to understand and to implement) than the copula approach.

    And if you want to simulate random walks (from an elliptical distribution) with mean and covariance exactly equal to the specified population statistics (even for small samples), Meucci (2009) has an article and Matlab code showing how to do this via the Schur decomposition.
    See here: http://www.symmys.com/node/162
    Regards,
    Emlyn
