Start Trading Like a Quant: Download Option Chains from Google Finance in R

Algorithmic trading gives you superhuman trading abilities. You can watch hundreds or even thousands of markets, scanning for opportunity without slowing yourself down because the heavy lifting is being done by a computer. All this data and power has a price, however; this usually represents a significant barrier to entry for individual traders and investors who want to start trading quantitatively. Luckily, there is an open source (read: free) tool readily available to all of us which can simplify much of the data processing and analysis that goes into quant trading: a statistical computing package called R.

R is amazing not just because of its liberal licensing and low price: R simply kicks ass when it comes to analyzing financial data. That’s because it has a host of code libraries which have already done a lot of the tedious work for you. For instance, using the quantmod library you can download daily historical stock, forex, and some futures data automatically from either Yahoo or Google Finance.

Unfortunately, I couldn’t find anything like that for downloading and parsing options chains directly to R, so I came up with a script that grabs the intraday options pricing data from Google Finance in JSON format and parses it into an R list. Each element of the list is a chain of options from a different expiration.

If you are new to R, you can install it from an R-project mirror. Click Download R and select the mirror of your choosing.

After that, the next step is to install two R packages which will help us download and parse the options pricing data Google returns:

  • RCurl: we use this to fetch a Google Finance URL, which returns JSON-like pricing data
  • jsonlite: a simple JSON parser for R. Transforms the cleaned-up output into R data frames

We can install these with the following code:

# installs RCurl and jsonlite packages. will prompt you to select mirror for download
install.packages("RCurl")
install.packages("jsonlite")
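If you rerun the script often, you may not want to reinstall both packages every time. Here is a minimal sketch of one way to install only what is missing. The helper name missingPackages is made up for this example, and the interactive() check is an assumption on my part so that a non-interactive run never stops to ask for a mirror:

```r
# Hypothetical helper: return the entries of `pkgs` that are not installed yet.
# requireNamespace() returns FALSE (quietly) when a package cannot be loaded.
missingPackages <- function(pkgs) {
    installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
    pkgs[!installed]
}

needed <- missingPackages(c("RCurl", "jsonlite"))

# Only install from an interactive session, since install.packages may prompt for a mirror
if (length(needed) > 0 && interactive()) {
    install.packages(needed)
}
```

Note that requireNamespace loads a package's namespace as a side effect, which is harmless here because the script loads both packages right afterwards anyway.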

The next step is to run the script below. It defines a function, getOptionQuote(symbol), that accepts any symbol with options quoted on Google Finance, plus a helper, fixJSON, that quotes the bare keys in Google's response so that jsonlite can parse it:

library(RCurl)
library(jsonlite)

getOptionQuote <- function(symbol){
    output = list()
    url = paste('http://www.google.com/finance/option_chain?q=', symbol, '&output=json', sep = "")
    x = getURL(url)
    fix = fixJSON(x)
    json = fromJSON(fix)
    numExp = dim(json$expirations)[1]
    for(i in seq_len(numExp)){
        # download each expiration's data
        y = json$expirations[i,]$y
        m = json$expirations[i,]$m
        d = json$expirations[i,]$d
        expName = paste(y, m, d, sep = "_")
        if (i > 1){
            url = paste('http://www.google.com/finance/option_chain?q=', symbol, '&output=json&expy=', y, '&expm=', m, '&expd=', d, sep = "")
            json = fromJSON(fixJSON(getURL(url)))
        }
        output[[paste(expName, "calls", sep = "_")]] = json$calls
        output[[paste(expName, "puts", sep = "_")]] = json$puts
    }
    return(output)
}

fixJSON <- function(json_str){
    # keys that Google returns unquoted; each one gets wrapped in double quotes
    stuff = c('cid','cp','s','cs','vol','expiry','underlying_id','underlying_price',
              'p','c','oi','e','b','strike','a','name','puts','calls','expirations',
              'y','m','d')

    for(i in seq_along(stuff)){
        # a key follows either a comma or an opening brace
        replacement1 = paste(',"', stuff[i], '":', sep = "")
        replacement2 = paste('\\{"', stuff[i], '":', sep = "")
        regex1 = paste(',', stuff[i], ':', sep = "")
        regex2 = paste('\\{', stuff[i], ':', sep = "")
        json_str = gsub(regex1, replacement1, json_str)
        json_str = gsub(regex2, replacement2, json_str)
    }
    return(json_str)
}
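To see what the key-quoting step is doing, it can help to try it on a tiny string. The sketch below is not the fixJSON function from the script: it swaps in a single generic regular expression instead of a hard-coded key list, and both the function name quoteBareKeys and the sample string are made up for illustration (the sample is not real market data).

```r
# Quote bare object keys so a strict JSON parser will accept the text.
# Assumption: a key is an identifier that directly follows "{" or "," and precedes ":".
quoteBareKeys <- function(json_str) {
    gsub('([{,])([A-Za-z_][A-Za-z0-9_]*):', '\\1"\\2":', json_str, perl = TRUE)
}

# Made-up sample with unquoted keys
raw <- '{expiry:{y:2015,m:4,d:17},puts:[{strike:"100.00",oi:"250"}]}'
fixed <- quoteBareKeys(raw)
print(fixed)
```

Once the keys are quoted, the string is something a parser such as jsonlite's fromJSON can read. Like the key-list approach, this pattern does not know whether it is inside a quoted value, so a value containing text such as ,abc: would also get rewritten.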

Now we can download whatever options data we want! For example, let's download option chains for the next 26 expirations for AAPL with one line of code, then plot the open interest with another:

aapl_opt = getOptionQuote("AAPL")
# this might take a few seconds to complete depending on your connection with Google
# now let's plot the open interest by strike for the 4/17/2015 puts
plot(aapl_opt$"2015_4_17_puts"$strike, aapl_opt$"2015_4_17_puts"$oi, type = "s", main = "Open Interest by Strike")

Which produces the following graph:

Figure: AAPL open interest vs. strikes

This function returns quite a bit of data, including the bid/ask for each option, last price, volume, open interest, and daily changes.
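As a rough sketch of what you can do with those fields, here is one way to compute a mid price and bid/ask spread for each strike. Everything in the mock data frame is made up. I am also assuming that the b and a columns (two of the keys listed in fixJSON) hold the bid and ask, that values arrive as character strings, and that a missing quote shows up as "-"; check those against what you actually get back.

```r
# Mock chain with made-up values (not real quotes)
chain <- data.frame(
    strike = c("120.00", "125.00", "130.00"),
    b      = c("1.10", "2.45", "-"),      # assumed: bid, with "-" meaning no quote
    a      = c("1.20", "2.55", "5.10"),   # assumed: ask
    stringsAsFactors = FALSE
)

# Convert text to numbers; anything unparseable (like "-") becomes NA
toNumber <- function(x) suppressWarnings(as.numeric(gsub(",", "", x)))

chain$mid    <- (toNumber(chain$b) + toNumber(chain$a)) / 2
chain$spread <- toNumber(chain$a) - toNumber(chain$b)
print(chain)
```

Rows with a missing bid or ask end up with NA for both mid and spread, which makes them easy to filter out with is.na before plotting.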

Each expiration is a member of the list that getOptionQuote(symbol) returns. Calls and puts are in separate data frames, named like 2015_4_17_calls and 2015_4_17_puts.
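Because the element names follow the year_month_day_calls / year_month_day_puts pattern used in the script, you can loop over expirations by working with names(). The sketch below uses a small mock list with made-up numbers in place of a real download, and assumes the oi column comes back as character strings:

```r
# Mock of the list getOptionQuote() returns (made-up numbers, two expirations)
mockChain <- function(oi) {
    data.frame(strike = c("120.00", "125.00"), oi = oi, stringsAsFactors = FALSE)
}
mock_opt <- list(
    "2015_4_17_calls" = mockChain(c("900", "1100")),
    "2015_4_17_puts"  = mockChain(c("1500", "3200")),
    "2015_5_15_calls" = mockChain(c("300", "500")),
    "2015_5_15_puts"  = mockChain(c("400", "600"))
)

# Recover the expiration labels by stripping the _calls / _puts suffix
expirations <- unique(sub("_(calls|puts)$", "", names(mock_opt)))

# Total put open interest for each expiration
putOI <- sapply(expirations, function(e) {
    puts <- mock_opt[[paste(e, "puts", sep = "_")]]
    sum(as.numeric(puts$oi))
})
print(putOI)
```

The same pattern works for any per-expiration summary: swap "puts" for "calls", or replace the sum with whatever statistic you are after.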

19 replies »

  1. thanks for the example. when i am trying to replicate this in rstudio i get an error on 2nd function after this line: replacement2 = paste('\{"', stuff[i], '":', sep = ""). the error is: Error: '\{' is an unrecognized escape in character string starting "'\{". could you pls try to explain if it’s an error due to my side? thanks!


  2. whoops, looks like wordpress markup escaped the double \ to a single, post is updated now it should copy correctly. R needs characters like { escaped. our apologies for not catching it immediately. thanks for the heads up!


  3. Hello I am trying to run this code in R and get the following

    regex2 = paste('\{', stuff[i], ':', sep = "")
    Error: '\{' is an unrecognized escape in character string starting "'\{"

    Any thoughts? Thanks for this code.


  4. hello, have you retried copying the whole script and pasting into R? line should be:

    regex2 = paste('\\{', stuff[i], ':', sep = "")

    initially wordpress treated the double \ as an escape character. R needs it to be \\ to escape the { character in the paste function.

    the post should be good to copy and paste now though, I have tried on my R console and it should work now

    also make sure you have a version of R >= 3.0 as jsonlite won’t work with some older versions, but that error looks related to the initial wordpress escape character. our bad! thanks for your patience


  5. I’m trying to run the code but I’m getting the following error:

    Error in getOptionQuote("AAPL") : could not find function "getURL"

    Any thoughts?


    • getURL is from the RCurl package, did everything go ok when you ran install.packages("RCurl") and library(RCurl) or was there an error? what version of R do you have installed?


  6. It doesn’t seem to be working:
    > getOptionQuote("aapl")
    Error in json$expirations[i, ] : incorrect number of dimensions


  7. what version of R are you using? just checked this works fine copy/pasted into 3.1.4

    make sure that install.packages and library don’t have errors. RCurl and jsonlite only work on certain versions


  8. Hi Mktstk, I’ve never used R but could it be used each day to download the end-of-day options chains data for *all* US options (over 3000) ? Does it run and pull off data fast enough to make this feasible? Thanks!

