Financial network visualization: clustering by Estimize analyst coverage

Estimize is a crowd-sourced network that collects structured financial predictions from its users, many of whom are professional analysts. Users can forecast anything from next month’s unemployment rate to a stock’s next quarterly earnings report. A growing body of research suggests that crowd-sourced predictions are more accurate than Wall Street’s traditional analyst consensus.

One of the more interesting uses of the Estimize dataset is financial network analysis. Here we cluster stocks using the number of Estimize users who cover both stocks in a pair as the measure of similarity. In other words, we define a similarity metric between two stocks that takes the form:

UserCorr(Stock1, Stock2) = 2 * (number of users covering both Stock1 and Stock2) / (number of users covering Stock1 + number of users covering Stock2)

Intuitively, this returns 1.0 when exactly the same users cover both stocks, and 0.0 when there is no overlap (no analyst covers both stocks). We can therefore think of it as a correlation between two stocks based on their shared analyst coverage, although unlike a price correlation it is never negative. The closer this user correlation, UserCorr, is to 1.0, the more closely related Stock1 is to Stock2.
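As a minimal sketch of the metric above, assuming each stock's coverage is available as a set of user IDs (the tickers' user sets below are invented for illustration and do not come from the Estimize dataset):

```python
def user_corr(users_a, users_b):
    """Overlap score: 2 * |A and B| / (|A| + |B|)."""
    users_a, users_b = set(users_a), set(users_b)
    total = len(users_a) + len(users_b)
    if total == 0:
        # Assumption: two stocks with no coverage at all score 0.0.
        return 0.0
    return 2 * len(users_a & users_b) / total


# Hypothetical coverage sets (user IDs are made up).
ko_users = {"u1", "u2", "u3", "u4"}
pep_users = {"u2", "u3", "u4", "u5"}
xom_users = {"u9"}

print(user_corr(ko_users, pep_users))  # 3 shared users out of 4 + 4
print(user_corr(ko_users, ko_users))   # identical coverage
print(user_corr(ko_users, xom_users))  # no shared users
```

This form is also known as the Sørensen-Dice coefficient between the two user sets.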

Hence, we can build a UserCorr matrix that shows the relationships among a group of stocks, akin to a traditional price correlation matrix. We can then examine the structure of the UserCorr matrix with two familiar visualization tools: heatmaps and minimum spanning trees.
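One way the UserCorr matrix might be assembled is sketched below, using only the standard library. The `coverage` mapping and its user IDs are hypothetical placeholders, and the nested-dict layout is an illustrative choice rather than the format used for the figures in this post.

```python
def user_corr_matrix(coverage):
    """Build a symmetric UserCorr matrix as a nested dict.

    coverage maps ticker -> set of user IDs covering that ticker.
    """
    tickers = sorted(coverage)

    def score(a, b):
        total = len(coverage[a]) + len(coverage[b])
        # Assumption: a pair with no coverage at all scores 0.0.
        return 2 * len(coverage[a] & coverage[b]) / total if total else 0.0

    # Every row lists the tickers in the same sorted order.
    return {a: {b: score(a, b) for b in tickers} for a in tickers}


# Hypothetical coverage data (user IDs are made up).
coverage = {
    "KO": {"u1", "u2", "u3", "u4"},
    "PEP": {"u2", "u3", "u4", "u5"},
    "XOM": {"u9"},
}

matrix = user_corr_matrix(coverage)
for ticker, row in matrix.items():
    print(ticker, row)
```

If pandas and seaborn are installed, passing `pandas.DataFrame(matrix)` to `seaborn.heatmap` is one way to draw the heatmap from this structure.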

The first figure presents the heatmap of the UserCorr matrix for the component stocks of the Dow 30, minus Travelers (TRV), which does not appear to be in the Estimize dataset. The data covers the period from 2013 through Q1 2015. We can see that certain stocks have a great deal of analyst overlap.

We also wanted to see how these relationships evolved over time. To do this, we estimated the UserCorr matrix for each quarter, then extracted the minimum spanning tree (MST) with the R package igraph.
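The MSTs in this post were extracted with igraph in R. Purely as an illustration of the idea, here is a small Python sketch that uses Kruskal's algorithm instead. It assumes similarity is converted to a distance as 1 - UserCorr (a common choice, not one confirmed above), and the similarity values are invented.

```python
def minimum_spanning_tree(similarity):
    """Kruskal's algorithm on distances defined as 1 - similarity.

    similarity maps (ticker_a, ticker_b) -> UserCorr score.
    Returns a list of (ticker_a, ticker_b, distance) edges.
    """
    nodes = sorted({ticker for pair in similarity for ticker in pair})
    parent = {node: node for node in nodes}

    def find(node):
        # Follow parent links to the component root, compressing the path.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    tree = []
    # Visit edges from the most similar pair (smallest distance) onward.
    for (a, b), score in sorted(similarity.items(), key=lambda kv: 1.0 - kv[1]):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # the edge joins two separate components
            parent[root_a] = root_b
            tree.append((a, b, 1.0 - score))
    return tree


# Hypothetical UserCorr values for one quarter.
similarity = {
    ("KO", "PEP"): 0.75,
    ("KO", "JNJ"): 0.50,
    ("PEP", "JNJ"): 0.40,
    ("JNJ", "PFE"): 0.80,
    ("KO", "PFE"): 0.30,
    ("PEP", "PFE"): 0.20,
}

mst = minimum_spanning_tree(similarity)
for a, b, distance in mst:
    print(f"{a} - {b}: distance {distance:.2f}")
```

Running the same function on each quarter's UserCorr values would give one tree per quarter to compare.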

Want to experiment with the Estimize dataset? You can learn to use the Estimize API by reading our forthcoming eBook, Intro to Social Data for Traders, which is available for pre-order now. The release date is less than a week away, on February 26th, 2015, so pre-order now to make sure you get your copy as soon as it’s released.

Across quarters, some stocks consistently sit at the center of clusters: KO, PFE, and JNJ are among the most connected stocks. This implies that these stocks share the most analysts with the rest of the group.
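One simple way to quantify "most connected" is to count how many edges touch each stock in every quarter's MST and add the counts up. The sketch below assumes the trees are available as edge lists; the quarters and edges shown are hypothetical and are not the trees from this post.

```python
from collections import Counter


def degree_counts(edges):
    """Count how many MST edges touch each ticker."""
    counts = Counter()
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return counts


# Hypothetical MST edge lists, one per quarter.
quarterly_trees = {
    "2014Q3": [("KO", "PEP"), ("KO", "JNJ"), ("JNJ", "PFE")],
    "2014Q4": [("KO", "JNJ"), ("JNJ", "PFE"), ("JNJ", "PEP")],
}

total_degree = Counter()
for quarter, edges in quarterly_trees.items():
    total_degree.update(degree_counts(edges))

# Tickers with the highest summed degree are the recurring hubs.
print(total_degree.most_common(2))
```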


Interested in resources for Financial Visualization?

Check out the Beta release of SliceMatrix: a unique tool for visualizing the stock market, including views of filtered correlation networks and minimum spanning trees.

