Pulling Financial Time Series Data into Python: Some Free Options

Getting access to financial time series data sets can be a hassle. Fortunately, there are plenty of options on the internet for pulling financial time series data directly into Python for analysis, and many of them are free. In this tutorial, we will pull financial time series data into Python using the following free APIs:

  • Alpha Vantage
  • Quandl

Between these two APIs, we should be able to access the vast majority of financial data sets, including daily and intraday stock price data.

Pulling Data Using the Alpha Vantage API

Boston, Massachusetts-based Alpha Vantage is a leading provider of free APIs for historical and real-time stock data, physical currency data, and cryptocurrency data. Getting a free API key to access its data bank is simple. Go to this webpage, and fill out your contact information as directed:

Fill out your contact details to claim your free API key

Once you’re finished, Alpha Vantage will print an API key on its webpage for your own personal use. You can use this key to pull data directly into Python for analysis.
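Hard-coding the key in a script makes it easy to leak when the code is shared. One common alternative, sketched below, is to read the key from an environment variable. The variable name ALPHA_VANTAGE_API_KEY and the helper load_api_key() are assumptions made for illustration, not something Alpha Vantage prescribes.

```python
import os


def load_api_key(env_var="ALPHA_VANTAGE_API_KEY"):
    """Return the API key stored in an environment variable.

    The variable name is a hypothetical convention chosen for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key.")
    return key
```

With the variable set in your shell, alpha_vantage_api_key = load_api_key() could stand in for the hard-coded string used in the snippets in this tutorial.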

Downloading Required Libraries

Alpha Vantage has a Python library specifically for its API. Go to the command prompt and enter the following to download Alpha Vantage’s API package:

pip install alpha-vantage

Intraday Time Series Data

There are a couple of options for pulling time series data via Alpha Vantage’s API, depending on the level of data frequency that you want. The first method we will cover is for intraday data, where we want to pull a time series with a data frequency of 1 hour or less.

We use the following code to pull time series data for Google stock, with a data frequency of 15 minutes:

from alpha_vantage.timeseries import TimeSeries
import pandas as pd
import matplotlib.pyplot as plt

alpha_vantage_api_key = "YOUR API KEY HERE"

def pull_intraday_time_series_alpha_vantage(alpha_vantage_api_key, ticker_name, data_interval = '15min'):
    """
    Pull intraday time series data by stock ticker name.
    Args:
        alpha_vantage_api_key: Str. Alpha Vantage API key.
        ticker_name: Str. Ticker name that we want to pull.
        data_interval: String. Desired data interval for the data. Can be '1min', '5min', '15min', '30min', '60min'.
    Outputs:
        data: Dataframe. Time series data, including open, high, low, close, and datetime values.
        meta_data: Dict. Metadata associated with the time series.
    """
    #Generate Alpha Vantage time series object
    ts = TimeSeries(key = alpha_vantage_api_key, output_format = 'pandas')
    #Retrieve the data for the past sixty days (outputsize = full)
    data, meta_data = ts.get_intraday(ticker_name, outputsize = 'full', interval= data_interval)
    data['date_time'] = data.index
    return data, meta_data

def plot_data(df, x_variable, y_variable, title):
    """
    Plot the x- and y- variables against each other, where the variables are columns in
    a pandas dataframe
    Args:
        df: Pandas dataframe, containing x_variable and y_variable columns. 
        x_variable: String. Name of x-variable column
        y_variable: String. Name of y-variable column
        title: String. Desired title name in the plot.
    Outputs:
        Plot in the console. 
    """
    fig, ax = plt.subplots()
    #plot_date() is deprecated in recent Matplotlib releases; plot() handles datetime values directly
    ax.plot(df[x_variable], 
            df[y_variable], marker='', linestyle='-', label=y_variable)
    fig.autofmt_xdate()
    plt.title(title)
    plt.show()


#### EXECUTE IN MAIN FUNCTION ####
ts_data, ts_metadata = pull_intraday_time_series_alpha_vantage(alpha_vantage_api_key, ticker_name = "GOOGL")
#Plot the high prices
plot_data(df = ts_data, 
          x_variable = "date_time", 
          y_variable = "2. high", 
          title ="High Values, Google Stock, 15 Minute Data")
Google Stock Price (High), 15 Minute Interval Data
Snapshot of the returned intraday Google stock time series data. Data includes open, high, low, close, and volume information.

In the code snippet above, allowed sampling frequencies include 1 minute, 5 minutes, 15 minutes, 30 minutes, and 60 minutes.
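Because the interval is passed as a plain string, a typo such as '10min' only surfaces once the request fails. A small guard like the one below, a sketch whose names ALLOWED_INTERVALS and validate_interval() are hypothetical, can catch that before any call is made.

```python
# Intervals listed in the tutorial for intraday data
ALLOWED_INTERVALS = ("1min", "5min", "15min", "30min", "60min")


def validate_interval(data_interval):
    """Return data_interval unchanged if it is one of the allowed intervals."""
    if data_interval not in ALLOWED_INTERVALS:
        raise ValueError(
            f"Unsupported interval {data_interval!r}; choose one of {ALLOWED_INTERVALS}"
        )
    return data_interval
```

One way to use it would be to call validate_interval(data_interval) at the top of pull_intraday_time_series_alpha_vantage().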

We pull time series data using the pull_intraday_time_series_alpha_vantage() function. We pass in our API key, the stock ticker name (‘GOOGL’), and the desired sampling frequency as parameters. The function returns a dataframe containing stock data (including open, high, low, close, and volume data) for the stock at a 15-minute sampling frequency, as well as a metadata dictionary associated with the time series.
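The returned columns carry numbered names such as "2. high", which are awkward to type. The sketch below strips that prefix. The helper clean_column_name() is hypothetical, and it assumes the other columns follow the same "N. name" pattern as "2. high".

```python
import re


def clean_column_name(name):
    """Strip a leading 'N. ' prefix and replace spaces with underscores."""
    stripped = re.sub(r"^\d+\.\s*", "", name)
    return stripped.strip().replace(" ", "_")


# Hypothetical usage with the dataframe returned by the pull function:
# ts_data = ts_data.rename(columns=clean_column_name)
```

After renaming, the plot call would reference "high" rather than "2. high". Columns without a numeric prefix, such as date_time, pass through unchanged.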

Daily Time Series Data

In addition to intraday data, Alpha Vantage’s API allows you to pull daily time series data. The call method for pulling daily data is similar to the call method for pulling intraday data, as evidenced in the code snippet below:

from alpha_vantage.timeseries import TimeSeries
import pandas as pd
import matplotlib.pyplot as plt

alpha_vantage_api_key = "YOUR API KEY HERE"

def pull_daily_time_series_alpha_vantage(alpha_vantage_api_key, ticker_name, output_size = "compact"):
    """
    Pull daily time series by stock ticker name.
    Args:
        alpha_vantage_api_key: Str. Alpha Vantage API key.
        ticker_name: Str. Ticker name that we want to pull.
        output_size: Str. Can be "full" or "compact". If "compact", then the past 100 days of data
        is returned. If "full" the complete time series is returned (could be 20 years' worth of data!)
    Outputs:
        data: Dataframe. Time series data, including open, high, low, close, and datetime values.
        meta_data: Dict. Metadata associated with the time series.
    """
    #Generate Alpha Vantage time series object
    ts = TimeSeries(key = alpha_vantage_api_key, output_format = 'pandas')
    data, meta_data = ts.get_daily_adjusted(ticker_name, outputsize = output_size)
    data['date_time'] = data.index
    return data, meta_data


#### EXECUTE IN MAIN FUNCTION ####
#Pull daily data for Berkshire Hathaway
ts_data, ts_metadata = pull_daily_time_series_alpha_vantage(alpha_vantage_api_key, ticker_name = "BRK.B", output_size = "compact") 
#Plot the high prices (plot_data() is defined in the intraday example above)
plot_data(df = ts_data, 
          x_variable = "date_time", 
          y_variable = "2. high", 
          title ="High Values, Berkshire Hathaway Stock, Daily Data")
Berkshire Hathaway Stock Price (High), Daily Data
Snapshot of the returned daily Berkshire Hathaway stock data. Data includes open, high, low, close, adjusted close, and volume information.

In the above code block, we pull daily time series data for Berkshire Hathaway stock, going back 100 days. We call the pull_daily_time_series_alpha_vantage() function in the main() block. The function takes our API key, the stock ticker name (in this case, “BRK.B”), and output_size as parameters. The output_size variable relates to how much data we wish to return. The default setting, “compact”, returns the past 100 days of daily data for the stock. If we set output_size to “full”, the complete time series is returned. This can be more than twenty years of daily data!
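The Python package is a wrapper around Alpha Vantage's HTTP query endpoint. As a rough sketch of what a daily request looks like, the helper below builds the request URL without sending it. The helper name build_daily_url() is hypothetical, and the parameter names are taken from the public API documentation, so check them there before relying on them.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.alphavantage.co/query"


def build_daily_url(api_key, ticker_name, output_size="compact"):
    """Build (but do not send) a daily adjusted time series request URL."""
    if output_size not in ("compact", "full"):
        raise ValueError("output_size must be 'compact' or 'full'")
    params = {
        "function": "TIME_SERIES_DAILY_ADJUSTED",
        "symbol": ticker_name,
        "outputsize": output_size,
        "apikey": api_key,
    }
    return f"{BASE_URL}?{urlencode(params)}"


print(build_daily_url("YOUR API KEY HERE", "BRK.B"))
```

Seeing the raw parameters makes it clearer that output_size simply maps onto the outputsize query parameter.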

The examples above are just a brief introduction to Alpha Vantage’s API functionality. For further information on using their API, check out their full API documentation: https://www.alphavantage.co/documentation/

Pulling Data using the Quandl API

Based out of Toronto, Canada, Quandl has over 400,000 users, and provides access to open, commercial, and alternative data sets. Data is provided in an easily digestible format that is great for data analysis.

Alpha Vantage beats Quandl for individual stock data, since Quandl charges for access to most intraday data sets (although daily closing stock data is free). However, Quandl offers a plethora of other data sets for free. A quick scroll through their “free” data set page reveals a treasure trove, including:

  1. Wiki Continuous Futures data, which includes continuous contracts for 600 futures. Built on CME, ICE, and LIFFE data.
  2. Zillow Real Estate data, including housing supply and demand data. This data set also includes housing and rent data by size, type, and tier, which can be subset by zip code, neighborhood, city, and state.
  3. Federal Reserve Economic data, which includes data on growth, employment, inflation, labor, and manufacturing in the US. 

For the purpose of this tutorial, we’re going to pull Federal Reserve data via Quandl’s API, as well as daily stock closing data.

Getting Your Quandl API Key

To gain access to your free Quandl API key, sign up for a Quandl account here.

Quandl Sign Up Page

Once you’ve successfully created an account, you should receive a verification email from Quandl. After verifying and activating your account, access your profile page, where your API key is clearly displayed:

Quandl Profile Page, which contains your API key

Downloading Required Libraries

Quandl has a specific Python package for handling its API. Go to the command prompt and enter the following to download the Quandl API library:

pip install quandl

Pulling Time Series Data

Federal Reserve Economic Data

Before we write any code, let’s check out the different time series sets available under the US Federal Reserve Economic data (FRED) umbrella, via its Quandl documentation page:

Snapshots of some of the time series available in the FRED dataset, available via the FRED documentation page

As you can see in the snapshot above, many time series sets are available for use. For simplicity’s sake, let’s pull the time series for gross domestic product (GDP). In the below code snippet, we pull the quarterly US GDP time series data into Python using the quandl package:

import quandl
import pandas as pd
import matplotlib.pyplot as plt

quandl_api_key = "YOUR API KEY HERE"
#Use the Quandl API to pull data
quandl.ApiConfig.api_key = quandl_api_key
#Pull GDP Data
data = quandl.get('FRED/GDP')
data["date_time"] = data.index
#Plot the GDP time series (plot_data() is defined in the Alpha Vantage intraday example)
plot_data(df = data, 
          x_variable = "date_time", 
          y_variable = "Value", 
          title ="Quarterly GDP Data")
Quarterly GDP Data
Snapshot of returned GDP time series

In the above code, we assign our Quandl API key to quandl.ApiConfig.api_key. We retrieve the GDP data using quandl’s get() function, passing ‘FRED/GDP’ as the data set name, which is the specific identifier for our time series. A data set name consists of the master data repository it belongs to (in this case, ‘FRED’), followed by a slash, and then the specific data set name (‘GDP’ here; this value can be found on the master data set’s Documentation page).
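To make the repository/name structure of the identifier concrete, here is a minimal sketch that splits one into its two parts. The helper split_dataset_code() is hypothetical and is not part of the quandl package.

```python
def split_dataset_code(dataset_code):
    """Split an identifier such as 'FRED/GDP' into (repository, data set name)."""
    database, sep, dataset = dataset_code.partition("/")
    if not sep or not database or not dataset or "/" in dataset:
        raise ValueError(f"Expected 'REPOSITORY/NAME', got {dataset_code!r}")
    return database, dataset


database, dataset = split_dataset_code("FRED/GDP")
print(database, dataset)
```

The same pattern applies to any other series under the FRED umbrella: keep the repository part and swap in a different data set name. The get() function also accepts optional start_date and end_date arguments for restricting the returned range.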

End-of-Day Stock Price Data

Although Quandl doesn’t offer free intraday stock price data like Alpha Vantage does, it does provide daily, end-of-day stock price data. We can pull the daily data for Microsoft stock using the following code:

import quandl
import pandas as pd
import matplotlib.pyplot as plt

quandl_api_key = "YOUR API KEY HERE"

#Use the Quandl API to pull data
quandl.ApiConfig.api_key = quandl_api_key
data = quandl.get_table('WIKI/PRICES', ticker = ['MSFT'])
#Plot the opening prices (plot_data() is defined in the Alpha Vantage intraday example)
plot_data(df = data, 
          x_variable = "date", 
          y_variable = "open", 
          title ="Daily Microsoft Stock Prices, Open Price")
Daily Microsoft Stock Prices, Opening Price

The above code differs slightly from the previous example, as we use quandl’s get_table() function instead of its get() function. Both return pandas dataframes, but get() retrieves a time series data set indexed by date, whereas get_table() queries a tabular data set (here, ‘WIKI/PRICES’) that can be filtered by column values such as ticker. A snapshot of the data set returned by the get_table() call is displayed below:

Daily Microsoft Stock Data, returned via quandl’s get_table() function

As you can see, the returned Microsoft stock dataframe contains time series data for the stock’s open, high, low, close, volume, and adjusted values.
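Table queries can be narrowed by date and by column as well as by ticker. The sketch below only assembles the keyword arguments for such a call. The helper build_table_filters() is hypothetical, and the filter syntax (gte/lte date operators and the qopts columns option) follows Quandl's documented table filters as best I know them, so verify it against the documentation.

```python
def build_table_filters(tickers, start_date=None, end_date=None, columns=None):
    """Assemble keyword arguments for a get_table() call on a price table."""
    filters = {"ticker": list(tickers)}
    date_filter = {}
    if start_date:
        date_filter["gte"] = start_date  # on or after this date
    if end_date:
        date_filter["lte"] = end_date  # on or before this date
    if date_filter:
        filters["date"] = date_filter
    if columns:
        filters["qopts"] = {"columns": list(columns)}
    return filters


filters = build_table_filters(["MSFT"], start_date="2017-01-01", columns=["date", "open"])
# Hypothetical usage, assuming quandl is installed and the API key is configured:
# data = quandl.get_table('WIKI/PRICES', **filters)
```

Requesting only the columns you plan to plot keeps the returned dataframe small.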

The Quandl API offers much more functionality than the two examples listed above. For more information on using Quandl’s Python API package, check out their documentation in this GitHub repo.

This concludes my tutorial on using free APIs to pull financial time series data into Python for analysis! For the full code used in this tutorial, check out this GitHub repo.
