Welcome to Tech Rando! In this tutorial, I will walk you through using the Energy Information Administration’s (EIA) online application programming interface (API) to pull data directly into Python for data analysis.
First, to use the EIA’s API, you’ll need to register on its Open Data page, using the following link: https://www.eia.gov/opendata/.

You will be taken to a registration form that you’ll need to fill out.

After you submit the form, the EIA will email you an API key that you will use to access the data via Python.
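Since the API key is a credential, one option is to keep it out of your scripts and read it from an environment variable instead. The snippet below is a minimal sketch of that idea; the variable name EIA_API_KEY and the helper load_api_key() are hypothetical names chosen for illustration, not part of the EIA's tooling.

```python
import os


def load_api_key(env_var="EIA_API_KEY"):
    """Return the API key stored in an environment variable.

    The default name EIA_API_KEY is a hypothetical choice; any name works
    as long as it matches what you set in your shell.
    """
    # Strip stray whitespace that can sneak in when pasting from an email.
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your EIA API key."
        )
    return key
```

With the variable set in your shell, calling load_api_key() returns the key as a string, and the key itself never has to appear in a file you might share.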
Now, you’ll need to install the EIA_python package. Open the Command Prompt or Conda Prompt (depending on where you perform your ‘pip’ installs), and type the following to install the package:
pip install EIA_python
Once the package is installed, you’re ready to access the EIA data via Python!
The EIA_python package handles much of the tedious cleanup that is otherwise required when data is pulled directly via the EIA API. Mainly, the data is already converted from its initial JSON format and returned as a Pandas dataframe.
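To give a feel for the kind of conversion involved, here is a rough sketch that flattens a JSON payload into a labeled set of values. The payload is hand-written for illustration and its layout (a "series" list with "name", "units", and "data" fields) is an assumption about what the API returns, not output captured from the API; the three values are copied from the results shown later in this tutorial. The helper series_to_column() is a hypothetical name and is not how the package is necessarily implemented.

```python
import json

# Hand-written sample; the field layout is an assumption, not live API output.
SAMPLE_RESPONSE = """
{
  "series": [
    {
      "series_id": "EMISS.CO2-TOTV-TT-NG-TX.A",
      "name": "Total CO2 emissions from all sectors, natural gas, TX",
      "units": "million metric tons CO2",
      "data": [["2016", 209.689272], ["2015", 215.814051], ["2014", 204.718832]]
    }
  ]
}
"""


def series_to_column(raw_json):
    """Flatten the first series into (column_label, {period: value})."""
    series = json.loads(raw_json)["series"][0]
    label = f"{series['name']} ({series['units']})"
    # Sort by period so the oldest observation comes first.
    values = {period: value for period, value in sorted(series["data"])}
    return label, values


label, values = series_to_column(SAMPLE_RESPONSE)
print(label)
for period, value in values.items():
    print(period, value)
```

Passing the result to pd.DataFrame({label: values}) would give a one-column dataframe indexed by year, which is roughly the shape of result the package hands back for you.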
The EIA offers hundreds of time series options via its API. Each time series has a unique Series ID that is referenced when pulling data into Python. A catalog of available time series (with specific Series IDs) can be accessed via the following webpage: https://www.eia.gov/opendata/qb.php?category=371 .
For this tutorial, we’re going to focus on pulling yearly CO2 emissions in my home state of Texas. The specific series that I’m going to pull is for the total carbon dioxide emissions from all sectors related to natural gas, in the state of Texas (see this site as reference). The Series ID for this time series is EMISS.CO2-TOTV-TT-NG-TX.A.

The Python script for pulling this time series is below (also available via my GitHub account):
import eia
import pandas as pd


def retrieve_time_series(api, series_ID):
    """
    Return the time series dataframe, based on API and unique Series ID
    """
    # Retrieve data by Series ID
    series_search = api.data_by_series(series=series_ID)
    # Create a pandas dataframe from the retrieved time series
    df = pd.DataFrame(series_search)
    return df


def main():
    """
    Run main script
    """
    # Create EIA API using your specific API key
    api_key = "YOUR API KEY HERE"
    api = eia.API(api_key)
    # Declare desired series ID
    series_ID = 'EMISS.CO2-TOTV-TT-NG-TX.A'
    df = retrieve_time_series(api, series_ID)
    # Print the returned dataframe df
    print(df)


if __name__ == "__main__":
    main()
"""
Total CO2 emissions from all sectors, natural gas, TX (million metric tons CO2)
1980 214.237163
1981 205.069396
1982 177.723591
1983 169.059890
1984 180.060660
1985 178.186725
1986 167.965480
1987 173.925345
1988 185.375988
1989 195.629601
1990 195.024469
1991 191.806929
1992 188.455361
1993 196.532143
1994 193.195241
1995 200.739390
1996 212.532702
1997 210.401856
1998 217.578472
1999 205.526470
2000 227.249651
2001 219.856674
2002 223.234514
2003 209.561714
2004 202.191500
2005 181.055064
2006 180.434067
2007 184.506801
2008 185.778455
2009 177.408633
2010 186.222804
2011 191.900939
2012 200.310049
2013 208.881640
2014 204.718832
2015 215.814051
2016 209.689272
"""
Let’s break down what this simple script does. In the main() function, the api_key (taken from the registration email) and series_ID variables are declared, and an API object is created from api_key. The retrieve_time_series() function then calls the API object’s data_by_series() method with the series_ID and wraps the result in a Pandas dataframe, which main() prints.
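Once the series is in hand, a few quick summary numbers make a useful sanity check. The sketch below works on a handful of the values printed above, typed into a plain dictionary so that it runs without an API key; the variable names are illustrative and the subset of years is my own selection.

```python
# A subset of the Texas values printed above (million metric tons CO2).
emissions = {
    "1980": 214.237163,
    "1986": 167.965480,
    "1990": 195.024469,
    "2000": 227.249651,
    "2010": 186.222804,
    "2016": 209.689272,
}

# Years with the highest and lowest emissions in this subset.
peak_year = max(emissions, key=emissions.get)
low_year = min(emissions, key=emissions.get)

# Percent change between the first and last years of the series.
first, last = emissions["1980"], emissions["2016"]
pct_change = (last - first) / first * 100

print("Peak year:", peak_year, emissions[peak_year])
print("Low year:", low_year, emissions[low_year])
print("Change 1980 to 2016 (%):", round(pct_change, 1))
```

On the dataframe returned by the script, df.idxmax() and df.idxmin() should give the same peak and low labels for the column without the manual dictionary.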
As always, thanks for reading! If you’re interested in using other energy data sets in Python, visit some of my other tutorials:
https://techrando.com/2019/06/23/how-to-automate-data-pulls-from-the-online-fracfocus-database/