Time Series Forecasting Using a Seasonal ARIMA Model: A Python Tutorial

One of the most widely studied models in time series forecasting is the ARIMA (autoregressive integrated moving average) model. Many variations of the ARIMA model exist, which employ similar concepts but with tweaks. One particular example is the seasonal ARIMA (SARIMA) model. The SARIMA model accounts for seasonality when generating time series forecasting models. This tutorial covers the basics of generating and tuning a SARIMA model using Python, with the intent of forecasting a time series with seasonality.

Algorithm Background

First, a little background on how the SARIMA model works. As the ARIMA model makes up the SARIMA model’s backbone, it is beneficial to understand how the ARIMA model works. It comprises the following steps:

  1. Differencing lagged readings in a time series, consequently making the time series stationary.
  2. Using a specified number of lagged observations in a time series to predict future behavior of a time series. This step comprises a combination of two models: the autoregressive (AR) model, and the moving average (MA) model. Some background on each:

AR model. An AR model forecasts a variable using a linear combination of its previous values (1). We describe an AR model with order p as follows:

The autoregressive (AR) model, courtesy of (1)

In the above equation, the epsilon term represents the random error term. The y(t-p) term represents the previous pth value in the time series. The phi value is a constant at each time step. The phi value decreases as it moves further from point t, the latest value in the sequence. Consequently, the closer a lagged value is to the prediction value (at sequence number t), the greater the phi value, and the more influence the time step has on the forecast.

MA model. The moving average model resembles an AR model, except it is a linear combination of previous error terms (2). We describe it as follows:

The moving average (MA) model, courtesy of (3)

In the above equation, the epsilon term is the error term, with epsilon(t) being the error term at time step t, and so on. Like the AR equation above, the phi parameter is a constant value at each time step.Similarly, the phi value decreases as it moves more time steps away from time t.

ARIMA Model Parameters

The ARIMA model includes three main parameters — p, q, and d. The parameters represent the following (4):

  • p: The order of the autoregressive model (the number of lagged terms), described in the AR equation above.
  • q: The order of the moving average model (the number of lagged terms), described in the MA equation above.
  • d: The number of differences required to make the time series stationary.

SARIMA Model Parameters

The SARIMA model builds upon the ARIMA model. It includes the p, q, and d parameters, but also an extra set of parameters to account for time series seasonality. This parameter set–P, Q, D, and additional parameter m–is defined as follows (5):

m: The seasonality of the model. For example, if the seasonality of a time series repeats yearly, then m = 12. If the seasonality is quarterly, then m = 4 (meaning the seasonal pattern repeats once every four months).

P: The order of the seasonal autoregressive model.

Q: The order of the seasonal moving average model.

D: The number of seasonal differences applied to the time series.

The Time Series

For this tutorial, we will use the monthly time series for electricity net generation from geothermal energy in the United States. The data can be pulled directly into Python via the Energy Information Administration’s (EIA) API (see this tutorial for more background on using the EIA’s API). A snapshot of the time series is shown below, where units are in million kilowatt-hours:

Monthly Time Series, Electricity Net Generation from Geothermal Energy. Units are in million kilowatt-hours.

We use the code below to retrieve and visualize our time series:

import eia
import pandas as pd
import matplotlib.pyplot as plt

def retrieve_time_series(api, series_ID):
    Return the time series dataframe, based on API and unique Series ID
    #Retrieve Data By Series ID 
    series_search = api.data_by_series(series=series_ID)
    ##Create a pandas dataframe from the retrieved time series
    df = pd.DataFrame(series_search)
    return df

def plot_data(df, x_variable, y_variable, title):
    Plot the x- and y- variables against each other, where the variables are columns in
    a pandas dataframe
        df: Pandas dataframe. 
        x_variable: String. Name of x-variable column
        y_variable: String. Name of y-variable column
        title: String. Desired title name
    fig, ax = plt.subplots()
                 df[y_variable], marker='', linestyle='-', label=y_variable)

#Create EIA API using your specific API key
api = eia.API(api_key)
#Declare desired series ID
df = retrieve_time_series(api, series_ID)
df.reset_index(level=0, inplace=True)
                   df.columns[1]:'Geothermal_net_generation'}, inplace=True)
#Convert the Date column into a date object
df['Date'] = df['Date'].str.rstrip()
df['Date'] = df['Date'].str.replace(' ', '-')
df['Date']=pd.to_datetime(df['Date'], format='%Y-%m')
#Plot the time series
plot_data(df, 'Date', 
          'Net Generation for Geothermal over Time')

Before we generate a SARIMA model, let’s decompose the time series to ensure it displays seasonality. We use the seasonal_decompose function, available via the statsmodels.tsa package. This function allows us to break the time series down into its trend, seasonal, and residual components:

from statsmodels.tsa.seasonal import seasonal_decompose

def decompose_time_series(series, frequency):
    Decompose a time series and plot it in the console
        series: series. Time series that we want to decompose
        Decomposition plot in the console
    result = seasonal_decompose(series, model='additive', freq = frequency)

#Decompose the time series to determine seasonality/trend
decompose_time_series(df['Geothermal_net_generation'], 12)
Decomposition of the Net Electricity Generation Geothermal Time Series

As you can see in the graphic above, the time series definitely displays seasonality, with a pattern recurrence occurring once every 12 months (yearly).

Building the SARIMA Model

Now that we’ve performed some initial data exploration, let’s build and tune a SARIMA model to forecast the time series.

Hyperparameter selection

We select the p, d, q, P, D, and Q hyperparameters that best fit the SARIMA model largely through brute force–we sample different model hyperparameter variations until we find the best model combination. After generating each model, we use its AIC and BIC scores to gauge its performance. The AIC, or the Akaike information criterion, is a metric that compares the quality of a set of statistic models against one another (6). Similarly, the BIC, or Bayesian Information Criterion, is a Bayesian statistic used to compare different models (7). When comparing models using AIC or BIC metrics, we take the model with the lowest AIC and BIC score as the best option.

In the code snippet below, we generate different hyperparameter variations to run through the SARIMA model, and return a list to loop through:

def sarima_parameter_search(search_range, seasonal = [12]):
    Get all of the parameter combinations for a SARIMA model.
    p = q = d = range(0, search_range)
    trend = ['n','c','t','ct']
    pdq = list(itertools.product(p, d, q))
    pdq_combinations = [(x[0], x[1], x[2], x[3], x[4]) for x in list(itertools.product(p, d, q, seasonal, trend))]
    return pdq, seasonal_pdq_combinations

order_combos, seasonal_order_combos = sarima_parameter_search(search_range = 2)

In the above code snippet, we use the sarima_parameter_search() function to generate a list of different hyperparameter combinations, returned as the seasonal_pqd_combinations and pdq variables. The seasonal variable, set at a default value of 12, indicates that the the seasonal pattern repeats once every 12 months (yearly).

The trend variable has 4 possible options–‘n’, ‘c’, ‘t’, and ‘ct’. This variable controls the polynomial trend of the time series; ‘n’ means no trend (default), ‘c’ means that the trend is constant, ‘t’ indicates a linear trend with time, and ‘ct’ is both constant and linear (8).

The p, d, and q variables can vary between 0 and the search_range limit (exclusive)–in the above code, I set search_range to 2, so the parameter range varies between 0 and 1 (inclusive). This means that the max possible number of lagged terms for the associated AR and MA models is 1, and the max possible order of differencing is 1.

For more information on hyperparameter tuning SARIMA models, check out the following two tutorials, which give a great background on the process:

How to Grid Search SARIMA Hyperparameters for Time Series Forecasting
The Seasonal Autoregressive Integrated Moving Average, or SARIMA, model is an approach for modeling univariate timeā€¦machinelearningmastery.com

How to forecast sales with Python using SARIMA model
A step-by-step guide of statistic and python to time series forecastingtowardsdatascience.com

Training the SARIMA Model

Let’s first split our data into training and test sets. This way, we can build our model using the training set and gauge its performance using test data:

def time_series_train_test_split(time_series, train_split_fraction):
    Split the data into training and test set.
    split_index = int(round(time_series.shape[0]*train_split_fraction, 0))
    train_set = time_series[:split_index]
    test_set = time_series[:-split_index]
    return train_set, test_set

training_set, test_set = time_series_train_test_split(time_series = df['Geothermal_net_generation'], 
                                                      train_split_fraction = .75)

In the above code snippet, we use a standard 75/25 data split (training and test, respectively). We take the first 75 percent of the data points in the time series as the training set, and the last 25 percent of the data points as the test set.

NOTE: It is imperative that we only use one split when generating our training and test sets, as the data MUST maintain its continuity. This is because the SARIMA model depends on the data sequence to make forecasts.

Finally, it’s time to build our SARIMA model. We loop through each of the possible hyperparameter configurations, generating a SARIMA model. We use the AIC parameters for each possible model to gauge model performance. Remember–we want the model with the lowest AIC scores:

def seasonal_arima_model(time_series, order, seasonal_order, trend):
    Generate a seasonal ARIMA model using a set of hyperparameters. Returns the model fit, and the 
    associated model AIC and BIC values.
        model = sm_api.tsa.SARIMAX(time_series, 
                                   trend = trend,
        model_fit = model.fit()
        #Print the model results
        return model_fit, model_fit.aic, model_fit.bic
        print("Could not fit with the designated model parameters")
        return None, None, None

lowest_aic_val = 100000000000
#Generate  model for each of hyperparameter combination in a loop
for order_combo in order_combos:
    for seasonal_order_combo in seasonal_order_combos:
        #Convert the combination to list format
        seasonal_order_combo = list(seasonal_order_combo)
        #Generate the SARIMA model
        model_fit, model_aic, model_bic = seasonal_arima_model(time_series = training_set, 
                                                    order = order_combo, 
                                                    seasonal_order = seasonal_order_combo[0:4],
                                                    trend = seasonal_order_combo[-1])
        #Test model performance, and keep running tab of best performing model
        #Set with the newest value if the lowest_aic_value hasn't yet been calculated (on first run),
        #or if the newly calculated model AIC is lower than the lowest calculated AIC value
        if (model_aic < lowest_aic_val):
            lowest_aic_val = model_aic
            best_model = model_fit
            best_order = order_combo
            best_seasonal_order = seasonal_order_combo
#Print the best model parameters after the 
print("Best model paramaters: order-- ", best_order, ", seasonal order-- ", best_seasonal_order)

In the above snippet of code, we use the seasonal_arima_model() function to build the SARIMA model. If the model is successfully built, then the model summary is printed, and the model and its associated AIC and BIC scores are returned.

We call the seasonal_arima_model() function in a loop, where we sample each of the hyperparameter configurations. We set a default AIC value of 100,000,000,000 (really high), and replace it when we locate a model with a lower AIC score. As we move through the loop, we keep a tab on the model with the lowest associated AIC score, and save that model and its hyperparameter configuration (under the best_model, best_order, and best_seasonal_order variables). We take the SARIMA model stored under best_model at the end of the loop as our final model with the best hyperparameter configuration.

Interpreting the model results

Once the loop is complete, the best model configuration, as well as its summary, is printed in the Python console:

Best SARIMA Model Configuration
SARIMA (1,1,1)X(1,1,1,12) Model Summary

The model summary above contains the ‘Covariance Type’ graph, which depicts each of the variables’ impact on the forecast. We have four main lagged AR and MA variables. The first set of AR and MA variables is lagged by 1 time step (ar.L1 and ma.L1, respectively), and the second set is lagged by 12 time steps (ar.S.L12 and ma.S.L12).

Looking at the ‘P>abs(z)’ term in the graph, all variables read as 0. This is great, as we want our P values to be as close to 0 as possible. Using a cutoff of <0.05 for statistical significance, all of our lagged AR and MA terms significantly impact model forecast.

Testing Model Performance

Now that we have our SARIMA model, let gauge its performance using the test data!

In the code snippet below, we predict the values for the test set, forecasting out the total number of steps that are present in our test set. We then compare the predicted values to the actual values, using root mean squared error (RMSE) and mean absolute error (MAE) metrics. The lower the RMSE and MAE scores, the better the model fit. If the predicted values match the actual values exactly, both the RMSE and MAE equal 0.

For more information on root mean squared error and mean absolute error, check out this link.

def fit_predictions(model_fit, steps_out_to_predict, actual_values):
    This function predicts the SARIMA model out a certain designated number of steps,
    and compares the predictions to the actual values. The root mean squared error and
    the mean absolute error are calculated, comparing the predicted and actual values.
    The function returns the predicted values and their respective confidence intervals.
        model_fit:  SARIMA model.
        steps_out_to_predict: Int. Number of steps out to predict the time series.
        actual_values: Series of actual time series values.
        mean_predicted_values: Series of predicted time series values.
        confidence_interval_predicted_values: Dataframe, containing upper and lower thresholds of the
        confidence interval
    predicted_values = model_fit.get_forecast(steps=steps_out_to_predict)
    mean_predicted_values = predicted_values.predicted_mean
    confidence_interval_predicted_values = predicted_values.conf_int()
    #Compare the actual to the predicted values using RMSE and MAE metrics
    rmse, mae = quantify_rmse_mae(mean_predicted_values, actual_values)
    print("Root mean squared error: ", str(rmse))
    print("Mean absolute error: ", str(mae))
    return mean_predicted_values, confidence_interval_predicted_values
def quantify_rmse_mae(predicted_values, actual_values):
    This function calculates the root mean squared error and mean absolute error for 
    the predicted values, when compared to the actual values. These helps help us to
    gauge model performance. 
        predicted_values: Series of predicted time series values.
        actual_values: Corresponding series of actual time series values.
        rmse: Float. Root mean squared error.
        mae: Float. Mean absolute error.
    #calcuate the mean squared error of the model
    rmse = math.sqrt(mean_squared_error(actual_values, predicted_values))
    #Calculate the mean absolute error of the model 
    mae = mean_absolute_error(actual_values, predicted_values)
    #Return the MSE and MAE for the model
    return rmse, mae
#Run the data on the test set to gauge model performance
mean_predicted_values, confidence_interval_predicted_values = fit_predictions(best_model, 
RMSE and MAE outputs for the forecast

Running the code snippet above, the outputted RMSE and MAE for the model are approximately 868 and 860, respectively.

After calculating the RMSE and MAE metrics, we plot the predicted values against the actual values. This gives us a visual gauge of model performance:

def plot_results(mean_predicted_values, confidence_interval_predicted_values, time_series):
    This function plots actual time series data against SARIMA model-predicted values. 
    We include the confidence interval for the predictions. 
        mean_predicted_values: Series of float values. The model-predicted values.
        confidence_interval_predicted_values: Pandas dataframe, containing the lower and
        upper confidence intervals.
        time_series: Series of float values. Actual time series values that we want to graph
        None. Plot of the time series values, as well as the predicted values and associated 
        confidence interval.
    ax = time_series.plot(label='Observed')
    mean_predicted_values.plot(ax=ax, label = 'Forecast', alpha=.7, figsize=(14, 4))
                    confidence_interval_predicted_values.iloc[:, 0],
                    confidence_interval_predicted_values.iloc[:, 1], color='k', alpha=.2)
    ax.set_xlabel('Date Index')

#Plot the predictions against the real data
Time series forecast vs. actual time series values, at time series index t. Confidence intervals of the prediction are depicted in grey.

Looking at the graphic above, the model does a great job of forecasting out the time series by 140 time steps. This is almost twelve years out! We can see the seasonality of the forecast, which is accounted for by the ar.S.L12 and ma.S.L12 terms in the model.

This concludes my tutorial on generating and forecasting with Seasonal ARIMA models. Check out this Github repo for the full code covered in this tutorial.

Also, check out some of my other time series analysis articles:


  1. https://otexts.com/fpp2/AR.html
  2. https://newonlinecourses.science.psu.edu/stat510/lesson/2/2.1
  3. https://otexts.com/fpp2/MA.html
  4. https://people.duke.edu/~rnau/411arim.htm
  5. https://newonlinecourses.science.psu.edu/stat510/lesson/4/4.1
  6. https://www.statisticshowto.datasciencecentral.com/akaikes-information-criterion/
  7. https://www.statisticshowto.datasciencecentral.com/bayesian-information-criterion/
  8. https://www.statsmodels.org/dev/generated/statsmodels.tsa.statespace.sarimax.SARIMAX.html

One comment

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.