Build Moving Average Process And Model From Scratch With Python

A train in Grand Canyon National Park, AZ. Some rights reserved.

When searching for how to create a moving average process and model, I found many tutorials on using the ARIMA model from Python's statsmodels library, but not many articles discussing how to do it with basic Python methods.

This blog discusses a basic way to fit an MA model and explains the MA process.

Moving average process

A moving average model uses past forecast errors in a regression-like model:

X_t = \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}
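
To make the definition concrete, here is a minimal simulation sketch of a general MA(q) process. The function simulate_ma and its parameters are illustrative names, not part of the original post.

import numpy as np

def simulate_ma(mu, thetas, n, sigma=1.0):
    """Simulate n observations of
    X_t = mu + eps_t + theta_1*eps_{t-1} + ... + theta_q*eps_{t-q}."""
    q = len(thetas)
    # draw n + q innovations so every observation has a full lag history
    eps = np.random.normal(0, sigma, size=n + q)
    x = np.empty(n)
    for t in range(n):
        # eps[t + q] is the current innovation; eps[t + q - 1 - j] is the (j+1)-th lag
        x[t] = mu + eps[t + q] + sum(thetas[j] * eps[t + q - 1 - j] for j in range(q))
    return x

# example: an MA(2) series with theta_1 = 0.5, theta_2 = 0.3
x = simulate_ma(mu=50, thetas=[0.5, 0.3], n=300, sigma=0.5)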

Create data by simulating the MA(1) process

Assume the mean of the data is 50, the standard deviation of the noise is 0.5, and there are 300 elements. The formula used to create the dataset is yt = μ + εt - θ * εt-1, with μ = 50 and θ = 0.4. (Note the minus sign in front of θ, which matches the code below; in the standard "plus" form of the MA(1) model this is a coefficient of -0.4.)

First, generate the residuals (white noise).

import numpy as np
import pandas as pd

residuals = np.random.normal(0, 0.5, size=300)
df = pd.DataFrame(residuals, columns=['residuals'])
df.head()

   residuals
0   0.689463
1   0.623604
2   1.134742
3   0.246117
4  -0.557622

Now simulate the MA(1) process using the residuals above.

# mean value
mu = 50

# y_t = mu + eps_t - 0.4 * eps_{t-1}
df['y'] = mu + df['residuals'] - 0.4 * df['residuals'].shift(1)
# shift(1) leaves a NaN in the first row, so set y_0 = mu + eps_0
df.loc[0, 'y'] = df.loc[0, 'residuals'] + mu

data = df['y'].copy()
data.head()

# output 
# 0    50.689463
# 1    50.347819
# 2    50.885301
# 3    49.792220
# 4    49.343931
# Name: y, dtype: float64
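
As a quick sanity check (this step is not in the original post), an MA(1) series yt = μ + εt - θ * εt-1 has lag-1 autocorrelation -θ / (1 + θ²), which is about -0.345 for θ = 0.4, and essentially zero autocorrelation at higher lags.

# theoretical lag-1 autocorrelation for y_t = mu + eps_t - theta * eps_{t-1}
theta = 0.4
print(-theta / (1 + theta ** 2))  # about -0.345

# sample autocorrelations of the simulated series
print(data.autocorr(lag=1))  # should be near -0.345
print(data.autocorr(lag=2))  # should be near 0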

Fit the data to an MA(1) process

Calculate estimated mean

mu_estimated = data.mean()
mu_estimated
# output
# 50.02046764676418

The mean value μ is now estimated, and we still need to estimate the coefficient θ. We can find its value by minimizing the in-sample sum of squared errors.

Before doing this, the value of the error at time 0 must be chosen. By a property of the MA process, the initial choice of ε affects the forecasts less and less the further we move from time 0, so any reasonable guess for ε at time 0 will do. Here we set ε0 = 0 (its value is unimportant for large samples).
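
To see why this initial choice matters so little (a quick illustration, not in the original post): unrolling the error recursion εt = yt - μ + θ * εt-1 shows that ε0 enters the estimate of εt with weight θ^t, which decays geometrically when |θ| < 1.

# weight of the initial guess eps_0 in the estimated error at time t is theta**t
theta = 0.4
for t in [1, 5, 10, 20]:
    print(t, theta ** t)
# 1   0.4
# 5   0.01024
# 10  0.000105 (approximately)
# 20  1.1e-08  (approximately)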

To minimize the sum of squared errors, scipy.optimize.minimize from the SciPy library is used.

from scipy import optimize

# create function to calculate the sum of squared errors (SSE)
def error_sse(coef):
    estimated_errors = []
    for i in range(len(data)):
        if i == 0:
            # initial error: eps_0 is assumed to be 0
            estimated_errors.append(data[i] - mu_estimated)
        else:
            # from y_t = mu + eps_t - theta * eps_{t-1},
            # the error recursion is eps_t = y_t - mu + theta * eps_{t-1}
            estimated_errors.append(data[i] - mu_estimated + coef * estimated_errors[i - 1])
    return sum(map(lambda x: x * x, estimated_errors))

# find coefficient θ
initial_guess = 0
result = optimize.minimize(error_sse, initial_guess)
if result.success:
    fitted_params = result.x
    print(fitted_params)
else:
    raise ValueError(result.message)

The output value is 0.37322, which is close to the true coefficient value of 0.4. The fitted model is

yt = 50.02 + εt - 0.373 * εt-1

(the data was generated from yt = 50 + εt - 0.4 * εt-1)

To verify the result, we can also fit the model using the ARIMA class from statsmodels.

# note: the arima_model module was removed in statsmodels 0.13;
# in newer versions use statsmodels.tsa.arima.model.ARIMA instead
from statsmodels.tsa.arima_model import ARIMA
model = ARIMA(data, order=(0, 0, 1))
model_fit = model.fit(disp=0)
print(model_fit.summary())
                              ARMA Model Results
==============================================================================
Dep. Variable:                      y   No. Observations:                  300
Model:                     ARMA(0, 1)   Log Likelihood                -214.766
Method:                       css-mle   S.D. of innovations              0.495
Date:                Sat, 03 Nov 2018   AIC                            435.533
Time:                        16:25:57   BIC                            446.644
Sample:                             0   HQIC                           439.980

==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const         50.0204      0.018   2817.901      0.000      49.986      50.055
ma.L1.y       -0.3801      0.065     -5.865      0.000      -0.507      -0.253
                                    Roots
=============================================================================
                  Real          Imaginary           Modulus         Frequency
-----------------------------------------------------------------------------
MA.1            2.6310           +0.0000j            2.6310            0.0000
-----------------------------------------------------------------------------

The result is close to the one obtained by minimizing the sum of squared errors. Note that statsmodels writes the model in the standard "plus" form yt = μ + εt + θ * εt-1, so its ma.L1 coefficient of -0.3801 corresponds to the 0.373 we found with the sign flipped.

Forecasting with the MA model

To get the forecasted value of yt+1, the expected value of εt+1 is used, which is 0.

The forecast is therefore yt+1 = μ - θ * εt, while yt+2 = μ, and so on. The forecast standard error grows from one step ahead to two steps ahead (where the unknown εt+1 comes into play) and then stays constant; correspondingly, the confidence interval of the predicted value widens once and then keeps the same width.
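
A minimal sketch of these forecasts using the values estimated above (theta_hat and the rebuilt error list are illustrative names, not from the original code):

theta_hat = fitted_params[0]  # about 0.373 from the optimizer

# rebuild the in-sample errors with the fitted coefficient
errors = []
for i in range(len(data)):
    if i == 0:
        errors.append(data[i] - mu_estimated)
    else:
        errors.append(data[i] - mu_estimated + theta_hat * errors[i - 1])

# one-step-ahead forecast uses the last estimated error
print(mu_estimated - theta_hat * errors[-1])
# from two steps ahead onward, the forecast is just the mean
print(mu_estimated)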

Forecasts revert quickly to the series mean, and the prediction intervals widen slightly before leveling off as we extrapolate further.
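
The same behavior can be seen from the statsmodels fit. In the old arima_model API used above, forecast() returns the point forecasts, their standard errors, and confidence intervals (the numbers in the comments below are approximate):

# forecast 5 steps ahead
forecast, stderr, conf_int = model_fit.forecast(steps=5)
print(stderr)    # roughly [0.495, 0.530, 0.530, 0.530, 0.530]:
                 # the standard error grows once, then stays flat
print(conf_int)  # intervals widen from step 1 to step 2, then keep the same width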