Build Moving Average Process And Model From Scratch With Python
A train in Grand Canyon National Park, AZ. Some rights reserved.
When searching for how to create a moving average process and model, I found there are many tutorials about using ARIMA model of Python Statsmodels library. But there are not many articles discussing the way to solve it using basic Python methods.
This blog discusses a basic way to fit an MA model and explain MA process.
Moving Average process
A moving average is the model using past forecast errors in a regression-like model.
Create data by simulating the MA(1) process
Assume the mean of data is 50, standard variance is 0.5 and there are 300 elements. The formula used to create the dataset is yt = μ + εt + θ * εt-1. Here μ = 50 and θ = 0.4.
Residuals can be created.
residuals = np.random.normal(0, 0.5, size=300)
df = pd.DataFrame(residuals, columns=['residuals'])
df.head()
residuals | |
---|---|
0 | 0.689463 |
1 | 0.623604 |
2 | 1.134742 |
3 | 0.246117 |
4 | -0.557622 |
Create the data that was derived from a simulation of an MA(1) process using the residuals above.
# mean value
mu = 50
df['y'] = mu + df['residuals'] - 0.4 * df['residuals'].shift(1)
df.loc[0, 'y'] = df.loc[0, 'residuals'] + mu
data = df['y'].copy()
data.head()
# output
# 0 50.689463
# 1 50.347819
# 2 50.885301
# 3 49.792220
# 4 49.343931
# Name: y, dtype: float64
Fit the data to an MA(1) process
Calculate estimated mean
mu_estimated = data.mean()
mu_estimated = data.mean()
mu_estimated
# output
# 50.02046764676418
Mean value μ is estimated and we still need to estimate coefficient θ. We can find the value by minimizing the in sample sum of squared errors.
Before doing this, the value of error at time 0 should be decided. Based on property of MA process, the effects of the initial choice of ε effect the forecasts less and less the further away we move from time 0. We can use any reasonable guess for ε at time 0. Here we set 0 as the value of ε at time 0. (the value of ε0 is not important for the large samples).
To minimize the sum of squared errors, the method Scipy.optimize is used.
# create function to calculate mse
def error_se(coef):
estimated_errors = []
for i in range(len(data)):
if i == 0:
estimated_errors.append(data[i] - mu_estimated)
else:
estimated_errors.append(data[i] - mu_estimated + coef * estimated_errors[i - 1])
return sum(map(lambda x:x*x,estimated_errors))
# find coefficient θ
initial_guess = 0
result = optimize.minimize(error_mse, initial_guess)
if result.success:
fitted_params = result.x
print(fitted_params)
else:
raise ValueError(result.message)
The output value is 0.37322 which is close to the original coefficient value which is 4. The formula from the result is
yt = 50.02 + εt + 0.373 * εt-1
(original formula is yt = 50 + εt + 0.4 * εt-1)
To verify the result, we can also fit the model using ARIMA method from Statsmodels.
from statsmodels.tsa.arima_model import ARIMA
model = ARIMA(data, order=(0,0,1))
model_fit = model.fit(disp=0)
print(model_fit.summary())
ARMA Model Results
==============================================================================
Dep. Variable: y No. Observations: 300
Model: ARMA(0, 1) Log Likelihood -214.766
Method: css-mle S.D. of innovations 0.495
Date: Sat, 03 Nov 2018 AIC 435.533
Time: 16:25:57 BIC 446.644
Sample: 0 HQIC 439.980
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
const 50.0204 0.018 2817.901 0.000 49.986 50.055
ma.L1.y -0.3801 0.065 -5.865 0.000 -0.507 -0.253
Roots
=============================================================================
Real Imaginary Modulus Frequency
-----------------------------------------------------------------------------
MA.1 2.6310 +0.0000j 2.6310 0.0000
-----------------------------------------------------------------------------
The result is pretty close to the results by minimizing of the sum of squared errors.
Forecasting of MA model
To get the forecasted value of yt+1, expected value of εt+1 is used, which is 0.
Forecasted yt+1 = μ + θ * εt and yt+2 = μ and so on. Standard error will increase when error at last time is not 0 and then not change when error at last time is 0. At the same time the confidence interval of predicted value will become wider and then stay consistent in the end.
Forecasts revert quickly to series mean and prediction intervals open as extrapolate.