- Practical Time Series Analysis
- Dr. Avishek Pal Dr. PKS Prakash
- 2021-07-08 10:18:26
General trend
When a time series exhibits an upward or downward movement in the long run, it is said to have a general trend. A quick way to check for the presence of a general trend is to plot the time series, as in the following figure, which shows CO2 concentrations in air measured from 1974 through 1987:

However, a general trend might not be evident over a short run of the series. Short-run effects such as seasonal fluctuations and irregular variations cause the time series to revisit lower or higher values observed in the past, and hence can temporarily obscure any general trend. This is evident in the same time series of CO2 concentrations when zoomed in over the period of 1979 through 1981, as shown in the following figure. Hence, to reveal a general trend, we need a time series that goes back substantially into the past.
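The effect described above can be illustrated with a minimal sketch on made-up data. The slope, seasonal amplitude, and window positions below are assumptions chosen for illustration and are not taken from the CO2 dataset. The sketch compares how much the trend rises over a window with the peak-to-peak seasonal swing in the same window:

```python
import numpy as np

# Hypothetical monthly series components (illustrative values only)
months = np.arange(168)                            # 14 years of monthly points
trend = 330.0 + 0.1 * months                       # assumed linear trend
seasonal = 3.0 * np.sin(2 * np.pi * months / 12)   # assumed annual cycle

def trend_to_seasonal_ratio(start, stop):
    """Rise of the trend over [start, stop) divided by the seasonal swing there."""
    trend_rise = trend[stop - 1] - trend[start]
    window = seasonal[start:stop]
    seasonal_swing = window.max() - window.min()
    return trend_rise / seasonal_swing

short_ratio = trend_to_seasonal_ratio(60, 84)   # a two-year window
long_ratio = trend_to_seasonal_ratio(0, 168)    # the full 14 years
print(short_ratio, long_ratio)
```

In this toy setting the ratio is below 1 for the two-year window, so the seasonal swing dominates, and above 1 for the full span, where the trend dominates.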

The general trend in the time series is due to fundamental shifts or systemic changes in the process or system it represents. For example, the upward movement of CO2 concentrations from 1974 through 1987 can be attributed to the gradual rise in the number of automobiles and in industrialization over these years.
A general trend is commonly modeled by setting up the time series as a regression against time and other known factors as explanatory variables. The regression or trend line can then be used as a prediction of the long-run movement of the time series. Residuals left by the trend line are further analyzed for other interesting properties such as seasonality, cyclical behavior, and irregular variations.
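In its simplest form, with time as the only explanatory variable, the regression described above can be written as follows. The symbols are notation assumed here for illustration: x_t is the observed value at time index t, beta_0 and beta_1 are the intercept and slope of the trend line, and epsilon_t is the part not explained by the trend.

```latex
x_t = \beta_0 + \beta_1 t + \epsilon_t, \qquad t = 1, 2, \ldots, n
\hat{x}_t = \hat{\beta}_0 + \hat{\beta}_1 t, \qquad r_t = x_t - \hat{x}_t
```

The second line gives the fitted trend line and the residuals r_t, which are what remain to be examined for seasonality, cyclical behavior, and irregular variations.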
Now, let us go through the code that generated the preceding plots on CO2 concentrations. We will also show how to build a trend model using linear regression on the time index (which in this case is the index of the year in the data) as explanatory variable and the CO2 concentration as the dependent variable. But first, let us load the data in a pandas.DataFrame.
We start by importing the required packages as follows:
from __future__ import print_function
import os
import pandas as pd
import numpy as np
# IPython/Jupyter magic; omit this line when running as a plain script
%matplotlib inline
from matplotlib import pyplot as plt
import seaborn as sns
# Raw string so the backslash in the Windows path is not treated as an escape
os.chdir(r'D:\Practical Time Series')
data = pd.read_excel('datasets/Monthly_CO2_Concentrations.xlsx',
                     converters={'Year': np.int32, 'Month': np.int32})
data.head()
We have passed the argument converters to the read_excel function in order to make sure that columns Year and Month are assigned the integer (np.int32) datatype. The preceding lines of code will generate the following table:

Before plotting, we must remove all rows having missing values. In addition, the DataFrame is sorted in ascending order of Year and Month. These steps are done as follows:
# Keep only the rows where CO2, Year, and Month are all present
data = data.loc[(~pd.isnull(data['CO2'])) &
                (~pd.isnull(data['Year'])) &
                (~pd.isnull(data['Month']))]
# Sort chronologically
data = data.sort_values(['Year', 'Month'])
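As an alternative sketch of the same cleaning step, pandas offers `dropna` with a `subset` argument, which expresses "drop rows with a missing value in any of these columns" directly. The small frame below is made up for illustration and only stands in for the CO2 data:

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the CO2 data; the values are illustrative only
toy = pd.DataFrame({
    'CO2': [333.1, np.nan, 332.0, 331.5],
    'Year': [1975, 1974, 1974, 1974],
    'Month': [1, 6, 12, 5],
})

# Drop rows with a missing value in any listed column, then sort chronologically
cleaned = (
    toy.dropna(subset=['CO2', 'Year', 'Month'])
       .sort_values(['Year', 'Month'])
)
print(cleaned)
```

Note that sorting keeps the original index labels, so the index of the sorted frame is no longer guaranteed to run in increasing order unless it is reset.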
Finally, the plot for the time period 1974 to 1987 is generated by executing the following lines:
plt.figure(figsize=(5.5, 5.5))
data['CO2'].plot(color='b')
plt.title('Monthly CO2 concentrations')
plt.xlabel('Time')
plt.ylabel('CO2 concentration')
plt.xticks(rotation=30)
The zoomed-in version of the data for the time period 1980 to 1981 is generated by first filtering the DataFrame for these years:
plt.figure(figsize=(5.5, 5.5))
data['CO2'].loc[(data['Year']==1980) | (data['Year']==1981)].plot(color='b')
plt.title('Monthly CO2 concentrations')
plt.xlabel('Time')
plt.ylabel('CO2 concentration')
plt.xticks(rotation=30)
Next, let us fit the trend line. For this we import the LinearRegression class from scikit-learn and fit a linear model on the time index:
from sklearn.linear_model import LinearRegression
# The time index is passed as a 2-D array of shape (n_samples, 1)
trend_model = LinearRegression(fit_intercept=True)
trend_model.fit(np.array(data.index).reshape((-1, 1)), data['CO2'])
print('Trend model coefficient={} and intercept={}'.format(trend_model.coef_[0],
                                                           trend_model.intercept_))
This produces the following output:
Trend model coefficient=0.111822078545 and intercept=329.455422234
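To read these numbers, it can help to plug them back into the trend line. The sketch below uses only the coefficient and intercept reported above. It assumes that the DataFrame index is the default integer row index and that consecutive rows are one month apart, which is suggested by the monthly dataset but not stated explicitly; the index value 100 is an arbitrary example.

```python
# Values copied from the output reported in the text
coef = 0.111822078545
intercept = 329.455422234

def trend_value(t):
    """Trend-line value at time index t (t counts rows, assumed to be months)."""
    return intercept + coef * t

# Under the one-row-per-month assumption, 12 index steps correspond to one year
change_per_12_steps = coef * 12
print(round(change_per_12_steps, 3))   # 1.342
print(round(trend_value(100), 3))      # 340.638
```

Under that assumption, the fitted line rises by roughly 1.34 units of CO2 concentration per year.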
The residuals obtained from the trend line model are shown in the following figure and appear to have seasonal behavior, which is discussed in the next subsection.
The residuals shown in the preceding figure are calculated and plotted by the following lines of code:
residuals = np.array(data['CO2']) - trend_model.predict(np.array(data.index).reshape((-1,1)))
plt.figure(figsize=(5.5, 5.5))
pd.Series(data=residuals, index=data.index).plot(color='b')
plt.title('Residuals of trend model for CO2 concentrations')
plt.xlabel('Time')
plt.ylabel('CO2 concentration')
plt.xticks(rotation=30)
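One simple way to check whether residuals like these carry a seasonal pattern is to average them by calendar month. The following is a minimal sketch on made-up data: the series, its slope, and its seasonal amplitude are assumptions, and `np.polyfit` is used here in place of scikit-learn only to keep the example short.

```python
import numpy as np
import pandas as pd

# Made-up monthly series: assumed linear trend plus an annual cycle
months = np.arange(120)
toy = pd.DataFrame({
    'Month': months % 12 + 1,
    'CO2': 330.0 + 0.1 * months + 3.0 * np.sin(2 * np.pi * months / 12),
})

# Fit a straight line on the time index and compute the residuals
slope, intercept = np.polyfit(months, toy['CO2'].to_numpy(), deg=1)
toy['residual'] = toy['CO2'] - (intercept + slope * months)

# Average residual per calendar month; a flat profile would suggest no seasonality
monthly_profile = toy.groupby('Month')['residual'].mean()
print(monthly_profile.round(2))
```

If the monthly averages differ markedly from one another, as they do in this toy series, the residuals still contain a seasonal component that the trend line has not captured.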
