Time Series Cheat Sheet - Math Department

2
Plot Time Series Time Series Cheat Sheet RStudio® is a trademark of RStudio, Inc. • CC BY SA Yunjun Xia, Shuyu Huang • [email protected], [email protected] • Updated: 2019-10 Auto-correlation Filters Parameter Estimation Forecasting 1. tsplot(x=time, y=data) 2. plot(ts(data, start=start_time, frequency=gap)) 3. ts.plot(ts(data, start=start_time, frequency=gap)) Simulation X t = ϕ 1 X t1 + ϕ 2 X t2 +…+ ϕ p X tp + W t Autoregression of Order p Moving Average of Order q ARMA (p, q) X t = Z t + θ 1 Z t1 + θ 2 Z t2 +…+ θ q Z tp X t = ϕ 1 X t1 + ϕ 2 X t2 +…+ ϕ p X tp + Z t + θ 1 Z t1 + θ 2 Z t2 +…+ θ q Z tp arima.sim(model=list(ar=c( ), ma=c( )), n=n) Simulation of ARMA (p, q) ϕ 1 ,..., ϕ p θ 1 ,..., θ q Linear Filter: filter() Differencing Filter: diff() filter(data, filter=filter_coefficients, sides=2, method="convolution", circular=F) diff(data, lag=4, differences=1) Use ACF and PACF to detect model (Complete) Auto-correlation function: acf() acf(data, type=‘correlation’, na.action=na.pass) Partial Auto-correlation function: pacf() OR: acf(data, type=‘partial’, na.action=na.pass) pacf(data, na.action=na.pass) ar(): To estimate parameters of an AR model ar(x=data, aic=T, order.max = NULL, c("yule-walker", "burg", "ols", "mle", "yw")) arima(): To estimate parameters of an AM or ARMA model, and build model arima(data, order=c(p, 0, q),method=c(‘ML’)) AICc(): Compare models using AICC AICc(fittedModel) Fit an ARMA time series model to the data predict(arima_model, number_to_predict) Forecasting future observations given a fitted ARMA model predict(): Predict future observations given a fitted ARMA model Plot Predicted values and Confidence Interval: fit<-predict(arima_model, number_to_predict) ts.plot(data, xlim=c(1, length(data)+number_to_predict), ylim=c(0, max(fit$pred+1.96*fit$se))) lines(length(data)+1:length(data)+ number_to_predict, fit$pred) OR: autoplot(forecast(arima_model, level=c(95), h=number_to_predict))

Transcript of Time Series Cheat Sheet - Math Department

Page 1: Time Series Cheat Sheet - Math Department

Plot Time Series

Time Series Cheat Sheet

RStudio® is a trademark of RStudio, Inc. • CC BY SA Yunjun Xia, Shuyu Huang • [email protected], [email protected] • Updated: 2019-10

Auto-correlation

Filters

Parameter Estimation

Forecasting1. tsplot(x=time, y=data)

2. plot(ts(data, start=start_time, frequency=gap))

3. ts.plot(ts(data, start=start_time, frequency=gap))

Simulation

Xt = ϕ1Xt−1 + ϕ2Xt−2 + … + ϕpXt−p + Wt

Autoregression of Order p

Moving Average of Order q

ARMA (p, q)

Xt = Zt + θ1Zt−1 + θ2Zt−2 + … + θqZt−p

Xt = ϕ1Xt−1 + ϕ2Xt−2 + … + ϕpXt−p+Zt + θ1Zt−1 + θ2Zt−2 + … + θqZt−p

arima.sim(model=list(ar=c( ), ma=c( )), n=n)

Simulation of ARMA (p, q)

ϕ1, . . . , ϕpθ1, . . . , θq

Linear Filter: filter()

Differencing Filter: diff()

filter(data, filter=filter_coefficients, sides=2, method="convolution", circular=F)

diff(data, lag=4, differences=1)

Use ACF and PACF to detect model

(Complete) Auto-correlation function: acf()

acf(data, type=‘correlation’, na.action=na.pass)

Partial Auto-correlation function: pacf()

OR: acf(data, type=‘partial’, na.action=na.pass)

pacf(data, na.action=na.pass)

ar(): To estimate parameters of an AR model

ar(x=data, aic=T, order.max = NULL,

c("yule-walker", "burg", "ols", "mle", "yw"))

arima(): To estimate parameters of an AM or ARMA model, and build model

arima(data, order=c(p, 0, q),method=c(‘ML’))

AICc(): Compare models using AICC

AICc(fittedModel)

Fit an ARMA time series model to the data

predict(arima_model, number_to_predict)

Forecasting future observations given a fitted ARMA model

predict(): Predict future observations given a fitted ARMA model

Plot Predicted values and Confidence Interval:

fit<-predict(arima_model, number_to_predict)

ts.plot(data,

xlim=c(1, length(data)+number_to_predict),

ylim=c(0, max(fit$pred+1.96*fit$se)))

lines(length(data)+1:length(data)+

number_to_predict, fit$pred)

OR: autoplot(forecast(arima_model, level=c(95), h=number_to_predict))

Page 2: Time Series Cheat Sheet - Math Department

Basics

CC BY SA Christoph Sax • www.cynkra.com • Learn more at tsbox.help • package version 0.1.0 • Updated: 2019-04

Helper Functions

Class Agnostic Time Series with tsbox : : CHEAT SHEET Class Conversion

converter function ts-boxable class

ts_ts() ts, mtsts_data.frame(), ts_df() data.framets_data.table(), ts_dt() data.tablets_tbl() df_tbl, "tibble"ts_xts() xtsts_zoo() zoots_tibbletime() tibbletimets_timeSeries() timeSeriests_tsibble() tsibblets_tslist() a list with ts objects

tsbox is built around a set of converters, which convert time series of the following supported classes to each other:

COMBINE TIME SERIES

ts_c(mdeaths, austres)

collect time series of all classes and frequencies as multiple time series

wwwwts_bind(mdeaths, austres)

combine time series to a new, single time series (first series wins if overlapping)

ww wts_chain(mdeaths, austres)

like ts_bind, but extra- and retropolate, using growth rates

ww w

ts_trend(): Trend estimation based on loess500

550

600

650

1974 1975 1976 1977 1978 1979 1980

ts_pc(), ts_pcy(), ts_pca(), ts_diff(), ts_diffy(): (annualized) Percentage change rates or differences to previous period, year

-40

-20

0

20

40

60

1974 1975 1976 1977 1978 1979 1980

ts_scale(): normalize mean and variance

-1

0

1

2

3

1974 1975 1976 1977 1978 1979 1980

ts_lag(): Lag or lead of time series

0.4

0.6

0.8

1

1.2

1974 1975 1976 1977 1978 1979 1980

ts_index():Index, based on levels ts_compound(): Index, based on growth rates

Transform time series of all classes and frequencies

ts_frequency(fdeaths, "year")

ts_frequency(): convert to frequency

TRANSFORM

SPAN AND FREQUENCY

ts_span(): filter time series for a time span. ts_span(fdeaths, "1976-01-01") ts_span(fdeaths, "-5 year")

500

600

700

800

1974 1975 1976 1977 1978 1979 1980

ts_seas(): seasonal adjustment using X-13

ts_lag(fdeaths, 3) fdeaths

400

600

800

1000

1974 1976 1978 1980

530

540

550

560

570

580

590

1974 1975 1976 1977 1978 1979

400

600

800

1000

1975 1976 1977 1978 1979 1980

Time Series in data frames

Default structure to store multiple time series in long data frames (or data tables, or tibbles)

id time value

fdeaths 1974-01-01 901fdeaths 1974-02-01 689fdeaths 1974-03-01 827... ... ...

ts_df(ts_c(fdeaths, mdeaths))

tsbox auto-detects a value-, a time- and zero, one or several id-columns. Alternatively, the time- and the value-column can be explicitly named time and value. ts_default(): standardize column names in data frames

LONG STRUCTURE

RESHAPEts_wide(): convert default long structure to wide ts_long(): convert wide structure to default long

a <- ts_pc(AirPassengers)

Most functions in tsbox have the same structure:

returns a ts-boxable obect of the same class as input

IDEAtsbox provides a time series toolkit which: 1. works identically with most time series classes 2. handles regular and irregular frequencies 3. converts between classes and frequencies

first argument is any ts-boxable object

function starts with ts_

PLOT AND SUMMARIZEPlot time series of all classes and frequencies

ts_plot(mdeaths, austres) ts_ggplot(mdeaths, austres)

mdeaths austres

5000

10000

15000

1975 1980 1985 1990

USE WITH PIPE

library(dplyr) ts_c(fdeaths, mdeaths) %>% ts_tbl() %>% ts_trend() %>% ts_pc()

tsbox plays well with tibbles and with %>%, so it can be easily integrated into a dplyr/pipe workflow

ts_lag(fdeaths, 4)

ts_seas(fdeaths)

ts_index(fdeaths, base = 1976)

ts_scale(fdeaths)

ts_pc(fdeaths)

ts_trend(fdeaths)

pass return value as first argument to the next

function

ts_summary(ts_c(mdeaths, austres))

AUTO-DETECT COLUMN NAMES