In this note we explore several mechanisms for constructing time series. For us a time series is a sequence of real numbers $y_t$, $t = 0, 1, 2, \dots$, where $t$ is the time index parameter. A model for a time series is a sequence of random variables $Y_t$, $t = 0, 1, 2, \dots$, defined on a common probability space $\Omega$.
Given a time series model we can generate many different time series (instances) by drawing random samples $Y_t(\omega)$ for $\omega \in \Omega$.
The following classes of time series will be discussed: IID noise, random walks and differences, smoothed series, Markov chains, two-step selector models, and state space models.
As we will see, these constructions allow us to create a great variety of different time series. Generating sample time series is a first step towards understanding these models. Time series analysis is concerned with reversing these constructions and inferring model type and parameters from time series data.
The easiest representation of a time series is a simple array:
y = [1,2,1,0,-3,-1]
Elements can be addressed with the syntax y[t]. The drawback of this representation is that the time series has an explicit, finite length, although it is supposed to go on forever. Also, the time indexing is very explicit: it is often not meaningful to ask for the sample at time $0$; instead we have a sample $y_t$ at the current time $t$, and we want, say, the next 5 samples.
It turns out that Python comes with a built-in iterator type that is well suited for representing time series. Iterators are simple objects I that provide a method I.next() that returns the next value of the series. Note that there is no way to access previously generated values. Iterator objects can be conveniently generated using the yield keyword.
The following example constructs a (uniform) white noise iterator:
# Example construction of an iterator
import numpy as np

def WhiteNoise():
    while True:
        yield np.random.random()

eps = WhiteNoise()
print eps.next(), eps.next(), eps.next()
0.963354997711 0.175658838635 0.995743134001
For convenience we provide a class Series which subclasses the iterator class and implements methods for infix operators, e.g. __add__. In this way we can write Z = Y + 4 for the time series with Z.next() = Y.next() + 4.
We have also written some plotting helper functions:
Plot(S) plots a time series (iterator).
FancyPlot(S) plots a time series together with its histogram.
The source code for these extensions can be found on GitHub.
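Since the Series class, the ToSeries decorator, and the Sample helper are used throughout the rest of this note but only defined in the GitHub sources, here is a minimal sketch of how they could look. This is an assumption for illustration (in particular the default sample length in Sample is a guess), not the actual implementation.
import itertools
import operator
import numpy as np

class Series(object):
    "Wraps an iterator and adds infix arithmetic on time series."
    def __init__(self, it):
        self.it = iter(it)
    def __iter__(self):
        return self
    def next(self):
        # Python 2 iterator protocol: delegate to the wrapped iterator
        return next(self.it)
    def _zipwith(self, other, op):
        # combine element-wise with another series, or with a constant
        other = other if isinstance(other, Series) else Series(itertools.repeat(other))
        def gen():
            while True:
                yield op(self.next(), other.next())
        return Series(gen())
    def __add__(self, other): return self._zipwith(other, operator.add)
    __radd__ = __add__
    def __mul__(self, other): return self._zipwith(other, operator.mul)
    __rmul__ = __mul__

def ToSeries(gen):
    "Decorator: make a generator function return a Series instead of a raw generator."
    def wrapper(*args, **kwargs):
        return Series(gen(*args, **kwargs))
    return wrapper

def Sample(S, N=100):
    "Draw the next N values of a series into a NumPy array (default N is a guess)."
    return np.array([S.next() for _ in range(N)])
With a wrapper of this kind, expressions like 3*eps + 5 or Ber(0.1)*Exp(0.1) below build new series lazily, and Sample turns the first N values into an array suitable for plotting and elementwise arithmetic.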
We start by constructing some fundamental examples of time series.
The constant series $Const(c)$ with value $c \in \mathbb{R}$ is defined as the series with $Y_t = c$ for all $t$.
Given a probability distribution $\mathcal{D}$ on the real line, we say that a process is $\mathcal{D}$-iid noise if $Y_t \sim \mathcal{D}$ for all $t$ and all random variables $Y_t$ are independent. The constant series is IID noise with distribution $\delta_c$.
Standard white noise $\epsilon$ is defined as $IID(\mathcal{N}(0,1))$ noise with standard Gaussian normal distribution.
Other important distributions for our purposes are the Bernoulli distribution $\mathcal{D} = Ber_p$, with $P(Y = 1) = p$ and $P(Y = 0) = 1-p$, and the binomial distribution $\mathcal{B}(n,p)$ with $P(Y = k) = \binom{n}{k} p^k (1-p)^{n-k}$ for $k = 0, \dots, n$.
from scipy import stats

@ToSeries
def IID(Dist):
    while True:
        yield Dist()

def Const(c):
    "Constant series"
    return IID(lambda: c)

def Ber(p):
    "Bernoulli noise"
    return IID(stats.bernoulli(p).rvs)

def B(n,p):
    "Binomial noise"
    return IID(stats.binom(n,p).rvs)

def N(mean, std):
    "Normal noise (scipy's norm takes the standard deviation)"
    return IID(stats.norm(mean, std).rvs)

def Exp(l):
    "Exponential noise with rate l"
    return IID(stats.expon(scale=1./l).rvs)

eps = N(0,1)
nul = Const(0)
# Normal Noise
FancyPlot(eps)
# Random Bernoulli spikes
FancyPlot(Ber(0.05), 'o--',N=500)
# Binomial spikes
FancyPlot(B(10,0.05),'o--',N=500)
# Binomial spikes with exponential height
FancyPlot(Ber(0.1)*Exp(0.1),'o--',N=500)
Our convenience class Series allows us to do simple arithmetic on time series using the familiar infix notation.
FancyPlot(3*eps+5)
FancyPlot(eps*eps)
Random walks are realized as cumulative sums over a series.
@ToSeries
def Walk(I):
    "Cumulative sum of the series I"
    s = 0
    for y in I:
        s += y
        yield s
# Standard random walk over white noise
FancyPlot(Walk(eps))
Plot(nul)
# random walk over squared white noise
FancyPlot(Walk(eps*eps))
# Walking a Bernoulli process
FancyPlot(Walk(Ber(0.01)))
# Walking spikes of random height
FancyPlot(Walk(Exp(1)*Ber(0.05)))
Differencing is the inverse of the cumulative sum.
@ToSeries
def Diff(I):
    "First differences of the series I"
    l = I.next()
    for y in I:
        yield y - l
        l = y

def DiffK(k,I):
    "Returns the k-fold iterated difference of I (negative k integrates)"
    if k == 0: return I
    if k > 0: return DiffK(k-1,Diff(I))
    if k < 0: return DiffK(k+1,Walk(I))
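DiffK is not exercised in the examples below; as a quick sanity check (using the Sample and Series helpers sketched above), a twofold difference should undo a twofold cumulative sum up to a two-sample delay:
# Sanity check for DiffK (assumes the Sample/Series helpers sketched above)
Y = Sample(eps, N=100)
Plot(Y,'g')                                # original samples in green
Plot(DiffK(2, Walk(Walk(Series(Y)))),'r')  # twofold difference of the twofold walk in red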
# Differences of white noise look like white noise
FancyPlot(Diff(eps))
# Differencing reverses a random walk (up to a one-sample delay)
Y = Sample(eps, N=100)
FancyPlot(Walk(Y),'b--') # random walk in blue
Plot(Y,'g') # original samples in green
Plot(Diff(Walk(Y)),'r') # differences in red
# Differencing a Bernoulli series
FancyPlot(Diff(Ber(0.01)))
from collections import deque

def Window(N,I):
    "Yields sliding windows of the last N values of I (newest first)"
    H = deque(maxlen=N)
    for i in range(N): H.appendleft(I.next())
    for y in I:
        yield list(H)
        H.appendleft(y)
@ToSeries
def Conv(C,I):
    "Convolution of a series I with a vector C"
    for W in Window(len(C),I):
        yield sum(map(lambda x: x[0]*x[1], zip(C[::-1],W)))

def Smooth(N,I):
    "Moving average over the last N samples"
    return Conv([1./N]*N,I)
@ToSeries
def ExpSmooth(l, I):
    "Exponential smoothing of I with smoothing factor l"
    g = I.next()
    for y in I:
        g = l * y + (1-l) * g
        yield g
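For reference, ExpSmooth implements the usual exponential smoothing recursion, which weights past samples with geometrically decaying weights:
$$g_t = \lambda\, y_t + (1-\lambda)\, g_{t-1}, \qquad g_t \approx \lambda \sum_{i \ge 0} (1-\lambda)^i\, y_{t-i} \quad \text{for large } t.$$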
# Smoothing a random walk
Y = Sample(Walk(eps), N = 100)
Plot(Y)
Plot(Smooth(10, Series(Y)))
Plot(ExpSmooth(0.1,Series(Y)))
# Exponential Smoothing of a Bernoulli series
Y = Sample(B(3,0.05), N=500)
FancyPlot(ExpSmooth(0.2,Series(Y)))
Plot(Y,'o--', alpha=0.5)
A Markov chain consists of a finite state space $S = \{1, \dots, N\}$, an initial distribution $\pi$ on $S$, and a transition matrix $T \in \mathbb{R}^{N \times N}$ whose rows sum to one.
A realization of a Markov chain is a sequence of random variables $Y_t \in S$, such that:
$$P(Y_0 = i) = \pi_i.$$
The transition probabilities are given by:
$$P(Y_{t+1} = j \mid Y_t = i) = T_{ij}.$$
The Markov property has to hold:
$$P(Y_{t+1} = j \mid Y_t = i_t, \dots, Y_0 = i_0) = P(Y_{t+1} = j \mid Y_t = i_t).$$
Markov chains are very easy to generate. We start with a random choice of the start state $y_0 \in S$, sampled according to $\pi$. Then we iteratively generate the further states $y_t$: if we are in state $i$, the next state is randomly sampled from $\{1, \dots, N\}$ according to the distribution given by the $i$-th row of $T$.
def RandIndex(W):
    """
    Returns a random index distributed according to the
    weight vector W, i.e. P[RandIndex(W) = i] = W[i].
    """
    t = np.random.random()
    for i,p in enumerate(np.cumsum(W)):
        if p > t: return i
@ToSeries
def Markov(T, pi=None, start=0):
    """
    Markov chain series with:
    T     - transition matrix (rows sum to one)
    pi    - initial distribution (vector of probabilities), optional
    start - initial state (used if pi is None)
    """
    # determine initial state
    if pi is None:
        s = start
    else:
        s = RandIndex(pi)
    # Walk in chain
    while True:
        yield s
        s = RandIndex(T[s])
def MBer(p):
    "Ber_p experiment as a Markov chain"
    return Markov(
        [[ 1-p, p ],
         [ 1-p, p ]]
    )
FancyPlot(MBer(0.1),'o--',N=100)
def M1(p):
    "Change state with probability p"
    return Markov(
        [[1-p, p ],
         [p, 1-p ]]
    )
FancyPlot(M1(0.1),'o--',N=100)
# Three state Markov model
def M2(p): return Markov(
    [[1-p, p , 0 ],  # state 0: change to 1 with prob p
     [0 , 1-p, p ],  # state 1: change to 2 with prob p
     [1 , 0 , 0 ]]   # state 2: drop back to 0
    )
FancyPlot(M2(0.1),'o--',N=100)
Given a time series model $S_t$ with discrete values in $\{1, \dots, K\}$, and for each $k = 1, \dots, K$ another time series model $Y^k_t$, we can produce a two-step model as follows. Depending on the value $S_t$ we select one of the time series $Y^1_t,\dots,Y^K_t$ to draw the sample from:
$$Y_t = Y^{S_t}_t.$$
@ToSeries
def Selector(S,D):
    """
    Two step time series model:
    S -- selector series with values in {0, ..., len(D)-1}
    D -- list of series to be selected from
    """
    for s in S:
        yield D[s].next()
We have $Y_t \sim \mathcal{N}(0, s^2)$, where $s = 1$ or $s = 10$ depending on a Bernoulli experiment.
FancyPlot(Selector(Ber(0.01),[eps, 10*eps]))
In this case the selector process $S_t$ is a Markov model.
S = Sample(M1(0.01))
FancyPlot(Selector(S, [eps, 5*eps]))
Plot(5*S,'r')
# Change in mean according to three step process
S = Sample(M2(0.01))
D = [eps, 5+eps, 10+eps]
FancyPlot(Selector(S, D))
Plot(5*S,'r')
# Change in variance according to three step process
S = Sample(M2(0.01))
D = [eps, 10*eps, 20*eps]
FancyPlot(Selector(S, D))
Plot(5*S,'r')
Instead of a discrete state sequence $S_t \in \{1,\dots, N\}$, we can consider continuous state sequences. A natural choice for the state space is a real vector space $V$. The state vector $S_t$ is a random process in $V$.
A natural choice for the transition functions are linear maps $T: V \to V$, together with an additive noise term:
$$S_{t+1} = T S_t + \eta_{t+1},$$
where $\eta_t \in V$ is a $V$-valued random noise process (with expectation $0$).
As above, the state sequence is not directly observable. Instead, the observations are derived from the state using a linear form plus an error term:
$$Y_t = \lambda(S_t) + \epsilon_t.$$
Here $\lambda: V \to \mathbb{R}$ is a fixed linear form, and $\epsilon_t$ is a noise process with expectation $0$ (e.g. white noise).
@ToSeries
def StateSpace(T,eta,l,eps,S0=None):
    """
    Implementation of a state space model with parameters:
    T   - state transition matrix
    eta - state error term (series of vectors, or 0)
    l   - observation function (vector)
    eps - observation error (series, or 0)
    S0  - initial state (defaults to the zero vector)
    """
    dim = len(T) # dimension of state space
    if S0 is None: S0 = [0]*dim
    if eta == 0: eta = Const([0] * dim)
    if eps == 0: eps = Const(0)
    S = S0
    while True:
        S = np.dot(T,S) + eta.next()
        yield np.dot(l,S) + eps.next()
Using the state space model we can reproduce a one-dimensional random walk: the state space is $V = \mathbb{R}$, the transition map is the identity $T = 1$, and the observation form is $\lambda = 1$.
The state noise $\eta_t$ is (scaled) white noise, and the observation noise is $\epsilon_t = 0$.
In total the update rule looks as follows:
$$S_{t+1} = S_t + \eta_{t+1}, \qquad Y_t = S_t.$$
RW = StateSpace([[1]], 0.1*eps, [1], 0)
FancyPlot(RW)
We consider a two dimensional state space with a rotating state vector. The observation map is the projection to the y axis.
alpha = 2 * np.pi / 50
T = [[ np.cos(alpha), -np.sin(alpha) ],
     [ np.sin(alpha),  np.cos(alpha) ]]
l = [0,1]
# two-dimensional standard normal state noise
Eps = IID(lambda: [stats.norm(0,1).rvs(), stats.norm(0,1).rvs()])
RW = StateSpace(T, Eps, l, eps)
FancyPlot(RW)