Market Analysis


An application of Wavelet Theory by LSTM-RNN-CNN Architecture

This project is an automated trading bot that can take both long and short positions on the stock market, using the theoretical framework of wavelet theory to inform its trades.

Agent Performance

Using the most recent 150 days of our data as validation on a random sample of tickers from the S&P-1500, the agent has a relative profit of ~0.05% per day compared to the underlying asset's performance (i.e. buy and hold). This translates to beating the underlying asset by ~20% over the year, assuming compound gains from reinvestment.
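As a rough check of the compounding arithmetic, assume the ~0.05% daily edge is reinvested every calendar day for 365 days. The day count is an assumption here, since the text does not state it.

$$(1 + 0.0005)^{365} \approx 1.20$$

Under a different assumption of 252 trading days per year, the same daily edge would compound to roughly $1.13$.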

Source Code

The source code can be found here

Features For Each Stock Ticker

  • Open, Close, High, Low, EMA(12), EMA(26), MA(6), DI, MACD, and PSY(13)

The above features are computed on data that has been differenced once, i.e. $\widetilde{X}_t = \Delta X_t = X_t - X_{t-1}$. This is done because financial instruments tend to show a linear trend over a short time period; this can be tested using a stationarity test such as the Augmented Dickey-Fuller test. Given that the trading bot cares more about price direction/action than price level, the first difference gives us more relevant information.

Furthermore, the features were chosen to give a wide variety of information upfront to improve training efficiency.
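A minimal sketch of the differencing step followed by two of the listed indicators is shown here. The price series is hypothetical, and the EMA seeding and the MACD definition (fast EMA minus slow EMA) are common conventions assumed for illustration, not necessarily the ones used in the project.

```python
def first_difference(xs):
    """Return [x_1 - x_0, x_2 - x_1, ...]; one element shorter than xs."""
    return [b - a for a, b in zip(xs, xs[1:])]


def ema(xs, span):
    """Exponential moving average with smoothing factor 2 / (span + 1).

    Seeding with the first observation is an assumption of this sketch.
    """
    alpha = 2.0 / (span + 1)
    out = [xs[0]]
    for x in xs[1:]:
        out.append(alpha * x + (1.0 - alpha) * out[-1])
    return out


def macd(xs, fast=12, slow=26):
    """MACD line: fast EMA minus slow EMA (assumed definition)."""
    return [f - s for f, s in zip(ema(xs, fast), ema(xs, slow))]


# Hypothetical closing prices: a linear trend (slope 0.5) plus a small zig-zag.
closes = [100.0 + 0.5 * t + (1.0 if t % 2 else -1.0) for t in range(60)]

# After differencing, the values oscillate around the slope instead of drifting.
d_close = first_difference(closes)

features = {
    "ema12": ema(d_close, 12),
    "ema26": ema(d_close, 26),
    "macd": macd(d_close),
}
```

The indicators are computed on `d_close` rather than on `closes`, mirroring the order described above.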


Theoretical Motivation

Usually the focus of analysis on a time series stays within the time-domain, but there's a set of tools and methods available for the frequency-domain. Given a zero-mean stationary time series $X_t$, we can use the spectral representation of a stationary process to write

$$X_t = \int_{(-\pi, \pi]} e^{it\lambda}\,dZ(\lambda), \quad -\pi < \lambda \leq \pi$$

where $Z(\lambda)$ is a complex-valued process with uncorrelated increments. In essence, this means we can represent any stationary time series in the frequency-domain without losing any information (and vice versa).
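The "no information lost" point can be illustrated with a finite sample. The sketch below uses the discrete Fourier transform as a stand-in for the stochastic integral (an assumption made for illustration) and checks that transforming to the frequency domain and back recovers a hypothetical sample up to rounding error.

```python
import cmath


def dft(x):
    """Discrete Fourier transform: time domain -> frequency domain."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(coeffs):
    """Inverse DFT: frequency domain -> time domain."""
    n = len(coeffs)
    return [
        sum(coeffs[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


# Hypothetical zero-mean sample.
x = [0.3, -1.2, 0.8, 0.1, -0.5, 1.1, -0.4, -0.2]

x_back = idft(dft(x))

# Largest round-trip discrepancy; only floating-point rounding remains.
max_error = max(abs(a - b) for a, b in zip(x, x_back))
```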

This is of particular importance when a process is known to exhibit some form of periodic behavior. Suppose $X_t$ is a stochastic periodic series, i.e.

$$X_t = A \cos(\omega t) + B \sin(\omega t);\enspace A, B \sim N(\mu_{A, B}, \sigma_{A, B})$$

Given that $\omega$ is a constant, fix $\omega = \pi$ for simplicity. Then we have

$$X_t = A \cos(\pi t) + B \sin(\pi t)$$

We see that for $t = 0$

$$X_0 = A \cos(0) + B \sin(0) = A$$

And for $t = 2$

$$X_2 = A \cos(2\pi) + B \sin(2\pi) = A$$

In fact, $\forall t: t = 2k;\enspace k \in \Z$

$$X_t = A,\enspace E(X_t) = \mu_A$$

And $\forall t: t = 2k + 1;\enspace k \in \Z$, since $\cos(\pi t) = -1$ and $\sin(\pi t) = 0$,

$$X_t = -A,\enspace E(X_t) = -\mu_A$$

A stochastic periodic function returns to a particular mean at given intervals. The frequency-domain would then give us insight into when $X_t$ completes one full period, i.e. it would spike at the frequency corresponding to a period of $t = 2$. While this is trivial, let

$$Y_t = C \cos(2\omega t) + D \sin(2\omega t);\enspace C, D \sim N(\mu_{C, D}, \sigma_{C, D})$$
$$W_t = X_t + Y_t$$

The frequency-domain for $W_t$ would give us spikes at the frequencies corresponding to periods $t = 1, 2$. Generalizing this, the frequency domain allows us to extract periodic components/signals from a time series. If we wanted to give it an upward trend, such that on average it would increase over time, all we need to do is add a term linear in $t$.

$$\widetilde{W}_t = X_t + Y_t + ct,\enspace c \in \R^+$$

Note that under a difference operator $\Delta$

$$\begin{aligned} \Delta \widetilde{W}_t &= \widetilde{W}_t - \widetilde{W}_{t-1} \\ &= X_t + Y_t + ct - X_{t-1} - Y_{t-1} - c(t - 1) \\ &= \Delta X_t + \Delta Y_t + c \end{aligned}$$

Without loss of generality, we have

$$\begin{aligned} \Delta X_t &= X_{t} - X_{t-1} \\ &= A_t \cos(\pi t) + B_t \sin(\pi t) - A_{t-1} \cos(\pi (t - 1)) - B_{t-1} \sin(\pi (t - 1)) \end{aligned}$$

which is another periodic stochastic time series.
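The differencing identity above can be checked numerically. This is a sketch under stated assumptions: the amplitudes are one fixed, hypothetical draw held constant over time, and the base frequency $2\pi/16$ and slope $0.25$ are arbitrary choices made for illustration.

```python
import math

OMEGA = 2 * math.pi / 16          # assumed base frequency (period 16)
A, B, C, D = 1.0, 0.5, 0.8, -0.3  # one fixed, hypothetical draw of the amplitudes
SLOPE = 0.25                      # the trend coefficient c


def x(t):
    return A * math.cos(OMEGA * t) + B * math.sin(OMEGA * t)


def y(t):
    return C * math.cos(2 * OMEGA * t) + D * math.sin(2 * OMEGA * t)


def w_trend(t):
    """The trending series: X_t + Y_t + c * t."""
    return x(t) + y(t) + SLOPE * t


ts = range(1, 65)  # four full periods of the slower component

# Left side: difference the trending series directly.
lhs = [w_trend(t) - w_trend(t - 1) for t in ts]
# Right side: difference each periodic part and add the constant c.
rhs = [(x(t) - x(t - 1)) + (y(t) - y(t - 1)) + SLOPE for t in ts]

max_gap = max(abs(a - b) for a, b in zip(lhs, rhs))

# Over whole periods the periodic differences cancel, leaving only the slope.
mean_diff = sum(lhs) / len(lhs)
```

The linear trend survives differencing only as the constant `SLOPE`, which is why the differenced series no longer drifts.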

If the periodicity is fixed as above, then the intervals are regular and we can use the spectral Fourier transform. However, not all time series exhibit regular periodic intervals, especially financial instruments, and hence we will have to use a more generalized approach: wavelets.
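For the fixed-period case, a plain discrete Fourier transform is enough to recover the periods of a two-component series such as $W_t$. In this sketch the base frequency is set to $2\pi/16$ rather than $\pi$, an assumption made so that both components stay below the Nyquist frequency when sampled at integer $t$, and the amplitudes are again a fixed, hypothetical draw.

```python
import cmath
import math

N = 64                            # number of samples (four full slow periods)
OMEGA = 2 * math.pi / 16          # assumed base frequency (period 16)
A, B, C, D = 1.0, 0.5, 0.8, -0.3  # one fixed, hypothetical draw of the amplitudes

# W_t = X_t + Y_t sampled at t = 0, 1, ..., N - 1.
w = [
    A * math.cos(OMEGA * t) + B * math.sin(OMEGA * t)
    + C * math.cos(2 * OMEGA * t) + D * math.sin(2 * OMEGA * t)
    for t in range(N)
]


def dft_magnitudes(x):
    """Magnitude of the DFT at bins 0 .. n/2 (the non-redundant half)."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


mags = dft_magnitudes(w)

# The two strongest non-constant bins, in ascending order of frequency.
strongest = sorted(range(1, len(mags)), key=lambda k: mags[k], reverse=True)[:2]
peak_bins = sorted(strongest)

# Bin k corresponds to a period of N / k samples.
periods = [N / k for k in peak_bins]
```

Each periodic component shows up as its own spike, and the spike's position gives the component's period.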

A Wavelet

A wavelet allows us to operate on the frequency-domain. We start by introducing two functions: a mother wavelet $\psi(t)$ (the wavelet function) and a father wavelet $\phi(t)$ (the scaling function).

A low-pass and a high-pass filter can be derived from the father and mother wavelet functions respectively: the mother wavelet is a high-pass filter and the father wavelet is a low-pass filter. A standard example of a low-pass filter is a smoothing convolution, say a moving average, whereas an example of a high-pass filter is a differencing operation, i.e. $X_t - X_{t-1}$. The low-pass filter extracts the lower-frequency patterns while the high-pass filter extracts the higher-frequency patterns. Together these make up a band-pass filter, which allows you to extract and isolate both low- and high-frequency signals from your data without assuming the periodicity stays the same.
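The Haar wavelet is the simplest concrete instance of this pairing: its father (scaling) filter is a scaled two-point average and its mother (wavelet) filter is a scaled two-point difference. The sketch below applies one level of the Haar transform to a hypothetical signal and inverts it. Haar is chosen for illustration only and is not claimed to be the wavelet the project uses.

```python
import math

S = 1.0 / math.sqrt(2.0)
FATHER = (S, S)    # scaling (low-pass) filter: a scaled two-point average
MOTHER = (S, -S)   # wavelet (high-pass) filter: a scaled two-point difference


def haar_step(x):
    """One level of the Haar transform on an even-length sequence."""
    pairs = [(x[i], x[i + 1]) for i in range(0, len(x), 2)]
    approx = [FATHER[0] * a + FATHER[1] * b for a, b in pairs]
    detail = [MOTHER[0] * a + MOTHER[1] * b for a, b in pairs]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the original sequence from one level of coefficients."""
    x = []
    for a, d in zip(approx, detail):
        x.append(S * (a + d))
        x.append(S * (a - d))
    return x


# Hypothetical signal: a slow ramp plus a fast alternating component.
signal = [t / 4.0 + 0.5 * (-1) ** t for t in range(8)]

approx, detail = haar_step(signal)       # slow part / fast part
rebuilt = haar_inverse(approx, detail)   # the two parts together lose nothing

max_error = max(abs(a - b) for a, b in zip(signal, rebuilt))
```

The alternating component cancels inside each averaged pair, so `approx` carries only the ramp, while `detail` is dominated by the fast component.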

Given that these two facts hold:

  1. A father wavelet can be seen in a CNN: by definition you are performing a convolution, hence you are extracting low-frequency data.
  2. A mother wavelet can be seen in an LSTM-RNN by allowing functional combinations of previous lags through LSTM cells (a difference operator is just a linear combination).

Thus an LSTM-RNN-CNN architecture allows for an emergent property similar to a trainable wavelet function, in that you have both the mother and father wavelet, allowing you to extract frequency-domain information from the given time series.

Theoretical Implications

Now suppose that, for the U.S. market, financial time series are on average highly cyclical and trending upwards. Denote $P_t$ as the profit at time $t$, where $P_0 = 0$. Suppose we bought and held for the entire time; then

$$P_t = ct, \enspace c \in \R^+$$

Suppose instead you bought at $t=0$ and then sold and rebought every cycle. Let $\gamma$ be the difference between the selling and rebuying price. For simplicity, suppose the buy and hold strategy is represented by selling and immediately rebuying, i.e. $\gamma = 0$; then

$$P_t = ct + \gamma.$$

Maximum profit would come from selling at the maximum value of each cycle and buying at the minimum value of each cycle; however, any $\gamma > 0$ will beat the strategy of buying and holding. Hence, being informed about the periodic behavior of the underlying time series can allow you to capture more profit.
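The comparison can be sketched on a toy price path. Everything here is a simplifying assumption made for illustration: the price is a linear trend plus a single sine cycle, one share is traded at a time, there are no transaction costs, and the trader sees each peak and trough with perfect hindsight.

```python
import math

SLOPE, AMP, PERIOD, HORIZON = 0.1, 5.0, 20, 100  # hypothetical parameters

# Toy price path: upward trend plus a regular cycle.
prices = [
    100.0 + SLOPE * t + AMP * math.sin(2 * math.pi * t / PERIOD)
    for t in range(HORIZON + 1)
]

# Buy at t = 0 and hold one share until the end.
hold_profit = prices[-1] - prices[0]

# Same starting position, but sell at each local peak and rebuy at the next trough.
gamma = 0.0        # accumulated (selling price - rebuying price)
holding = True
last_sell = None
for t in range(1, HORIZON):
    is_peak = prices[t] > prices[t - 1] and prices[t] > prices[t + 1]
    is_trough = prices[t] < prices[t - 1] and prices[t] < prices[t + 1]
    if holding and is_peak:
        last_sell = prices[t]
        holding = False
    elif not holding and is_trough:
        gamma += last_sell - prices[t]
        holding = True

# If the path ends while out of the market, rebuy at the final price
# so both strategies end holding one share.
if not holding:
    gamma += last_sell - prices[-1]
    holding = True

cycle_profit = hold_profit + gamma
```

In this toy setting `cycle_profit` exceeds `hold_profit` by exactly the accumulated `gamma`, matching $P_t = ct + \gamma$; a real strategy would have to predict the turning points rather than observe them.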