Refinements of Barndorff-Nielsen and Shephard model: an analysis of crude oil price with machine learning

  • 2019-11-29 18:57:09
  • Indranil SenGupta, William Nganje, Erik Hanson


Indranil SenGupta (1), William Nganje (2), Erik Hanson (3)

(1) Associate Professor and Graduate Program Director, Department of Mathematics, North Dakota State University, Fargo, North Dakota, USA. Email: [email protected]
(2) Professor and Department Chair, Department of Agribusiness and Applied Economics, North Dakota State University, Fargo, North Dakota, USA.
(3) Assistant Professor, Department of Agribusiness and Applied Economics, North Dakota State University, Fargo, North Dakota, USA.

December 2, 2019
Abstract

A commonly used stochastic model for derivative and commodity market analysis is the Barndorff-Nielsen and Shephard (BN-S) model. Though this model is very efficient and analytically tractable, it suffers from the absence of long range dependence and many other issues. For this paper, the analysis is restricted to crude oil price dynamics. A simple way of improving the BN-S model with the implementation of various machine learning algorithms is proposed. This refined BN-S model is more efficient and has fewer parameters than other models which are used in practice as improvements of the BN-S model. The procedure and the model show the application of data science for extracting a “deterministic component” out of processes that are usually considered to be completely stochastic. Empirical applications validate the efficacy of the proposed model for long range dependence.

Key Words: Machine Learning, Deep Learning, Stochastic Model, Lévy Processes, Subordinator.

1 Introduction

One of the most prominent tools in modern big data analysis is machine learning. Machine learning is about extracting knowledge from a significantly large data set. The application of machine learning methods has recently become ubiquitous in everyday life. Machine learning has had a tremendous influence on the way data-driven research is done today. The tools can be applied to diverse scientific problems such as understanding stars, finding distant planets, discovering new particles, analyzing DNA sequences, and providing personalized cancer treatments.

In this paper, we apply machine learning to the analysis of crude oil price data. Crude oil data analysis is potentially a very important subject. In order to understand the data, we collect ten years of daily historical price data for crude oil. In the next step, we perform data wrangling, which largely deals with cleaning the data: all data that add no value to the business need are eliminated, and only the data required for addressing that need are kept. After the data cleaning, we conduct the exploratory data analysis, in which we look at basic statistics of the data such as its mean, median, and mode, and at correlations between the different labels. We also identify whether the data consist of numerical or categorical variables. This exploratory data analysis gives direction to the model building.

Many existing works are dedicated to the dynamics of crude oil prices. In [15], various econometric models used to forecast crude oil prices are summarized and interpreted. In [14], a deep learning model is applied to crude oil prices and a hybrid crude oil price forecasting model is provided. In [13], oil producers’ decisions in Cournot competitions are described through continuum dynamic mean field games. In related work (see [12]), a modified Hotelling’s rule for games with stochastic demand is discussed.

Paper [22] uses a convolutional neural network to forecast crude oil prices through online media text mining. Paper [2] discusses applications of the hierarchical conceptual model and the artificial neural networks-quantitative model to crude oil prices. In [27], denoising autoencoders and bootstrap aggregation are combined to forecast crude oil prices. Paper [17] evaluates the accuracy of machine learning support vector regression models for forecasting crude oil prices.

The application of machine learning to other financial data is also becoming more common. In [20], a machine learning algorithm is applied to state-contingent claims and stochastic discount factors in financial markets. In [21], a machine learning algorithm is implemented to determine whether bank-differentiating factors influence firm choices in initial public offerings. In [24], a multicriteria decision aid model is used in an attempt to replicate the credit ratings of Asian banks.

A commonly used stochastic model for derivative and commodity market analysis is the Barndorff-Nielsen and Shephard (BN-S) model (see [4, 6, 7, 8, 19, 26]). Though this model is very efficient and simple to use, it suffers from the absence of long range dependence and many other issues. In this paper, we propose a simple way of improving the BN-S model with the implementation of various machine learning algorithms. After that, we validate the performance of the model. We use staging data sets that are close to production and see how our model behaves; if it gives good results, the model is deployed. Finally, feedback is used to determine whether the model meets the business need for which it was built.

The organization of the paper is as follows. In Section 2, we briefly describe the BN-S model and why an improvement of this model is necessary for the analysis of crude oil price data. We find that the improvement of the model depends on machine learning analysis of the crude oil price data. The data analysis is provided in Section 3. A brief conclusion is provided in Section 4.

2 An improved Barndorff-Nielsen and Shephard model

Many models in recent literature try to capture the stochastic behavior of time series. For example, in the case of the BN-S model, the stock or commodity price $S=(S_t)_{t\geq 0}$ on some filtered probability space $(\Omega,\mathcal{F},(\mathcal{F}_t)_{0\leq t\leq T},\mathbb{P})$ is modeled by

$$S_t=S_0\exp(X_t), \tag{2.1}$$
$$dX_t=(\mu+\beta\sigma_t^2)\,dt+\sigma_t\,dW_t+\rho\,dZ_{\lambda t}, \tag{2.2}$$
$$d\sigma_t^2=-\lambda\sigma_t^2\,dt+dZ_{\lambda t},\quad \sigma_0^2>0, \tag{2.3}$$

where the parameters $\mu,\beta,\rho,\lambda\in\mathbb{R}$ with $\lambda>0$ and $\rho\leq 0$, and $r$ is the risk-free interest rate where a stock or commodity is traded up to a fixed horizon date $T$. In this model $W_t$ is a Brownian motion and the process $Z_t$ is a subordinator. Also $W_t$ and $Z_t$ are assumed to be independent and $(\mathcal{F}_t)$ is assumed to be the usual augmentation of the filtration generated by the pair $(W_t,Z_t)$.
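To make the dynamics (2.1)-(2.3) concrete, the following is a minimal simulation sketch, not the procedure used in this paper. It assumes an Euler discretization and takes the subordinator $Z$ to be a compound Poisson process with exponential jump sizes; that choice, the function name `simulate_bns`, and all default parameter values are illustrative assumptions.

```python
import math
import random


def simulate_bns(s0=60.0, sigma2_0=0.04, mu=0.0, beta=0.0, rho=-0.5,
                 lam=1.0, jump_rate=5.0, jump_mean=0.02,
                 horizon=1.0, n_steps=252, seed=0):
    """Euler scheme for (2.1)-(2.3); returns lists of prices and variances.

    Assumption: Z is a compound Poisson subordinator with arrival rate
    `jump_rate` and exponential jump sizes of mean `jump_mean`.
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    x, sigma2 = 0.0, sigma2_0
    prices, variances = [s0], [sigma2_0]
    for _ in range(n_steps):
        # Increment of Z over a time interval of length lam * dt, built
        # from exponential inter-arrival times of the Poisson process.
        dz = 0.0
        clock = rng.expovariate(jump_rate)
        while clock < lam * dt:
            dz += rng.expovariate(1.0 / jump_mean)
            clock += rng.expovariate(jump_rate)
        dw = rng.gauss(0.0, math.sqrt(dt))
        # (2.2): log-return increment, using the variance at the step start.
        x += (mu + beta * sigma2) * dt + math.sqrt(sigma2) * dw + rho * dz
        # (2.3): mean-reverting variance driven by the same jumps.
        sigma2 += -lam * sigma2 * dt + dz
        prices.append(s0 * math.exp(x))  # (2.1)
        variances.append(sigma2)
    return prices, variances


prices, variances = simulate_bns()
```

Because the same increment `dz` enters both (2.2) and (2.3), every jump in the variance coincides in time with a jump in the log-return, which is the feature questioned in the discussion that follows.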

However, the empirical data suggest that volatility ($\sigma_t$) usually fails to respond immediately to a sudden fluctuation of a stock or commodity price. The issue of the market's delayed response was raised in several papers (see [5, 11, 16]). Paper [3] addresses this issue with a delayed option price formula where the volatility has the form $\sigma(S_{t-b})$, for some delay parameter $b>0$.

However, the results and the theoretical framework are far from satisfactory. There are problems related to the above model:

  1. Empirical results show that the jumps in volatility and stock or commodity price are positively correlated. However, unlike what is suggested by the model, they may not occur at the same time.

  2. For empirical data, the delay parameter $b$ is not deterministic.

  3. The performance of the model varies considerably depending both on the length and the density of time in the observed time series. Slow convergence is essentially caused by high serial correlation between the latent variables and the parameters. The problem is particularly acute in the case of a sparsely observed time series, or any case in which the time series contains many data.

  4. The BN-S model does not incorporate the long range dependence property. The model fails significantly for a longer range of time. In some occasions, even for time spans as small as two weeks, the model is unable to consistently capture the essential features of the related time series.

Some of these problems are addressed in various recent works. For example, in [25], the author presents a generalized version of the BN-S model. Assuming $Z_t$ and $Z_t^*$ to be two independent Lévy subordinators, define

$$d\tilde{Z}_{\lambda t}=\rho'\,dZ_{\lambda t}+\sqrt{1-\rho'^2}\,dZ^*_{\lambda t}, \tag{2.4}$$

which is also a Lévy subordinator provided $0\leq\rho'\leq 1$. Thus, for $0<\rho'\leq 1$, $Z$ and $\tilde{Z}$ are positively correlated Lévy subordinators. Suppose the dynamics of $S_t$ are given by (2.1), (2.2), where $\sigma_t$ is given by

$$d\sigma_t^2=-\lambda\sigma_t^2\,dt+d\tilde{Z}_{\lambda t},\quad \sigma_0^2>0, \tag{2.5}$$

where $\tilde{Z}=(\tilde{Z}_{\lambda t})$ is a subordinator independent of $W$ but positively correlated with $Z$, as described above. In [25], it is shown that the generalized model given by (2.1), (2.2), and (2.5) has the flexibility to fit the option price and volatility in a correlated but different way, which is not possible for the classical BN-S model. This result is used for pricing vanilla options and developing theorems for parameter estimations of some particular variance processes.

The literature (see [4, 18]) shows that superpositions of Ornstein-Uhlenbeck (OU) type processes can be used to achieve long range dependence. A limiting procedure creates processes that are self-similar with stationary increments. However, paper [6] warns against fitting a large number of OU processes via a formal likelihood-based method. An alternative approach is to use heavy-tailed jump distributions in the model.

In this paper, we will address issues #2, #3, and #4 described above. We will show that for crude oil price dynamics, the jump is not completely stochastic. On the contrary, there is a deterministic element in crude oil price that can be implemented to apply the existing models for an extended period of time. We will show from an empirical analysis that the dynamics of $X_t$ in (2.2) can be more accurately written when we use a convex combination of two independent subordinators, $Z$ and $Z^{(b)}$, as:

$$dX_t=(\mu+\beta\sigma_t^2)\,dt+\sigma_t\,dW_t+\rho\left((1-\theta)\,dZ_{\lambda t}+\theta\,dZ^{(b)}_{\lambda t}\right), \tag{2.6}$$

where $\theta\in[0,1]$ is a deterministic parameter. We will use several machine learning algorithms to determine the value of $\theta$. The process $Z^{(b)}$ in (2.6) is a subordinator that has greater intensity than the subordinator $Z$. In this case (2.5) will be given by

$$d\sigma_t^2=-\lambda\sigma_t^2\,dt+(1-\theta')\,dZ_{\lambda t}+\theta'\,dZ^{(b)}_{\lambda t},\quad \sigma_0^2>0, \tag{2.7}$$

where, as before, $\theta'\in[0,1]$ is deterministic. For simplicity, we assume $\theta=\theta'$ for the rest of this paper.
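The role of $\theta$ in (2.6) and (2.7) can be illustrated with a single discretized update. The sketch below assumes an Euler step and a common value of $\theta$ in both equations, as stated above; the function names `mixed_jump` and `refined_step` and the default parameter values are hypothetical, and the increments `dw`, `dz`, and `dz_b` are assumed to be sampled elsewhere.

```python
import math


def mixed_jump(theta, dz, dz_b):
    """Convex combination (1 - theta) * dZ + theta * dZ^(b) from (2.6)-(2.7)."""
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    return (1.0 - theta) * dz + theta * dz_b


def refined_step(x, sigma2, dt, dw, dz, dz_b, theta,
                 mu=0.0, beta=0.0, rho=-0.5, lam=1.0):
    """One Euler step of (2.6) and (2.7); returns the updated (X, sigma^2)."""
    jump = mixed_jump(theta, dz, dz_b)
    x_new = x + (mu + beta * sigma2) * dt + math.sqrt(sigma2) * dw + rho * jump
    sigma2_new = sigma2 - lam * sigma2 * dt + jump
    return x_new, sigma2_new


# theta = 0 uses only Z (the classical model); theta = 1 uses only Z^(b).
calm = refined_step(0.0, 0.04, 0.01, 0.0, dz=0.1, dz_b=0.5, theta=0.0)
turbulent = refined_step(0.0, 0.04, 0.01, 0.0, dz=0.1, dz_b=0.5, theta=1.0)
```

With $\theta=0$ the step reduces to the classical dynamics (2.2) and (2.3), so the refinement changes only which jump increment drives the model.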

Theorem 2.1.

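If the jump measure associated with the subordinator $Z$ is $J_Z$, and $J(s)=\int_0^s\int_{\mathbb{R}^+}J_Z(\lambda\,d\tau,dy)$, then for the log-return of the classical BN-S model given by (2.1), (2.2), and (2.3),

$$\mathrm{Corr}(X_t,X_s)=\frac{\int_0^s\sigma_\tau^2\,d\tau+\rho^2J(s)}{\sqrt{\left(\int_0^t\sigma_\tau^2\,d\tau+t\rho^2\lambda\,\mathrm{Var}(Z_1)\right)\left(\int_0^s\sigma_\tau^2\,d\tau+s\rho^2\lambda\,\mathrm{Var}(Z_1)\right)}}, \tag{2.8}$$

for $t>s$.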

Proof.

Clearly, for $t>s$,

$$\mathrm{Cov}(X_t,X_s)=\int_0^s\sigma_\tau^2\,d\tau+\rho^2\int_0^s\int_{\mathbb{R}^+}J_Z(\lambda\,d\tau,dy).$$

Note that the instantaneous variance of the log-return is given by $(\sigma_t^2+\rho^2\lambda\,\mathrm{Var}(Z_1))\,dt$. Consequently we obtain (2.8). ∎

Note that for a fixed $s$, if $t$ increases, then $\mathrm{Corr}(X_t,X_s)$ quickly decreases. The proof of the following result is very similar to the proof of Theorem 2.1.
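The decay in $t$ can be made explicit in a simplified special case. As an illustration only, assume the variance is constant, $\sigma_\tau^2\equiv\sigma^2$ (an assumption made here for the example, not a property of the model), and write $c=\sigma^2+\rho^2\lambda\,\mathrm{Var}(Z_1)$. Then (2.8) becomes

$$\mathrm{Corr}(X_t,X_s)=\frac{s\sigma^2+\rho^2J(s)}{\sqrt{(tc)(sc)}}=\frac{s\sigma^2+\rho^2J(s)}{c\sqrt{s}}\cdot\frac{1}{\sqrt{t}},\qquad t>s.$$

For fixed $s$ the numerator does not depend on $t$, so under this assumption the correlation decays like $t^{-1/2}$: increasing $t$ by a factor of four halves the correlation.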

Theorem 2.2.

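If the jump measures associated with the subordinators $Z$ and $Z^{(b)}$ are $J_Z$ and $J_Z^{(b)}$ respectively, and $J(s)=\int_0^s\int_{\mathbb{R}^+}J_Z(\lambda\,d\tau,dy)$, $J^{(b)}(s)=\int_0^s\int_{\mathbb{R}^+}J_Z^{(b)}(\lambda\,d\tau,dy)$; then for the log-return of the improved BN-S model given by (2.1), (2.6), and (2.7),

$$\mathrm{Corr}(X_t,X_s)=\frac{\int_0^s\sigma_\tau^2\,d\tau+\rho^2(1-\theta)^2J(s)+\rho^2\theta^2J^{(b)}(s)}{\sqrt{\alpha(t)\,\alpha(s)}}, \tag{2.9}$$

for $t>s$, where $\alpha(\nu)=\int_0^\nu\sigma_\tau^2\,d\tau+\nu\rho^2\lambda\left((1-\theta)^2\,\mathrm{Var}(Z_1)+\theta^2\,\mathrm{Var}(Z_1^{(b)})\right)$.

Note that as $\theta$ is constantly adjusted, for a fixed $s$, the value of $t$ always has an upper limit. Consequently, $\mathrm{Corr}(X_t,X_s)$ never becomes “too small”. This is the major difference between the results in Theorem 2.1 and Theorem 2.2.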

The advantages of the refined BN-S model given by (2.1), (2.6), and (2.7) over the existing models are significant. First of all, this minor change incorporates long range dependence without changing the structure of the model. This model will be more efficient, but at the same time have many fewer parameters than the superposition models. Secondly, the performance of this model for a sparsely observed time series will be improved. Thirdly, an estimate of the delay parameter $b$ (mentioned in #2) can be obtained. Finally, and possibly most importantly, the procedure and the model show the application of data science for extracting a deterministic component out of processes that are thus far considered to be completely stochastic. For this paper, we restrict our analysis to crude oil price dynamics. However, this method can possibly be implemented for any compatible time series.

3 Data analysis

We consider crude oil price data over a period of 10 years. We use the West Texas Intermediate (WTI or NYMEX) crude oil price data set for the period June 1, 2009 to May 30, 2019 (Figure 1). There are a total of 2,530 available data points in this set. For convenience, we index the dates (for available data) from 0 (for June 1, 2009) to 2529 (for May 30, 2019). The following table (Table 1) summarizes various estimates for the data set.

Table 1: Properties of the empirical data set.

| | Daily Price Change | Daily Price Change % |
|---|---|---|
| Mean | -0.0047 | 0.01370 % |
| Median | 0.04399 | 0.06521 % |
| Maximum | 7.62 | 12.32 % |
| Minimum | -8.90 | -10.53 % |

Figure 1: Crude Oil close price.

We implement the following algorithm (Step 1 through Step 5) that creates a classification problem for the data set. For the data set:

  1. We conduct exploratory data analysis.

  2. We consider the daily close price for the historical oil price data. From the plots we identify a value of $K$ to define a “big jump” in the crude oil close price. We identify the dates for which the close price is $K$ “points” less than the close price of the previous day (for example, if $K=1\%$, we will find the dates for which the close price is 1% below the previous business day).

  3. We create a new data-frame from the old one where the “features” (columns) are seven consecutive close prices. For example, if the close prices are

     $a_1,a_2,a_3,a_4,a_5,a_6,a_7,a_8,a_9,a_{10},\dots$;

     then the first row of the data set will contain

     $a_1,a_2,a_3,a_4,a_5,a_6,a_7$;

     the second row of the data set will contain

     $a_2,a_3,a_4,a_5,a_6,a_7,a_8$;

     etc.

  4. We create a new target column for the new data-frame (as created in the preceding step) as follows: $\theta=1$ for each set of seven close prices that immediately precedes at least two jumps of size $K$ (or more) in the following seven days. Otherwise we label the target column by $\theta=0$.

     For example: suppose we identified $a_8$ and $a_{10}$ as “big jumps”. Then $\theta=1$ for the first row $a_1,a_2,a_3,a_4,a_5,a_6,a_7$.

  5. We run various classification algorithms from machine learning where the input is the close price for seven consecutive days, and the output is the $\theta$-value (0 or 1). We evaluate the classification report and confusion matrix in each case.

We will show that we can find θ with reasonable accuracy and use this for (2.6). The result can be improved by adjusting the value of K in Step 2. The result can be further improved by increasing the number of days (in Step 3) from seven to a higher number. It is worth noting that the various deep learning models provide a value of θ between 0 and 1. In Step 4, we approximate that by 0 or 1. However, the actual value of θ may be directly used in (2.6).
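Steps 2 through 4 can be sketched in a few lines. This is a minimal illustration rather than the code used for the reported results: the function name `build_dataset` and its parameters are hypothetical, a “big jump” is read as a relative drop of at least $K$ from the previous close, and rows without a full seven-day look-ahead are assumed to be dropped.

```python
def build_dataset(close, k=0.02, window=7, horizon=7, min_jumps=2):
    """Return (features, targets) for a list of daily close prices.

    features[i] holds `window` consecutive closes; targets[i] is 1 when at
    least `min_jumps` big jumps occur in the `horizon` days that follow.
    """
    # big_jump[i] is True when day i closes at least k below day i - 1.
    big_jump = [False] + [
        (today - prev) / prev <= -k for prev, today in zip(close, close[1:])
    ]
    features, targets = [], []
    for start in range(len(close) - window - horizon + 1):
        end = start + window
        features.append(list(close[start:end]))
        n_jumps = sum(big_jump[end:end + horizon])
        targets.append(1 if n_jumps >= min_jumps else 0)
    return features, targets


# Mirrors the example in Step 4: a_8 and a_10 are big jumps, so the first
# row (a_1, ..., a_7) is labelled theta = 1.
close = [100.0] * 7 + [97.0, 97.0, 94.0] + [94.0] * 5
features, targets = build_dataset(close)
```

Changing `k` or `window` corresponds to the adjustments of $K$ in Step 2 and of the number of days in Step 3 mentioned above.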

Figures 2, 3, and 4 provide various visualizations of crude oil close prices. Figures 5 and 6 provide a histogram of the daily price change and a histogram of daily percentage change, respectively. We partition this data set in various ways. For each partition we use a train-test-split, with respect to a given date. For the following analysis we use K=2%, i.e., θ=1 for the set of seven close prices that immediately precede at least two jumps of size 2% (or more) in the following seven days. Otherwise, we use θ=0.

Figure 2: Yearly boxplot for the close price.
Figure 3: Distribution plot for close price.
Figure 4: Bar chart for the close price.
Figure 5: Histogram for daily change in close price.
Figure 6: Histogram for daily change percentage in close price.

We run various supervised learning algorithms on the crude oil price data. We begin with the logistic regression (LR) and the random forest (RF) classification of the data set. It is well known that for the logistic regression classification, given a test input $X$, $\mathbb{P}(\theta=1\mid X)=\frac{1}{1+e^{-\beta_0-\beta_1X}}$, where the quantity $\beta_0$ and the vector $\beta_1$ are determined from the training set with the help of an appropriate log-likelihood function. The random forest classifier aggregates many decision trees, each built using a random sample of features. By randomly leaving out candidate features from each split, random forest decorrelates the trees, such that the averaging process can reduce the variance of the resulting model.
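As a small illustration of the logistic regression formula above, the sketch below evaluates $\mathbb{P}(\theta=1\mid X)$ for one row of seven close prices. The function name `prob_theta_one` and the coefficient values are placeholders chosen for the example; in practice $\beta_0$ and $\beta_1$ would come from fitting the training set.

```python
import math


def prob_theta_one(x, beta0, beta1):
    """Logistic probability 1 / (1 + exp(-(beta0 + beta1 . x)))."""
    z = beta0 + sum(b * xi for b, xi in zip(beta1, x))
    # Evaluate in a numerically stable way for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Placeholder coefficients, not fitted values.
row = [60.1, 59.8, 60.4, 61.0, 60.2, 59.5, 58.9]
p = prob_theta_one(row, beta0=0.5, beta1=[-0.01] * 7)
```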

After that, we implement various deep learning techniques:

  (A) A neural network with two hidden layers (with activations tanh and ReLU) and an output layer (with a softmax activation function). For simplicity we approximate $\theta$ in (2.6) with 0 (“no big jump”) and 1 (“big jump”). For this approximation, we take $\theta=1$ if the output probability for the softmax activation function corresponding to $\theta=1$ is more than 0.3.

  (B) Long short-term memory (LSTM) along with the neural network described in (A). LSTM is an artificial recurrent neural network (RNN) architecture that is implemented to avoid the vanishing gradient problem. The vanishing gradient problem is especially prominent when a vanilla RNN, constructed from regular neural network nodes, is implemented to model dependencies between time series values that are separated by a significant number of days. LSTM has in-built feedback connections that make it appropriately implementable for a financial time series. A common LSTM unit is composed of a cell, an input gate, an output gate, and a forget gate. The cell retains values over arbitrary time intervals and the three gates regulate the flow of information into and out of the cell.

  (C) LSTM along with a batch normalizer (BN) and the neural network described in (A). A batch normalizer standardizes and rescales the output of a given layer in the deep network. To increase the stability of a neural network, batch normalization normalizes the output of a previous activation layer by subtracting the batch mean and dividing by the batch standard deviation. It also reduces the amount by which the hidden unit values shift around (i.e., the covariate shift). This process centers all the inputs around zero, so that each layer's input changes little during training. Consequently, layers in the network can learn from the back-propagation simultaneously, without waiting for the previous layer to learn. This speeds up the training of networks.
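Two of the operations described in (A) and (C) are simple enough to write out directly. The sketch below is a plain-Python illustration under stated assumptions, not the network used for the experiments: `theta_from_logits` applies the 0.3 threshold to the softmax probability of the class $\theta=1$, and `batch_normalize` standardizes one batch of values, with `gamma` and `beta` standing in for the rescaling parameters that a real network would learn. All function and parameter names are hypothetical.

```python
import math


def softmax(logits):
    """Softmax probabilities, shifted by the maximum for numerical stability."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def theta_from_logits(logits, threshold=0.3):
    """Map two output logits (theta=0, theta=1) to a 0/1 label."""
    _, p_one = softmax(logits)
    return 1 if p_one > threshold else 0


def batch_normalize(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Subtract the batch mean, divide by the batch std, then rescale."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]


# A probability of about 0.38 for theta = 1 is below 0.5 but above the
# 0.3 threshold, so this output is labelled as a "big jump".
label = theta_from_logits([0.5, 0.0])
```

Choosing a threshold below 0.5 trades precision for recall on the $\theta=1$ class, which is typically the rarer class in the data.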

For the following tables (Table 2 through Table 13), we provide classification reports for various machine learning algorithms. For the testing data, true positive, true negative, false positive, and false negative are denoted as TP, TN, FP, and FN, respectively. The following measurements are standard:

$$\text{precision}=\frac{TP}{TP+FP},\qquad \text{recall}=\frac{TP}{TP+FN}.$$

The f1-score gives the harmonic mean of precision and recall. The scores corresponding to each class give the accuracy of the classifier in classifying the data points in that particular class compared to all other classes. The support is the number of samples of the true response that lie in that class.
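The three scores can be computed directly from the confusion-matrix counts. The following sketch uses hypothetical counts chosen only for illustration; returning 0.0 when a denominator vanishes (for example, when a classifier never predicts $\theta=1$) is a convention assumed here.

```python
def classification_metrics(tp, fp, fn):
    """Return (precision, recall, f1) from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    # Harmonic mean of precision and recall.
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for the class theta = 1.
precision, recall, f1 = classification_metrics(tp=16, fp=12, fn=28)
```

The support of a class is `tp + fn`, which is 44 for the hypothetical counts above.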

Table 2: Various estimations for training date (index): October 21, 2009 (100) to May 24, 2011 (500); and testing date (index): May 25, 2011 (501) to October 14, 2011 (600).

| | LR | RF | Neural Network (A) | LSTM (B) | BN (C) |
|---|---|---|---|---|---|
| precision (θ=0) | 0.56 | 0.57 | 0.56 | 0.62 | 0.56 |
| recall (θ=0) | 0.96 | 0.91 | 0.70 | 0.79 | 0.65 |
| f1-score (θ=0) | 0.71 | 0.70 | 0.62 | 0.69 | 0.60 |
| support (θ=0) | 57 | 57 | 57 | 57 | 57 |
| precision (θ=1) | 0.00 | 0.50 | 0.41 | 0.57 | 0.43 |
| recall (θ=1) | 0.00 | 0.11 | 0.27 | 0.36 | 0.34 |
| f1-score (θ=1) | 0.00 | 0.19 | 0.33 | 0.44 | 0.38 |
| support (θ=1) | 44 | 44 | 44 | 44 | 44 |

Table 3: Various estimations for training date (index): October 21, 2009 (100) to October 14, 2011 (600); and testing date (index): October 17, 2011 (601) to August 1, 2012 (800).

| | LR | RF | Neural Network (A) | LSTM (B) | BN (C) |
|---|---|---|---|---|---|
| precision (θ=0) | 0.83 | 0.83 | 0.83 | 0.83 | 0.81 |
| recall (θ=0) | 0.99 | 0.91 | 0.54 | 0.62 | 0.62 |
| f1-score (θ=0) | 0.91 | 0.87 | 0.65 | 0.71 | 0.70 |
| support (θ=0) | 168 | 168 | 168 | 168 | 168 |
| precision (θ=1) | 0.00 | 0.12 | 0.15 | 0.16 | 0.11 |
| recall (θ=1) | 0.00 | 0.06 | 0.42 | 0.36 | 0.24 |
| f1-score (θ=1) | 0.00 | 0.08 | 0.23 | 0.22 | 0.15 |
| support (θ=1) | 33 | 33 | 33 | 33 | 33 |

Table 4: Various estimations for training date (index): August 9, 2010 (300) to August 1, 2012 (800); and testing date (index): August 2, 2012 (801) to May 17, 2013 (1000).

| | LR | RF | Neural Network (A) | LSTM (B) | BN (C) |
|---|---|---|---|---|---|
| precision (θ=0) | 0.92 | 0.92 | 0.91 | 0.91 | 0.92 |
| recall (θ=0) | 1.00 | 0.92 | 0.58 | 0.58 | 0.58 |
| f1-score (θ=0) | 0.96 | 0.92 | 0.71 | 0.71 | 0.71 |
| support (θ=0) | 185 | 185 | 185 | 185 | 185 |
| precision (θ=1) | 0.00 | 0.07 | 0.06 | 0.07 | 0.07 |
| recall (θ=1) | 0.00 | 0.06 | 0.31 | 0.38 | 0.38 |
| f1-score (θ=1) | 0.00 | 0.06 | 0.10 | 0.12 | 0.12 |
| support (θ=1) | 16 | 16 | 16 | 16 | 16 |

Table 5: Various estimations for training date (index): May 17, 2013 (1000) to December 17, 2014 (1400); and testing date (index): December 18, 2014 (1401) to May 13, 2015 (1500).

| | LR | RF | Neural Network (A) | LSTM (B) | BN (C) |
|---|---|---|---|---|---|
| precision (θ=0) | 0.48 | 0.48 | 0.47 | 0.48 | 0.50 |
| recall (θ=0) | 1.00 | 1.00 | 0.96 | 1.00 | 0.98 |
| f1-score (θ=0) | 0.64 | 0.65 | 0.63 | 0.65 | 0.66 |
| support (θ=0) | 48 | 48 | 48 | 48 | 48 |
| precision (θ=1) | 0.00 | 1.00 | 0.50 | 1.00 | 0.86 |
| recall (θ=1) | 0.00 | 0.02 | 0.04 | 0.04 | 0.11 |
| f1-score (θ=1) | 0.00 | 0.04 | 0.07 | 0.07 | 0.20 |
| support (θ=1) | 53 | 53 | 53 | 53 | 53 |

Table 6: Various estimations for training date (index): March 5, 2014 (1200) to May 13, 2015 (1500); and testing date (index): May 14, 2015 (1501) to October 5, 2015 (1600).

| | LR | RF | Neural Network (A) | LSTM (B) | BN (C) |
|---|---|---|---|---|---|
| precision (θ=0) | 0.45 | 0.48 | 0.52 | 0.43 | 0.48 |
| recall (θ=0) | 0.96 | 0.94 | 0.62 | 0.45 | 0.83 |
| f1-score (θ=0) | 0.62 | 0.64 | 0.56 | 0.44 | 0.61 |
| support (θ=0) | 47 | 47 | 47 | 47 | 47 |
| precision (θ=1) | 0.00 | 0.70 | 0.60 | 0.50 | 0.60 |
| recall (θ=1) | 0.00 | 0.13 | 0.50 | 0.48 | 0.22 |
| f1-score (θ=1) | 0.00 | 0.22 | 0.55 | 0.49 | 0.32 |
| support (θ=1) | 54 | 54 | 54 | 54 | 54 |

Table 7: Various estimations for training date (index): July 28, 2014 (1300) to February 29, 2016 (1700); and testing date (index): March 1, 2016 (1701) to November 29, 2016 (1900).

| | LR | RF | Neural Network (A) | LSTM (B) | BN (C) |
|---|---|---|---|---|---|
| precision (θ=0) | 0.56 | 0.51 | 0.54 | 0.61 | 0.50 |
| recall (θ=0) | 0.17 | 0.59 | 0.12 | 0.10 | 0.06 |
| f1-score (θ=0) | 0.26 | 0.55 | 0.20 | 0.17 | 0.11 |
| support (θ=0) | 114 | 114 | 114 | 114 | 114 |
| precision (θ=1) | 0.43 | 0.33 | 0.43 | 0.44 | 0.43 |
| recall (θ=1) | 0.83 | 0.26 | 0.86 | 0.92 | 0.92 |
| f1-score (θ=1) | 0.57 | 0.29 | 0.57 | 0.59 | 0.58 |
| support (θ=1) | 87 | 87 | 87 | 87 | 87 |

Table 8: Various estimations for training date (index): July 28, 2014 (1300) to July 12, 2016 (1800); and testing date (index): July 13, 2016 (1801) to April 21, 2017 (2000).

| | LR | RF | Neural Network (A) | LSTM (B) | BN (C) |
|---|---|---|---|---|---|
| precision (θ=0) | 0.64 | 0.66 | 0.73 | 0.69 | 0.69 |
| recall (θ=0) | 0.60 | 0.71 | 0.26 | 0.25 | 0.18 |
| f1-score (θ=0) | 0.62 | 0.68 | 0.38 | 0.37 | 0.29 |
| support (θ=0) | 136 | 136 | 136 | 136 | 136 |
| precision (θ=1) | 0.26 | 0.26 | 0.34 | 0.33 | 0.33 |
| recall (θ=1) | 0.29 | 0.22 | 0.80 | 0.77 | 0.83 |
| f1-score (θ=1) | 0.28 | 0.24 | 0.48 | 0.46 | 0.47 |
| support (θ=1) | 65 | 65 | 65 | 65 | 65 |

Table 9: Various estimations for training date (index): May 13, 2015 (1500) to April 21, 2017 (2000); and testing date (index): April 24, 2017 (2001) to September 8, 2017 (2100).

| | LR | RF | Neural Network (A) | LSTM (B) | BN (C) |
|---|---|---|---|---|---|
| precision (θ=0) | 0.75 | 0.77 | 0.65 | 0.81 | 0.72 |
| recall (θ=0) | 1.00 | 0.82 | 0.22 | 0.38 | 0.34 |
| f1-score (θ=0) | 0.86 | 0.79 | 0.33 | 0.52 | 0.46 |
| support (θ=0) | 76 | 76 | 76 | 76 | 76 |
| precision (θ=1) | 0.00 | 0.30 | 0.21 | 0.28 | 0.23 |
| recall (θ=1) | 0.00 | 0.24 | 0.64 | 0.72 | 0.60 |
| f1-score (θ=1) | 0.00 | 0.27 | 0.32 | 0.40 | 0.33 |
| support (θ=1) | 25 | 25 | 25 | 25 | 25 |

Table 10: Various estimations for training date (index): October 5, 2015 (1600) to September 8, 2017 (2100); and testing date (index): September 11, 2017 (2101) to February 1, 2018 (2200).

| | LR | RF | Neural Network (A) | LSTM (B) | BN (C) |
|---|---|---|---|---|---|
| precision (θ=0) | 0.92 | 0.93 | 0.95 | 0.92 | 0.93 |
| recall (θ=0) | 1.00 | 0.96 | 0.39 | 0.76 | 0.67 |
| f1-score (θ=0) | 0.96 | 0.94 | 0.55 | 0.84 | 0.78 |
| support (θ=0) | 93 | 93 | 93 | 93 | 93 |
| precision (θ=1) | 0.00 | 0.20 | 0.10 | 0.08 | 0.09 |
| recall (θ=1) | 0.00 | 0.12 | 0.75 | 0.25 | 0.38 |
| f1-score (θ=1) | 0.00 | 0.15 | 0.17 | 0.12 | 0.14 |
| support (θ=1) | 8 | 8 | 8 | 8 | 8 |

Table 11: Various estimations for training date (index): February 29, 2016 (1700) to February 1, 2018 (2200); and testing date (index): February 2, 2018 (2201) to June 26, 2018 (2300).

| | LR | RF | Neural Network (A) | LSTM (B) | BN (C) |
|---|---|---|---|---|---|
| precision (θ=0) | 0.94 | 0.93 | 0.96 | 0.95 | 0.94 |
| recall (θ=0) | 1.00 | 0.84 | 0.67 | 0.56 | 0.67 |
| f1-score (θ=0) | 0.97 | 0.88 | 0.79 | 0.70 | 0.79 |
| support (θ=0) | 95 | 95 | 95 | 95 | 95 |
| precision (θ=1) | 0.00 | 0.00 | 0.09 | 0.07 | 0.06 |
| recall (θ=1) | 0.00 | 0.00 | 0.50 | 0.50 | 0.33 |
| f1-score (θ=1) | 0.00 | 0.00 | 0.15 | 0.12 | 0.10 |
| support (θ=1) | 6 | 6 | 6 | 6 | 6 |

Table 12: Various estimations for training date (index): July 12, 2016 (1800) to June 26, 2018 (2300); and testing date (index): June 27, 2018 (2301) to November 14, 2018 (2400).

| | LR | RF | Neural Network (A) | LSTM (B) | BN (C) |
|---|---|---|---|---|---|
| precision (θ=0) | 0.74 | 0.76 | 0.75 | 0.76 | 0.78 |
| recall (θ=0) | 1.00 | 0.99 | 0.99 | 0.87 | 0.79 |
| f1-score (θ=0) | 0.85 | 0.86 | 0.85 | 0.81 | 0.78 |
| support (θ=0) | 75 | 75 | 75 | 75 | 75 |
| precision (θ=1) | 0.00 | 0.67 | 0.50 | 0.38 | 0.36 |
| recall (θ=1) | 0.00 | 0.08 | 0.04 | 0.23 | 0.35 |
| f1-score (θ=1) | 0.00 | 0.14 | 0.07 | 0.29 | 0.35 |
| support (θ=1) | 26 | 26 | 26 | 26 | 26 |

Table 13: Various estimations for training date (index): July 12, 2016 (1800) to June 26, 2018 (2300); and testing date (index): June 27, 2018 (2301) to April 10, 2019 (2500).

| | LR | RF | Neural Network (A) | LSTM (B) | BN (C) |
|---|---|---|---|---|---|
| precision (θ=0) | 0.77 | 0.78 | 0.77 | 0.79 | 0.83 |
| recall (θ=0) | 1.00 | 0.96 | 0.92 | 0.92 | 0.75 |
| f1-score (θ=0) | 0.87 | 0.86 | 0.84 | 0.85 | 0.79 |
| support (θ=0) | 154 | 154 | 154 | 154 | 154 |
| precision (θ=1) | 0.00 | 0.45 | 0.32 | 0.43 | 0.38 |
| recall (θ=1) | 0.00 | 0.11 | 0.13 | 0.21 | 0.49 |
| f1-score (θ=1) | 0.00 | 0.17 | 0.18 | 0.29 | 0.43 |
| support (θ=1) | 47 | 47 | 47 | 47 | 47 |


To make the BN-S model implementable for a long range, it is clear that a single Lévy subordinator is not effective. If a large fluctuation in the future can be anticipated from the historical data (i.e., $\theta=1$) with the help of machine learning algorithms, we can “switch” the initial Lévy subordinator ($Z$) to the more intense Lévy subordinator ($Z^{(b)}$) that corresponds to larger fluctuations. On the other hand, if no big fluctuation in the future can be anticipated from the historical data (i.e., $\theta=0$), we can “switch” the Lévy subordinator $Z^{(b)}$ to $Z$. In this way, a single equation (2.6) can be used to describe the crude oil dynamics even for a longer time period.

It is clear from the various tables that the logistic regression is less efficient in detecting future big jumps (θ=1) based on the historical data. For most of the cases the neural network technique (A), LSTM (B), or the LSTM with a batch normalizer (C), work better than the random forest classifier. Also, if the algorithms are trained on more data points, the predictions for θ=1 are better. To keep the model simple, only two hidden layers are used. The results improve if the number of hidden layers is increased. Also, note that the softmax activation function in the output layers for (A), (B), or (C), in fact provides probabilities for θ=0 and θ=1. With appropriate scaling those probabilities can be used in lieu of (1-θ) and θ in (2.6).

Once we have a good estimate of the value of $\theta$, we can substitute it into (2.6). That leads to one of two options: (1) if the initial description of the BN-S dynamics incorporates $Z$ (or $Z^{(b)}$) as the Lévy subordinator and $\theta=0$ is established, we continue with (or update to) the subordinator $Z$; (2) if the initial description of the BN-S dynamics incorporates $Z$ (or $Z^{(b)}$) as the Lévy subordinator and $\theta=1$ is established, we update to (or continue with) the subordinator $Z^{(b)}$. The machine learning algorithms can be run dynamically in order to continue with or update the background driving Lévy process in the BN-S model.

As a result, the analysis shows that for crude oil price dynamics, the jump is not completely stochastic. There is a deterministic element ($\theta$) in it that can be implemented to apply the existing models for an extended period of time. Thus the new model incorporates long range dependence without changing the tractability of the model. This model is more efficient, but at the same time has many fewer parameters than the superposition models.

4 Conclusion

We observe that a classical BN-S model may not appropriately represent crude oil price dynamics. In this paper, we implement various machine learning algorithms to determine the possibility of an upcoming large fluctuation in the crude oil price. Once those possibilities are obtained, the classical BN-S model is modified (or not, depending on the obtained possibilities) with respect to its background driving Lévy subordinator. This modification enables long range dependence in the new model without significantly changing the model. Also, this modification incorporates only one extra parameter (i.e., θ) compared to the classical model. It is shown in this paper that the parameter θ is deterministic and can be obtained from the empirical data using various machine learning techniques.

In this paper we apply machine learning algorithms to the empirical data in order to improve the mathematical model for commodity price dynamics. In a sequel to this work, we plan to implement this analysis for other financial time series. Also, we observe that the stochastic equation related to the volatility dynamics does not play a crucial role in the present analysis. The situation will be different and improved if it can be appropriately analyzed for an empirical data set.

References

  • [2] Abdullah S. N. & Zeng X. (2010), Machine learning approach for crude oil price prediction with Artificial Neural Networks-Quantitative (ANN-Q) model, The 2010 International Joint Conference on Neural Networks (IJCNN), doi: 10.1109/IJCNN.2010.5596602
  • [3] Arriojas M., Hu Y., Mohammed S-E. & Pap G. (2007), A Delayed Black and Scholes Formula, Stoch Anal Appl., 25, 471–492.
  • [4] Barndorff-Nielsen O. E. (2001), Superposition of Ornstein-Uhlenbeck Type Processes, Theory Probab. Appl., 45, 175-194.
  • [5] Bernard V. & Thomas J. (1989), Post-earnings-announcement drift: delayed price response or risk premium?, J. Account. Res., 27, 1-36.
  • [6] Barndorff-Nielsen O. E. & Shephard N. (2001), Non-Gaussian Ornstein-Uhlenbeck-based models and some of their uses in financial economics, J. R. Stat. Soc. Ser. B Stat. Methodol., 63, 167-241.
  • [7] Barndorff-Nielsen O. E. & Shephard N. (2001), Modelling by Lévy Processes for Financial Econometrics, In Lévy Processes : Theory and Applications (eds O. E. Barndorff-Nielsen, T. Mikosch & S. Resnick), 283-318, Birkhäuser.
  • [8] Barndorff-Nielsen O. E., Jensen J. L. & Sørensen M. (1998), Some stationary processes in discrete and continuous time, Adv. in Appl. Probab., 30, 989-1007.
  • [9] Benth F. E., Karlsen K. H. & Reikvam K. (2003), Merton’s portfolio optimization problem in a Black and Scholes market with non-Gaussian stochastic volatility of Ornstein-Uhlenbeck type, Math. Finance, 13, 215-244.
  • [10] Black F. & Scholes M. (1973), The pricing of options and corporate liabilities, J. Political Econ., 81, 637-659.
  • [11] Booth G., Kallunki J., & Martikainen T. (1997), Delayed price response to the announcements of earnings and its components in Finland, European Account. Rev., 6, 377-392.
  • [12] Brown I., Funk J., & Sircar R. (2017), Oil Prices & Dynamic Games Under Stochastic Demand, Available at SSRN: https://ssrn.com/abstract=3047390 or http://dx.doi.org/10.2139/ssrn.3047390.
  • [13] Chan P. & Sircar R. (2017), Fracking, Renewables, and Mean Field Games, SIAM Review, 59(3), 588-615.
  • [14] Chen Y., Kaijian H. & Tso G. K.F. (2017), Forecasting Crude Oil Prices: a Deep Learning based Model, Procedia Computer Science, 122, 300-307.
  • [15] Frey G., Manera M., Markandya A., & Scarpa E. (2009), Econometric Models for Oil Price Forecasting: A Critical Survey, CESifo Forum, ifo Institute - Leibniz Institute for Economic Research at the University of Munich, 10(1), 29-44.
  • [16] Grinblatt M. & Keloharju M. (2001), What makes investors trade?, J. Finance, 56, 589-616.
  • [17] He X. J. (2018), Crude Oil Prices Forecasting: Time Series vs. SVR Models, Journal of International Technology and Information Management, 27(2), 25-42.
  • [18] Habtemicael, S., Ghebremichael, M., & SenGupta, I. (2019), Volatility and Variance Swap Using Superposition of the Barndorff-Nielsen and Shephard type Lévy Processes, To appear in Sankhya B, https://doi.org/10.1007/s13571-017-0145-y.
  • [19] Issaka, A. & SenGupta, I. (2017), Analysis of variance based instruments for Ornstein–Uhlenbeck type models: swap and price index, Annals of Finance, 13(4), 401-434.
  • [20] Jiang, J. & Tian, W. (2018), Semi-nonparametric approximation and index options, Annals of Finance, in press, https://doi.org/10.1007/s10436-018-0341-4.
  • [21] Kulkarni, K.S. & Sabarwal, T. (2017), To what extent are investment bank-differentiating factors relevant for firms floating moderate-sized IPOs?, Annals of Finance, 3 (3), 297–327.
  • [22] Li X., Shang W., & Wang S. (2018), Text-based crude oil price forecasting: A deep learning approach, International Journal of Forecasting, Available online 30 October 2018, In Press.
  • [23] Nicolato E. & Venardos E. (2003), Option Pricing in Stochastic Volatility Models of the Ornstein-Uhlenbeck type, Math. Finance, 13, 445-466.
  • [24] Pasiouras, F., Gaganis, C. & Doumpos, M. (2007), A multicriteria discrimination approach for the credit rating of Asian banks, Annals of Finance, 3(3), 351-367.
  • [25] SenGupta I. (2016), Generalized BN-S stochastic volatility model for option pricing, International Journal of Theoretical and Applied Finance, 19(02), 1650014 (23 pages).
  • [26] SenGupta I., Wilson W., & Nganje W. (2019), Barndorff-Nielsen and Shephard model: oil hedging with variance swap and option, Mathematics and Financial Economics, 13(2), 209-226.
  • [27] Zhao Y., Li J., & Yu, L. (2017), A deep learning ensemble approach for crude oil price forecasting, Energy Economics, 66(C), 9-16.