
DBN based SD-ARX model for nonlinear time series prediction and analysis


Abstract

One of the main purposes of nonlinear system modeling is to design model-based controllers such as model predictive control (MPC). In this paper, a group of deep belief networks (DBNs) is used to approximate the function-type coefficients of a state-dependent autoregressive model with exogenous variables (SD-ARX), which can represent nonlinear dynamics; the result is a DBN-based state-dependent ARX (DBN-ARX) model. The DBN-ARX model combines the function-approximation ability of a single DBN with the nonlinear description capability of the SD-ARX model. All parameters of the DBN-ARX model are estimated by pre-training and fine-tuning strategies, and the stability condition of the model is also discussed. The proposed DBN-ARX model is a pseudo-linear ARX model identified offline, whose function-type coefficients are composed of operating-point-dependent DBNs. The usefulness of the DBN-ARX model is illustrated by modeling a continuously stirred tank reactor (CSTR) time series, the Box and Jenkins data, a nonlinear process and a water tank system. The four experiments show that the one-step-ahead and multi-step-ahead prediction accuracy of the proposed DBN-ARX model is improved compared with several existing models.




Acknowledgments

The authors would like to thank the editors and referees for their valuable comments and suggestions, which substantially improved the original manuscript. The work presented in this paper was supported by the National Natural Science Foundation of China (Grant Nos. 61773402, 51575167 and 61540037) and the Anhui Provincial Natural Science Foundation (2008085MF197).

Author information

Corresponding author

Correspondence to Hui Peng.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Fine-tuning process of the proposed DBN-ARX model

The objective function used in the fine-tuning stage is based on Eq. (13) and is defined as follows

$$ E(t)=\frac{1}{2}{\xi}^2(t)=\frac{1}{2}{\left(y(t)-\hat{y}(t)\right)}^2=\frac{1}{2}{\left(y(t)-{\hat{\phi}}_0\left(\varGamma \left(t-1\right)\right)-\sum \limits_{m=1}^{n_y}{\hat{\phi}}_{y,m}\left(\varGamma \left(t-1\right)\right)y\left(t-m\right)-\sum \limits_{n=1}^{n_u}{\hat{\phi}}_{u,n}\left(\varGamma \left(t-1\right)\right)u\left(t-n\right)\right)}^2,t={n}_y+1,{n}_y+2,\cdots, N $$
(22)

where y(t) and \( \hat{y}(t) \) are actual and predicted values, respectively.
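To fix ideas, the following minimal Python sketch evaluates the pseudo-linear DBN-ARX one-step prediction and the objective of Eq. (22). It is an illustration under assumed names (`dbn_coeffs`, `gamma`, etc.), not the authors' implementation; each coefficient is produced by a DBN module evaluated at the operating-point vector Γ(t−1).

```python
def dbn_arx_predict(dbn_coeffs, gamma, y_hist, u_hist):
    """One-step-ahead DBN-ARX prediction (illustrative sketch).

    dbn_coeffs : list of callables; dbn_coeffs[j] maps the operating-point
                 vector Gamma(t-1) to the scalar coefficient phi_j, with
                 j = 0 the constant term, j = 1..n_y the autoregressive
                 coefficients and j = n_y+1..n_y+n_u the input coefficients.
    y_hist[m-1] = y(t-m), u_hist[n-1] = u(t-n).
    """
    n_y, n_u = len(y_hist), len(u_hist)
    y_hat = dbn_coeffs[0](gamma)                       # phi_0(Gamma(t-1))
    for m in range(1, n_y + 1):                        # autoregressive part
        y_hat += dbn_coeffs[m](gamma) * y_hist[m - 1]
    for n in range(1, n_u + 1):                        # exogenous-input part
        y_hat += dbn_coeffs[n_y + n](gamma) * u_hist[n - 1]
    return y_hat

def objective(y_t, y_hat_t):
    """E(t) = 0.5 * (y(t) - y_hat(t))^2, cf. Eq. (22)."""
    return 0.5 * (y_t - y_hat_t) ** 2
```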

The traditional gradient descent method is used to fine-tune the parameters of the DBN-ARX model. For the output neuron in the \( {N}_r^{(j)} \)-th layer, Eq. (23) is used to compute the gradient for parameter updating.

$$ \frac{\partial E(t)}{\partial {w}_{1,{n}_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}\right)}}=\frac{\partial E(t)}{\partial \xi (t)}\frac{\partial \xi (t)}{\partial \hat{y}(t)}\frac{\partial \hat{y}(t)}{\partial {\hat{\phi}}_j(t)}\frac{\partial {\hat{\phi}}_j(t)}{\partial {u}_{1,j}^{\left({N}_r^{(j)}\right)}(t)}\frac{\partial {u}_{1,j}^{\left({N}_r^{(j)}\right)}(t)}{\partial {w}_{1,{n}_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}\right)}}\kern4em =-\xi (t)a\left(t-j\right){\varphi}^{\prime}\left({u}_{1,j}^{\left({N}_r^{(j)}\right)}(t)\right){h}_{n_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}-1\right)}(t)={\varphi}^{\prime}\left({u}_{1,j}^{\left({N}_r^{(j)}\right)}(t)\right)c\left(t-j\right){h}_{n_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}-1\right)}(t) $$
(23)

where

$$ {\displaystyle \begin{array}{c}{\hat{\phi}}_0(t)={\hat{\phi}}_0\left(\varGamma \left(t-1\right)\right);\\ {}{\hat{\phi}}_1(t)={\hat{\phi}}_{y,1}\left(\varGamma \left(t-1\right)\right);\\ {}\vdots \\ {}{\hat{\phi}}_{n_y}(t)={\hat{\phi}}_{y,{n}_y}\left(\varGamma \left(t-1\right)\right);\\ {}{\hat{\phi}}_{n_y+1}(t)={\hat{\phi}}_{u,1}\left(\varGamma \left(t-1\right)\right);\\ {}\vdots \\ {}{\hat{\phi}}_{n_y+{n}_u}(t)={\hat{\phi}}_{u,{n}_u}\left(\varGamma \left(t-1\right)\right);\end{array}} $$
$$ a(t)=1;\kern1em a\left(t-j\right)=y\left(t-j\right),j=1,2,\cdots, {n}_y;\kern1em a\left(t-j\right)=u\left(t-j+{n}_y\right),j={n}_y+1,\cdots, {n}_y+{n}_u $$
$$ c\left(t-j\right)=-\xi (t)a\left(t-j\right),j=0,1,2,\cdots, \left({n}_y+{n}_u\right) $$

and φ′(u) is the derivative of φ(u) with respect to u. Let

$$ {\delta}_{1,j}^{\left({N}_r^{(j)}\right)}(t)={\varphi}^{\prime}\left({u}_{1,j}^{\left({N}_r^{(j)}\right)}\right)c\left(t-j\right) $$
(24)

According to the chain rule of calculus, the gradient with respect to the last-layer weight of the j-th DBN module can be rewritten as

$$ \frac{\partial E(t)}{\partial {w}_{1,{n}_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}\right)}}={\delta}_{1,j}^{\left({N}_r^{(j)}\right)}(t){h}_{n_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}-1\right)}(t) $$
(25)

Similarly, we have

$$ \frac{\partial E(t)}{\partial {b}_{1,j}^{\left({N}_r^{(j)}\right)}}=\frac{\partial E(t)}{\partial \xi (t)}\frac{\partial \xi (t)}{\partial \hat{y}(t)}\frac{\partial \hat{y}(t)}{\partial {\hat{\phi}}_j(t)}\frac{\partial {\hat{\phi}}_j(t)}{\partial {u}_{1,j}^{\left({N}_r^{(j)}\right)}(t)}\frac{\partial {u}_{1,j}^{\left({N}_r^{(j)}\right)}(t)}{\partial {b}_{1,j}^{\left({N}_r^{(j)}\right)}}\kern2.75em =-\xi (t)a\left(t-j\right){\varphi}^{\prime}\left({u}_{1,j}^{\left({N}_r^{(j)}\right)}(t)\right)={\varphi}^{\prime}\left({u}_{1,j}^{\left({N}_r^{(j)}\right)}\right)c\left(t-j\right)={\delta}_{1,j}^{\left({N}_r^{(j)}\right)}(t) $$
(26)
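For concreteness, here is a minimal NumPy sketch of the top-layer gradient computations of Eqs. (23)–(26). The function and argument names are assumptions for illustration, not the paper's code.

```python
import numpy as np

def output_layer_grads(xi_t, a_tj, u_top, h_prev, dphi):
    """Top-layer gradients of the j-th DBN module (sketch of Eqs. (23)-(26)).

    xi_t   : prediction error xi(t) = y(t) - y_hat(t)
    a_tj   : regressor a(t-j) multiplying phi_j in the model output
    u_top  : pre-activation u_{1,j}^{(N_r)} of the single output neuron
    h_prev : outputs h^{(N_r - 1)} of the last hidden layer (1-D array)
    dphi   : derivative phi'(.) of the activation function
    """
    c_tj = -xi_t * a_tj                 # c(t-j) = -xi(t) a(t-j)
    delta_top = dphi(u_top) * c_tj      # Eq. (24): local gradient of output neuron
    grad_w = delta_top * h_prev         # Eq. (25): dE/dw, one entry per hidden unit
    grad_b = delta_top                  # Eq. (26): dE/db
    return grad_w, grad_b, delta_top

# Example with a tanh activation:
gw, gb, d = output_layer_grads(
    xi_t=0.1, a_tj=0.5, u_top=0.2,
    h_prev=np.array([0.3, -0.7]),
    dphi=lambda u: 1.0 - np.tanh(u) ** 2,
)
```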

With regard to neuron \( {n}_{N_r^{(j)}-1}^{(j)}\in \left\{1,2,\cdots, {Q}_{N_r^{(j)}-1}^{(j)}\right\} \) in the (\( {N}_r^{(j)}-1 \))-th hidden layer, Eq. (27) can be used to compute the gradient for parameter updating.

$$ \frac{\partial E(t)}{\partial {w}_{n_{N_r^{(j)}-1}^{(j)},{n}_{N_r^{(j)}-2}^{(j)},j}^{\left({N}_r^{(j)}-1\right)}}=\frac{\partial E(t)}{\partial \xi (t)}\frac{\partial \xi (t)}{\partial \hat{y}(t)}\frac{\partial \hat{y}(t)}{\partial {\hat{\phi}}_j(t)}\frac{\partial {\hat{\phi}}_j(t)}{\partial {u}_{1,j}^{\left({N}_r^{(j)}\right)}(t)}\frac{\partial {u}_{1,j}^{\left({N}_r^{(j)}\right)}(t)}{\partial {h}_{n_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}-1\right)}(t)}\frac{\partial {h}_{n_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}-1\right)}(t)}{\partial {u}_{n_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}-1\right)}(t)}\frac{\partial {u}_{n_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}-1\right)}(t)}{\partial {w}_{n_{N_r^{(j)}-1}^{(j)},{n}_{N_r^{(j)}-2}^{(j)},j}^{\left({N}_r^{(j)}-1\right)}}=-\xi (t)a\left(t-j\right){\varphi}^{\prime}\left({u}_{1,j}^{\left({N}_r^{(j)}\right)}(t)\right){w}_{1,{n}_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}\right)}{\varphi}^{\prime}\left({u}_{n_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}-1\right)}(t)\right){h}_{n_{N_r^{(j)}-2}^{(j)},j}^{\left({N}_r^{(j)}-2\right)}(t)={\varphi}^{\prime}\left({u}_{n_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}-1\right)}(t)\right){w}_{1,{n}_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}\right)}{\varphi}^{\prime}\left({u}_{1,j}^{\left({N}_r^{(j)}\right)}(t)\right)c\left(t-j\right){h}_{n_{N_r^{(j)}-2}^{(j)},j}^{\left({N}_r^{(j)}-2\right)}(t)={\varphi}^{\prime}\left({u}_{n_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}-1\right)}(t)\right){w}_{1,{n}_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}\right)}{\delta}_{1,j}^{\left({N}_r^{(j)}\right)}(t){h}_{n_{N_r^{(j)}-2}^{(j)},j}^{\left({N}_r^{(j)}-2\right)}(t) $$
(27)

Let

$$ {\delta}_{n_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}-1\right)}(t)={\varphi}^{\prime}\left({u}_{n_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}-1\right)}(t)\right){w}_{1,{n}_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}\right)}{\delta}_{1,j}^{\left({N}_r^{(j)}\right)}(t) $$
(28)

then Eq. (27) becomes

$$ \frac{\partial E(t)}{\partial {w}_{n_{N_r^{(j)}-1}^{(j)},{n}_{N_r^{(j)}-2}^{(j)},j}^{\left({N}_r^{(j)}-1\right)}}={\delta}_{n_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}-1\right)}(t){h}_{n_{N_r^{(j)}-2}^{(j)},j}^{\left({N}_r^{(j)}-2\right)}(t) $$
(29)

and

$$ {\displaystyle \begin{array}{c}\frac{\partial E(t)}{\partial {b}_{n_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}-1\right)}}=\frac{\partial E(t)}{\partial \xi (t)}\frac{\partial \xi (t)}{\partial \hat{y}(t)}\frac{\partial \hat{y}(t)}{\partial {\hat{\phi}}_j(t)}\frac{\partial {\hat{\phi}}_j(t)}{\partial {u}_{1,j}^{\left({N}_r^{(j)}\right)}(t)}\frac{\partial {u}_{1,j}^{\left({N}_r^{(j)}\right)}(t)}{\partial {h}_{n_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}-1\right)}(t)}\frac{\partial {h}_{n_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}-1\right)}(t)}{\partial {u}_{n_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}-1\right)}(t)}\frac{\partial {u}_{n_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}-1\right)}(t)}{\partial {b}_{n_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}-1\right)}}\\ {}=-\xi (t)a\left(t-j\right){\varphi}^{\prime}\left({u}_{1,j}^{\left({N}_r^{(j)}\right)}(t)\right){w}_{1,{n}_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}\right)}{\varphi}^{\prime}\left({u}_{n_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}-1\right)}(t)\right)\\ {}={\varphi}^{\prime}\left({u}_{n_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}-1\right)}(t)\right){w}_{1,{n}_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}\right)}{\varphi}^{\prime}\left({u}_{1,j}^{\left({N}_r^{(j)}\right)}(t)\right)c\left(t-j\right)\\ {}={\delta}_{n_{N_r^{(j)}-1}^{(j)},j}^{\left({N}_r^{(j)}-1\right)}(t)\end{array}} $$
(30)

With regard to neuron \( {n}_{N_r^{(j)}-2}^{(j)}\in \left\{1,2,\cdots, {Q}_{N_r^{(j)}-2}^{(j)}\right\} \) in the (\( {N}_r^{(j)}-2 \))-th hidden layer, the gradient for parameter updating is computed by Eq. (31).

$$ {\displaystyle \begin{array}{c}\frac{\partial E(t)}{\partial {w}_{n_{N_r^{(j)}-2}^{(j)},{n}_{N_r^{(j)}-3}^{(j)},j}^{\left({N}_r^{(j)}-2\right)}}=\frac{\partial E(t)}{\partial \xi (t)}\frac{\partial \xi (t)}{\partial \hat{y}(t)}\frac{\partial \hat{y}(t)}{\partial {\hat{\phi}}_j(t)}\sum \limits_{v=1}^{Q_{N_r^{(j)}-1}^{(j)}}\frac{\partial {\hat{\phi}}_j(t)}{\partial {h}_{v,j}^{\left({N}_r^{(j)}-1\right)}(t)}\frac{\partial {h}_{v,j}^{\left({N}_r^{(j)}-1\right)}(t)}{\partial {h}_{n_{N_r^{(j)}-2}^{(j)},j}^{\left({N}_r^{(j)}-2\right)}(t)}\frac{\partial {h}_{n_{N_r^{(j)}-2}^{(j)},j}^{\left({N}_r^{(j)}-2\right)}(t)}{\partial {u}_{n_{N_r^{(j)}-2}^{(j)},j}^{\left({N}_r^{(j)}-2\right)}(t)}\frac{\partial {u}_{n_{N_r^{(j)}-2}^{(j)},j}^{\left({N}_r^{(j)}-2\right)}(t)}{\partial {w}_{n_{N_r^{(j)}-2}^{(j)},{n}_{N_r^{(j)}-3}^{(j)},j}^{\left({N}_r^{(j)}-2\right)}}\\ {}=-\xi (t)a\left(t-j\right)\sum \limits_{v=1}^{Q_{N_r^{(j)}-1}^{(j)}}{\varphi}^{\prime}\left({u}_{1,j}^{\left({N}_r^{(j)}\right)}\right){w}_{1,v,j}^{\left({N}_r^{(j)}\right)}{\varphi}^{\prime}\left({u}_{v,j}^{\left({N}_r^{(j)}-1\right)}\right){w}_{v,{n}_{N_r^{(j)}-2}^{(j)},j}^{\left({N}_r^{(j)}-1\right)}{\varphi}^{\prime}\left({u}_{n_{N_r^{(j)}-2}^{(j)},j}^{\left({N}_r^{(j)}-2\right)}\right){h}_{n_{N_r^{(j)}-3}^{(j)},j}^{\left({N}_r^{(j)}-3\right)}(t)\\ {}=\sum \limits_{v=1}^{Q_{N_r^{(j)}-1}^{(j)}}{\varphi}^{\prime}\left({u}_{n_{N_r^{(j)}-2}^{(j)},j}^{\left({N}_r^{(j)}-2\right)}\right){w}_{v,{n}_{N_r^{(j)}-2}^{(j)},j}^{\left({N}_r^{(j)}-1\right)}{\delta}_{v,j}^{\left({N}_r^{(j)}-1\right)}(t){h}_{n_{N_r^{(j)}-3}^{(j)},j}^{\left({N}_r^{(j)}-3\right)}(t)\end{array}} $$
(31)

Let

$$ {\delta}_{n_{N_r^{(j)}-2}^{(j)},j}^{\left({N}_r^{(j)}-2\right)}(t)=\sum \limits_{v=1}^{Q_{N_r^{(j)}-1}^{(j)}}{\varphi}^{\prime}\left({u}_{n_{N_r^{(j)}-2}^{(j)},j}^{\left({N}_r^{(j)}-2\right)}\right){w}_{v,{n}_{N_r^{(j)}-2}^{(j)},j}^{\left({N}_r^{(j)}-1\right)}{\delta}_{v,j}^{\left({N}_r^{(j)}-1\right)}(t) $$
(32)

then Eq. (31) becomes

$$ \frac{\partial E(t)}{\partial {w}_{n_{N_r^{(j)}-2}^{(j)},{n}_{N_r^{(j)}-3}^{(j)},j}^{\left({N}_r^{(j)}-2\right)}}={\delta}_{n_{N_r^{(j)}-2}^{(j)},j}^{\left({N}_r^{(j)}-2\right)}(t){h}_{n_{N_r^{(j)}-3}^{(j)},j}^{\left({N}_r^{(j)}-3\right)}(t) $$
(33)

Similarly, we have

$$ {\displaystyle \begin{array}{c}\frac{\partial E(t)}{\partial {b}_{n_{N_r^{(j)}-2}^{(j)},j}^{\left({N}_r^{(j)}-2\right)}}=\frac{\partial E(t)}{\partial \xi (t)}\frac{\partial \xi (t)}{\partial \hat{y}(t)}\frac{\partial \hat{y}(t)}{\partial {\hat{\phi}}_j(t)}\sum \limits_{v=1}^{Q_{N_r^{(j)}-1}^{(j)}}\frac{\partial {\hat{\phi}}_j(t)}{\partial {h}_{v,j}^{\left({N}_r^{(j)}-1\right)}(t)}\frac{\partial {h}_{v,j}^{\left({N}_r^{(j)}-1\right)}(t)}{\partial {h}_{n_{N_r^{(j)}-2}^{(j)},j}^{\left({N}_r^{(j)}-2\right)}(t)}\frac{\partial {h}_{n_{N_r^{(j)}-2}^{(j)},j}^{\left({N}_r^{(j)}-2\right)}(t)}{\partial {u}_{n_{N_r^{(j)}-2}^{(j)},j}^{\left({N}_r^{(j)}-2\right)}(t)}\frac{\partial {u}_{n_{N_r^{(j)}-2}^{(j)},j}^{\left({N}_r^{(j)}-2\right)}(t)}{\partial {b}_{n_{N_r^{(j)}-2}^{(j)},j}^{\left({N}_r^{(j)}-2\right)}}\\ {}=-\xi (t)a\left(t-j\right)\sum \limits_{v=1}^{Q_{N_r^{(j)}-1}^{(j)}}{\varphi}^{\prime}\left({u}_{1,j}^{\left({N}_r^{(j)}\right)}\right){w}_{1,v,j}^{\left({N}_r^{(j)}\right)}{\varphi}^{\prime}\left({u}_{v,j}^{\left({N}_r^{(j)}-1\right)}\right){w}_{v,{n}_{N_r^{(j)}-2}^{(j)},j}^{\left({N}_r^{(j)}-1\right)}{\varphi}^{\prime}\left({u}_{n_{N_r^{(j)}-2}^{(j)},j}^{\left({N}_r^{(j)}-2\right)}\right)\\ {}=\sum \limits_{v=1}^{Q_{N_r^{(j)}-1}^{(j)}}{\varphi}^{\prime}\left({u}_{n_{N_r^{(j)}-2}^{(j)},j}^{\left({N}_r^{(j)}-2\right)}\right){w}_{v,{n}_{N_r^{(j)}-2}^{(j)},j}^{\left({N}_r^{(j)}-1\right)}{\delta}_{v,j}^{\left({N}_r^{(j)}-1\right)}(t)={\delta}_{n_{N_r^{(j)}-2}^{(j)},j}^{\left({N}_r^{(j)}-2\right)}(t)\end{array}} $$
(34)

Therefore, for the j-th DBN module, following the derivation above, Eq. (35) calculates the local gradient of each neuron in layer ℓj.

$$ {\delta}_{n_{\ell_j}^{(j)},j}^{\left({\ell}_j\right)}(t)=\sum \limits_{v=1}^{Q_{\ell_j+1}^{(j)}}{\varphi}^{\prime}\left({u}_{n_{\ell_j}^{(j)},j}^{\left({\ell}_j\right)}\right){w}_{v,{n}_{\ell_j}^{(j)},j}^{\left({\ell}_j+1\right)}{\delta}_{v,j}^{\left({\ell}_j+1\right)}(t),{\ell}_j\in \left\{1,2,\cdots, {N}_r^{(j)}-2\right\}, $$
(35)

The gradients with respect to the weights and the corresponding biases are then calculated by Eq. (36) and Eq. (37).

$$ \frac{\partial E(t)}{\partial {w}_{n_{\ell_j}^{(j)},{n}_{\ell_j-1}^{(j)},j}^{\left({\ell}_j\right)}}={\delta}_{n_{\ell_j}^{(j)},j}^{\left({\ell}_j\right)}(t){h}_{n_{\ell_j-1}^{(j)},j}^{\left({\ell}_j-1\right)}(t) $$
(36)
$$ \frac{\partial E(t)}{\partial {b}_{n_{\ell_j}^{(j)},j}^{\left({\ell}_j\right)}}={\delta}_{n_{\ell_j}^{(j)},j}^{\left({\ell}_j\right)}(t) $$
(37)
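The whole backward pass of Eqs. (35)–(37) through one DBN module can be sketched as follows. This is a sketch under assumed array layouts (the names `weights`, `us`, `hs` and their shapes are illustrative assumptions), not the authors' implementation.

```python
import numpy as np

def module_backward(delta_top, weights, us, hs, dphi):
    """Backward pass through one DBN module (sketch of Eqs. (35)-(37)).

    weights[l] : array of shape (Q_{l+1}, Q_l), the weights entering layer l+1
    us[l]      : pre-activations of layer l (us[0] is unused)
    hs[l]      : outputs of layer l (hs[0] is the module input)
    delta_top  : local gradient of the single output neuron, Eq. (24)
    dphi       : derivative phi'(.) of the activation function
    """
    n_layers = len(weights)                       # layers 1 .. N_r
    deltas = [None] * (n_layers + 1)
    deltas[n_layers] = np.atleast_1d(delta_top)
    # Eq. (35): propagate local gradients from layer l+1 down to layer l
    for l in range(n_layers - 1, 0, -1):
        deltas[l] = dphi(us[l]) * (weights[l].T @ deltas[l + 1])
    # Eqs. (36)-(37): gradients w.r.t. the weights and biases of each layer
    grads_w = [np.outer(deltas[l + 1], hs[l]) for l in range(n_layers)]
    grads_b = [deltas[l + 1] for l in range(n_layers)]
    return grads_w, grads_b
```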

The correction \( \varDelta {w}_{n_L^{(j)},{n}_{L-1}^{(j)},j}^{(L)} \) applied to \( {w}_{n_L^{(j)},{n}_{L-1}^{(j)},j}^{(L)} \) is defined by the delta rule. Accordingly, using Eqs. (24)–(26), (28)–(30) and (32)–(37) yields

$$ \varDelta {w}_{n_L^{(j)},{n}_{L-1}^{(j)},j}^{(L)}=-\eta \frac{\partial E(t)}{\partial {w}_{n_L^{(j)},{n}_{L-1}^{(j)},j}^{(L)}}=-\eta {\delta}_{n_L^{(j)},j}^{(L)}{h}_{n_{L-1}^{(j)},j}^{\left(L-1\right)}(t) $$
(38)
$$ \varDelta {b}_{n_L^{(j)},j}^{(L)}=-\eta \frac{\partial E(t)}{\partial {b}_{n_L^{(j)},j}^{(L)}}=-\eta {\delta}_{n_L^{(j)},j}^{(L)} $$
(39)

where \( L\in \left\{1,2,\cdots, {N}_r^{(j)}-1,{N}_r^{(j)}\right\} \) and η > 0 is a pre-determined learning rate. The weight and bias parameters are then updated according to the following rule

$$ \left\{\begin{array}{c}{w}_{n_L^{(j)},{n}_{L-1}^{(j)},j}^{(L)}\Leftarrow {w}_{n_L^{(j)},{n}_{L-1}^{(j)},j}^{(L)}+\varDelta {w}_{n_L^{(j)},{n}_{L-1}^{(j)},j}^{(L)}\\ {}{b}_{n_L^{(j)},j}^{(L)}\Leftarrow {b}_{n_L^{(j)},j}^{(L)}+\varDelta {b}_{n_L^{(j)},j}^{(L)}\end{array}\right. $$
(40)

where the initial values of the weight \( {w}_{n_L^{(j)},{n}_{L-1}^{(j)},j}^{(L)} \) and bias \( {b}_{n_L^{(j)},j}^{(L)} \) are obtained in the pre-training stage. In addition, to smooth the parameter updates and suppress oscillation, a momentum term is added to Eq. (40), yielding the final parameter-updating strategy

$$ \left\{\begin{array}{c}{w}_{n_L^{(j)},{n}_{L-1}^{(j)},j}^{(L)}(k)\Leftarrow {w}_{n_L^{(j)},{n}_{L-1}^{(j)},j}^{(L)}\left(k-1\right)+\varDelta {w}_{n_L^{(j)},{n}_{L-1}^{(j)},j}^{(L)}(k)+\alpha \left({w}_{n_L^{(j)},{n}_{L-1}^{(j)},j}^{(L)}\left(k-1\right)-{w}_{n_L^{(j)},{n}_{L-1}^{(j)},j}^{(L)}\left(k-2\right)\right)\\ {}{b}_{n_L^{(j)},j}^{(L)}(k)\Leftarrow {b}_{n_L^{(j)},j}^{(L)}\left(k-1\right)+\varDelta {b}_{n_L^{(j)},j}^{(L)}(k)+\alpha \left({b}_{n_L^{(j)},j}^{(L)}\left(k-1\right)-{b}_{n_L^{(j)},j}^{(L)}\left(k-2\right)\right)\end{array}\right. $$
(41)

where k is the number of updating iterations and α ∈ [0, 1) is a pre-determined momentum constant.
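A minimal sketch of the update rule of Eqs. (38)–(41), with illustrative names; it applies identically to the weights and the biases:

```python
def momentum_update(param, param_prev, grad, eta, alpha):
    """One gradient-descent step with momentum, following Eqs. (38)-(41).

    param, param_prev : parameter values at steps k-1 and k-2
    grad              : dE/dparam from the backward pass
    eta, alpha        : learning rate and momentum constant, alpha in [0, 1)
    """
    delta = -eta * grad                                        # Eqs. (38)-(39)
    param_new = param + delta + alpha * (param - param_prev)   # Eq. (41)
    return param_new, param     # (value at step k, value at step k-1)
```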

About this article

Cite this article

Xu, W., Peng, H., Tian, X. et al. DBN based SD-ARX model for nonlinear time series prediction and analysis. Appl Intell 50, 4586–4601 (2020). https://doi.org/10.1007/s10489-020-01804-2
