What is the best RNN-cell structure to forecast each time series behavior?
Introduction
Many real-world prediction problems involve a temporal dimension and typically require the estimation of numerical sequential data, a task referred to as time series forecasting. Time series forecasting is one of the cornerstones of data science, playing a pivotal role in almost all domains, including meteorology (Murat, Malinowska, Gos, & Krzyszczak, 2018), natural disaster control (Erdelj, Król, & Natalizio, 2017), energy (Bourdeau, Zhai, Nefzaoui, Guo, & Chatellier, 2019), manufacturing (Wang & Chen, 2018), finance (Liu, 2019), econometrics (Siami-Namini & Namin, 2018), telecommunication (Maeng, Kim, & Shin, 2020), and healthcare (Khaldi, El Afia, & Chiheb, 2019b), to name a few. Accurate time series forecasting requires robust forecasting models.
Currently, Recurrent Neural Network (RNN) models are among the most popular machine learning models for sequential data modeling, including natural language, image/video captioning, and forecasting (Chimmula and Zhang, 2020, Sutskever et al., 2014, Vinyals et al., 2015). Such RNN models are built as a sequence of the same cell structure, for example, the ELMAN cell, the Long Short-Term Memory (LSTM) cell, or the Gated Recurrent Unit (GRU) cell. The simplest RNN cell is ELMAN, which includes a single layer of hidden neurons. In contrast, LSTM and GRU cells incorporate a gating mechanism (three gates in LSTM and two gates in GRU), where each gate is a layer of hidden neurons. Many other cell structures have been introduced in the literature (Lu and Salem, 2017, Mikolov et al., 2014, Pulver and Lyu, 2017, Zhou et al., 2016). However, to solve time series forecasting tasks, the building of RNN models is typically limited to the three aforementioned cell structures (Alkhayat and Mehmood, 2021, Liu et al., 2021, Rajagukguk et al., 2020, Runge and Zmeureanu, 2021, Sezer et al., 2020), as they provide very good accuracy (Runge and Zmeureanu, 2021, Sezer et al., 2020).
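To make the contrast concrete, the sketch below (our illustration, not code from the paper) builds the same one-step-ahead forecaster on each of the three standard cells in PyTorch; the layer sizes are illustrative assumptions, and the gating mechanism shows up directly in the parameter counts.

```python
# Minimal sketch (not from the paper): one forecaster, three standard cells.
# nn.RNN is the ELMAN-style cell (one hidden layer, no gates); nn.LSTM adds
# three gates and a cell state; nn.GRU uses two gates.
import torch
import torch.nn as nn

class RNNForecaster(nn.Module):
    """One-step-ahead forecaster built on a chosen recurrent cell."""
    def __init__(self, cell="lstm", input_size=1, hidden_size=32):
        super().__init__()
        cells = {"elman": nn.RNN, "lstm": nn.LSTM, "gru": nn.GRU}
        self.rnn = cells[cell](input_size, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):              # x: (batch, time, features)
        out, _ = self.rnn(x)           # hidden state at every time step
        return self.head(out[:, -1])   # forecast from the last hidden state

# The gate count is visible in the recurrent parameter counts (roughly 1x/3x/4x):
for name in ("elman", "gru", "lstm"):
    n_params = sum(p.numel() for p in RNNForecaster(name).parameters())
    print(f"{name:>5}: {n_params} parameters")
```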
Nevertheless, building robust RNN models for time series forecasting is still a challenging task, as there is not yet a clear understanding of time series data itself, and hence very little knowledge exists about which cell structure is the most appropriate for each data type. In general, when facing a new problem, practitioners select one of the most popular cells, usually LSTM, and use it as the building block of the RNN model without any guarantee of the appropriateness of this cell for the data at hand. The objective of this work is two-fold: it presents a comprehensive characterization of time series behaviors and provides guidelines on the best RNN cell structure for each behavior. As far as we know, this is the first work providing such insights. The main contributions of this study can be summarized as follows:
- To provide a better understanding of time series data by presenting a comprehensive characterization of their behaviors.
- To determine the most appropriate cell structure for each time series behavior (i.e., whether a specific cell structure should be avoided for certain behaviors).
- To identify differences in predictability between behaviors (i.e., whether certain behaviors are easier or harder to predict across all cell models).
- To provide useful guidelines that can assist decision-makers and scholars in selecting the most suitable RNN-cell structure from both a computational and a performance point of view.
The remainder of this study is organized as follows: Section 2 reviews related work. Section 3 presents a taxonomy of time series behaviors. Section 4 presents a taxonomy of RNN cells. Section 5 describes the experiments. Section 6 presents and discusses the obtained results. Finally, the last section concludes the findings and sheds light on future research directions.
Related works
The last decades have witnessed an explosion of time series data acquired by automated data collection devices such as monitors, IoT devices, and sensors (Bourdeau et al., 2019, Erdelj et al., 2017, Murat et al., 2018). The collected time series describe different quantitative values: stock prices, sales volumes, electricity load demand, weather temperature, etc. In parallel, a large number of comparative studies have been carried out in the forecasting area (Athiyarath et al., 2020, Bianchi et al., 2017).
Taxonomy of time series behaviors
As far as we know, this is the first work introducing a complete formal characterization of real-world time series. Time series emerging from real-world applications can either follow a stochastic mechanism or a chaotic mechanism and are usually contaminated by white noise (Boaretto et al., 2021, Box et al., 2015, Cencini et al., 2000, Wales, 1991, Zunino et al., 2012).
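As a hedged illustration (our own construction, not the paper's data-generation procedure), the sketch below produces a toy series for each of the five behaviors examined later (deterministic, random-walk, nonlinear, long-memory, and chaotic), each contaminated with white noise; all coefficients are illustrative assumptions.

```python
# Toy generators (illustrative assumptions, not the paper's procedure) for the
# five behaviors, each contaminated with additive white noise.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
t = np.arange(n)

def logistic_map(n, r=3.9, x0=0.5):
    """Chaotic: deterministic recurrence, sensitive to initial conditions."""
    x = np.empty(n)
    x[0] = x0
    for i in range(1, n):
        x[i] = r * x[i - 1] * (1.0 - x[i - 1])
    return x

def fractional_noise(n, d=0.4):
    """Long-memory: ARFIMA(0, d, 0) via its truncated MA(inf) representation,
    with slowly decaying weights psi_0 = 1, psi_k = psi_{k-1} * (k - 1 + d) / k."""
    psi = np.empty(n)
    psi[0] = 1.0
    for k in range(1, n):
        psi[k] = psi[k - 1] * (k - 1 + d) / k
    return np.convolve(psi, rng.normal(size=n))[:n]

series = {
    "deterministic": 0.01 * t + np.sin(2 * np.pi * t / 50),  # trend + seasonality
    "random-walk": np.cumsum(rng.normal(size=n)),            # integrated shocks
    "long-memory": fractional_noise(n),
    "chaotic": logistic_map(n),
}

# Nonlinear: a simple, stable nonlinear autoregression (illustrative choice).
x = np.zeros(n)
for i in range(1, n):
    x[i] = 0.8 * x[i - 1] - 0.5 * np.tanh(x[i - 1] ** 2) + 0.1 * rng.normal()
series["nonlinear"] = x

# Contaminate every behavior with white noise, as in real-world measurements.
noisy = {name: s + 0.1 * rng.normal(size=n) for name, s in series.items()}
```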
Taxonomy of RNN cells
Humans do not start their thinking from scratch every second; our thoughts persist in the memory of our brains. For example, as readers read this paper, they understand each word based on their understanding of the words before it. The absence of memory is the major shortcoming of traditional machine learning models, particularly feed-forward neural networks (FNNs). To overcome this limitation, RNNs integrate the concept of feedback connections in their structure (Fig. 6).
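For concreteness, the standard update equations of the three cells (as commonly written in the literature; the notation and sign conventions here are ours and may differ from the paper's) are:

```latex
% ELMAN cell: a single hidden layer with a feedback connection
h_t = \tanh(W_x x_t + W_h h_{t-1} + b)

% LSTM cell: input (i), forget (f), and output (o) gates plus a cell state c_t
i_t = \sigma(W_i [h_{t-1}, x_t] + b_i), \quad
f_t = \sigma(W_f [h_{t-1}, x_t] + b_f), \quad
o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)
c_t = f_t \odot c_{t-1} + i_t \odot \tanh(W_c [h_{t-1}, x_t] + b_c), \quad
h_t = o_t \odot \tanh(c_t)

% GRU cell: update (z) and reset (r) gates, no separate cell state
z_t = \sigma(W_z [h_{t-1}, x_t] + b_z), \quad
r_t = \sigma(W_r [h_{t-1}, x_t] + b_r)
h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tanh(W_g [r_t \odot h_{t-1}, x_t] + b_g)
```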
Experimental structure
Two experiments have been carried out in this study. The first experiment analyzes the utility of each LSTM-Vanilla cell component in forecasting the five time series behaviors, while the second experiment evaluates different variants of RNN cell structures in forecasting these behaviors. In this section, we first describe the process we followed to generate the dataset for each time series behavior (Section 5.1). Then, we present the selected models for the first and the second experiments.
Results and discussion
In this section, we present the results of the two conducted experiments: (1) The first experiment evaluates and analyzes the role of each component of the LSTM-Vanilla cell with respect to the five time series behaviors. The evaluated architectures were generated by removing (NIG, NFG, NOG, NIAF, NFAF, NOAF, and NCAF), adding (PC and FGR), or substituting (FB1 and CIFG) one cell component. (2) The second experiment evaluates and analyzes the performance of a multitude of RNN cell structures across the five behaviors.
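As a sketch of what such a component ablation looks like in practice (our reconstruction under assumptions, not the authors' code), the cell below lets individual gates be disabled, mirroring variants such as NIG (no input gate), NFG (no forget gate), and NOG (no output gate); a removed gate is held open at 1.

```python
# Hedged sketch of a gate-ablatable LSTM cell (our reconstruction, not the
# authors' implementation).
import torch
import torch.nn as nn

class AblatableLSTMCell(nn.Module):
    def __init__(self, input_size, hidden_size,
                 input_gate=True, forget_gate=True, output_gate=True):
        super().__init__()
        self.flags = (input_gate, forget_gate, output_gate)
        # One linear map producing all four pre-activations at once.
        self.lin = nn.Linear(input_size + hidden_size, 4 * hidden_size)

    def forward(self, x, state):
        h, c = state
        z = self.lin(torch.cat([x, h], dim=-1))
        i, f, o, g = z.chunk(4, dim=-1)
        ig, fg, og = self.flags
        # A removed gate is fixed to 1 (always open), as in gate-ablation studies.
        i = torch.sigmoid(i) if ig else torch.ones_like(i)
        f = torch.sigmoid(f) if fg else torch.ones_like(f)
        o = torch.sigmoid(o) if og else torch.ones_like(o)
        c = f * c + i * torch.tanh(g)   # cell-state update
        h = o * torch.tanh(c)           # hidden-state output
        return h, (h, c)
```

Unrolling this cell over an input window and training one variant per behavior reproduces the spirit of the first experiment; activation-function ablations (NIAF, NFAF, NOAF, NCAF) would replace the corresponding sigmoid/tanh with the identity in the same way.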
Conclusions
In this paper, we proposed a comprehensive taxonomy of the main time series behaviors, namely: deterministic, random-walk, nonlinear, long-memory, and chaotic. Then, we conducted two experiments to identify the best RNN cell structure for each behavior. In the first experiment, we evaluated the LSTM-Vanilla model and 11 of its variants, each created by one alteration of its basic architecture, consisting in (1) removing (NIG, NFG, NOG, NIAF, NFAF, NOAF, and NCAF), (2) adding (PC and FGR), or (3) substituting (FB1 and CIFG) one cell component.
CRediT authorship contribution statement
Rohaifa Khaldi: Conceptualization, Methodology, Software, Formal analysis, Investigation, Writing – original draft, Writing – review & editing, Visualization. Abdellatif El Afia: Validation, Supervision. Raddouane Chiheb: Supervision. Siham Tabik: Methodology, Validation, Resources, Writing – original draft, Writing – review & editing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work was partially supported by DETECTOR (A-RNM-256-UGR18 Universidad de Granada/FEDER), LifeWatch SmartEcomountains (LifeWatch-2019-10-UGR-01 Ministerio de Ciencia e Innovación/Universidad de Granada/FEDER), DeepL-ISCO (A-TIC-458-UGR18 Ministerio de Ciencia e Innovación/FEDER), and BigDDL-CET (P18-FR-4961 Ministerio de Ciencia e Innovación/Universidad de Granada/FEDER).
References (124)
- A review and taxonomy of wind and solar energy forecasting methods based on deep learning. Energy and AI (2021).
- High level chaos in the exchange and index markets. Chaos, Solitons & Fractals (2013).
- Modeling and forecasting building energy consumption: A review of data-driven techniques. Sustainable Cities and Society (2019).
- Cooperative coevolution of ELMAN recurrent neural networks for chaotic time series prediction. Neurocomputing (2012).
- Time series forecasting of COVID-19 transmission in Canada using LSTM networks. Chaos, Solitons & Fractals (2020).
- Neural networks for pattern-based short-term load forecasting: A comparative study. Neurocomputing (2016).
- Finding structure in time. Cognitive Science (1990).
- Wireless sensor networks and multi-UAV systems for natural disaster management. Computer Networks (2017).
- Evapotranspiration evaluation models based on machine learning algorithms: A comparative study. Agricultural Water Management (2019).
- Dimensions and entropies of strange attractors from a fluctuating dynamics approach. Physica D: Nonlinear Phenomena (1984).
- Exceptional events as evidence for determinism. Physica D: Nonlinear Phenomena.
- Forecasting of weekly patient visits to emergency department: Real case study. Procedia Computer Science.
- Testing the null hypothesis of stationarity against the alternative of a unit root: How sure are we that economic time series have a unit root? Journal of Econometrics.
- Chaotic time series prediction and additive white Gaussian noise. Physics Letters A.
- Novel volatility forecasting using deep learning–long short term memory recurrent neural networks. Expert Systems with Applications.
- Intelligent modeling strategies for forecasting air quality time series: A review. Applied Soft Computing.
- Demand forecasting for the 5G service market considering consumer preference and purchase delay behavior. Telematics and Informatics.
- A new test for chaos and determinism based on symbolic dynamics. Journal of Economic Behavior & Organization.
- Hydrological time series forecasting using simple combinations: Big data testing and investigations on one-year ahead river flow predictability. Journal of Hydrology.
- Evaluation of statistical and machine learning models for time series prediction: Identifying the state-of-the-art and the best conditions for the use of each model. Information Sciences.
- Time series forecasting of petroleum production using deep LSTM recurrent networks. Neurocomputing.
- Nonstationary time series transformation methods: An experimental review. Knowledge-Based Systems.
- Forecasting of noisy chaotic systems with deep neural networks. Chaos, Solitons & Fractals.
- Time series prediction with simple recurrent neural networks. Bayero Journal of Pure and Applied Sciences.
- Fitting autoregressive models for prediction. Annals of the Institute of Statistical Mathematics.
- Selection of regressors. International Economic Review.
- A comparative study and analysis of time series forecasting techniques. SN Computer Science.
- Should we really use post-hoc tests based on mean-ranks? The Journal of Machine Learning Research.
- An overview and comparative analysis of recurrent neural networks for short term load forecasting.
- Discriminating chaotic and stochastic time series using permutation entropy and artificial neural networks. Scientific Reports.
- Fractional neuro-sequential ARFIMA-LSTM for financial market forecasting. IEEE Access.
- Chaos or noise: Difficulties of a distinction. Physical Review E.
- Learning phrase representations using RNN encoder-decoder for statistical machine translation.
- Precipitation forecasting using classification and regression trees (CART) model: A comparative study of different approaches. Environmental Earth Sciences.
- NN5 forecasting competition for artificial neural networks & computational intelligence.
- The UCR time series archive. IEEE/CAA Journal of Automatica Sinica.
- Statistical comparisons of classifiers over multiple data sets. The Journal of Machine Learning Research.
- Gate-variants of gated recurrent unit (GRU) neural networks.
- Distribution of the estimators for autoregressive time series with a unit root. Journal of the American Statistical Association.
- A comparative study of time series forecasting methods for short term electric energy consumption prediction in smart buildings. Energies.
- Ergodic theory of chaos and strange attractors. The Theory of Chaotic Attractors.
- Counterexamples to parsimony and BIC. Annals of the Institute of Statistical Mathematics.
- A comparison of alternative tests of significance for the problem of M rankings. The Annals of Mathematical Statistics.
- An extension on "Statistical comparisons of classifiers over multiple data sets" for all pairwise comparisons. Journal of Machine Learning Research.
- Recurrent nets that time and count.
- LSTM recurrent networks learn simple context-free and context-sensitive languages. IEEE Transactions on Neural Networks.
- Learning to forget: Continual prediction with LSTM. Neural Computation.