Abstract
Probabilistic forecasting offers insights beyond point estimates, supporting more informed decision-making. This paper introduces the Neural Quantile Function with Recurrent Neural Networks (NQF-RNN), a model for multistep-ahead probabilistic time series forecasting. NQF-RNN combines neural quantile functions with recurrent neural networks, making it applicable across diverse time series datasets. The model uses a monotonically increasing neural quantile function and is trained with a continuous ranked probability score (CRPS)-based loss function. NQF-RNN’s performance is evaluated on synthetic datasets generated from multiple distributions and on six real-world time series datasets with both periodicity and irregularities. NQF-RNN demonstrates competitive performance on synthetic data and outperforms benchmarks on real-world data, achieving lower average forecast errors across most metrics. Notably, NQF-RNN surpasses benchmarks in CRPS, a key probabilistic metric, and in tail-weighted CRPS, which assesses tail-event forecasting with a narrow prediction interval. The model outperforms other deep learning models by 5% to 41% in CRPS, with improvements of 5% to 53% in left tail-weighted CRPS and 6% to 34% in right tail-weighted CRPS. Against its baseline model, DeepAR, NQF-RNN achieves a 41% improvement in CRPS, indicating its effectiveness in generating reliable prediction intervals. These results highlight NQF-RNN’s robustness in managing complex and irregular patterns in real-world forecasting scenarios.
Data Availability
All datasets can be downloaded from the following URLs:
- synthetic data: https://doi.org/10.7910/DVN/W04WWC
- electricity [1, 44]: https://archive.ics.uci.edu/dataset/321/electricityloaddiagrams20112014
- traffic [1, 44]: https://archive.ics.uci.edu/dataset/204/pems+sf
- solar [45]: https://www.nrel.gov/grid/solar-power-data.html
- M4-hourly [46]: https://github.com/Mcompetitions/M4-methods/tree/master
- tourism-monthly, tourism-quarterly [47]: https://robjhyndman.com/publications/the-tourism-forecasting-competition
References
Salinas D, Flunkert V, Gasthaus J, Januschowski T (2020) Deepar: Probabilistic forecasting with autoregressive recurrent networks. Int J Forecast 36(3):1181–1191
Rasul K, Sheikh A-S, Schuster I, Bergmann UM, Vollgraf R (2021) Multivariate probabilistic time series forecasting via conditioned normalizing flows. In: International conference on learning representations. https://openreview.net/forum?id=WiGQBFuVRv
Yang L, Zhang Z, Song Y, Hong S, Xu R, Zhao Y, Zhang W, Cui B, Yang M-H (2023) Diffusion models: A comprehensive survey of methods and applications. ACM Comput Surv 56(4):1–39
Cannon AJ (2011) Quantile regression neural networks: Implementation in r and application to precipitation downscaling. Comput Geosci 37(9):1277–1284
Gasthaus J, Benidis K, Wang Y, Rangapuram SS, Salinas D, Flunkert V, Januschowski T (2019) Probabilistic forecasting with spline quantile function rnns. In: The 22nd international conference on artificial intelligence and statistics, pp 1901–1910. PMLR
Chilinski P, Silva R (2020) Neural likelihoods via cumulative distribution functions. In: Conference on uncertainty in artificial intelligence, pp 420–429. PMLR
Alcántara A, Galván IM, Aler R (2022) Direct estimation of prediction intervals for solar and wind regional energy forecasting with deep neural networks. Eng Appl Artif Intell 114:105128
Sun R, Li C-L, Arik SÖ, Dusenberry MW, Lee C-Y, Pfister T (2023) Neural spline search for quantile probabilistic modeling. In: Proceedings of the AAAI conference on artificial intelligence, vol 37, pp 9927–9934
Wen R, Torkkola K, Narayanaswamy B, Madeka D (2017) A multi-horizon quantile recurrent forecaster. In: NIPS 2017 time series workshop
Zhang X-Y, Watkins C, Kuenzel S (2022) Multi-quantile recurrent neural network for feeder-level probabilistic energy disaggregation considering roof-top solar energy. Eng Appl Artif Intell 110:104707
Gouttes A, Rasul K, Koren M, Stephan J, Naghibi T (2021) Probabilistic time series forecasting with implicit quantile networks. In: ICML 2021 Time Series Workshop
Park Y, Maddix D, Aubet F-X, Kan K, Gasthaus J, Wang Y (2022) Learning quantile functions without quantile crossing for distribution-free time series forecasting. In: International conference on artificial intelligence and statistics, pp 8127–8150. PMLR
Hu J, Tang J, Liu Z (2024) A novel time series probabilistic prediction approach based on the monotone quantile regression neural network. Inf Sci 654:119844
Cannon AJ (2018) Non-crossing nonlinear regression quantiles by monotone composite quantile regression neural network, with application to rainfall extremes. Stoch Env Res Risk A 32:3207–3225
Koenker R, Hallock KF (2001) Quantile regression. J Econ Perspect 15(4):143–156
Wang Q, Ma Y, Zhao K, Tian Y (2020) A comprehensive survey of loss functions in machine learning. Ann Data Sci, pp 1–26
Zou H, Yuan M (2008) Composite quantile regression and the oracle model selection theory. Ann Stat 36(3):1108–1126
Moon SJ, Jeon J-J, Lee JSH, Kim Y (2021) Learning multiple quantiles with neural networks. J Comput Graph Stat 30(4):1238–1248
Gneiting T, Raftery AE (2007) Strictly proper scoring rules, prediction, and estimation. J Am Stat Assoc 102(477):359–378
Rasul K, Seward C, Schuster I, Vollgraf R (2021) Autoregressive denoising diffusion models for multivariate probabilistic time series forecasting. In: International conference on machine learning, pp 8857–8868. PMLR
Zhang H, Zhang Z (1999) Feedforward networks with monotone constraints. In: IJCNN’99. International joint conference on neural networks. Proceedings (Cat. No. 99CH36339), vol 3, pp 1820–1823. IEEE
Friederichs P, Thorarinsdottir TL (2012) Forecast verification for extreme value distributions with an application to probabilistic peak wind prediction. Environmetrics 23(7):579–594
Lin F, Zhang Y, Wang K, Wang J, Zhu M (2022) Parametric probabilistic forecasting of solar power with fat-tailed distributions and deep neural networks. IEEE Trans Sustain Energy 13(4):2133–2147
Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780
Jozefowicz R, Zaremba W, Sutskever I (2015) An empirical exploration of recurrent network architectures. In: International conference on machine learning, pp 2342–2350. PMLR
Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. Advances in neural information processing systems 27
Goodfellow I, Bengio Y, Courville A (2016) Deep Learning. MIT Press. http://www.deeplearningbook.org
Williams RJ, Zipser D (1989) A learning algorithm for continually running fully recurrent neural networks. Neural Comput 1(2):270–280
Rudin W (1976) Principles of Mathematical Analysis, vol 3. McGraw-Hill
Koenker R, Machado JA (1999) Goodness of fit and related inference processes for quantile regression. J Am Stat Assoc 94(448):1296–1310
Taylor JW (1999) Evaluating volatility and interval forecasts. J Forecast 18(2):111–128
Davis PJ, Rabinowitz P (2007) Methods of numerical integration. Courier Corporation
Van Der Walt S, Colbert SC, Varoquaux G (2011) The numpy array: a structure for efficient numerical computation. Comput Sci Eng 13(2):22–30
Gneiting T, Ranjan R (2011) Comparing density forecasts using threshold- and quantile-weighted scoring rules. J Bus Econ Stat 29(3):411–422
Zamo M, Naveau P (2018) Estimation of the continuous ranked probability score with limited information and applications to ensemble weather forecasts. Math Geosci 50(2):209–234
Bollerslev T (1986) Generalized autoregressive conditional heteroskedasticity. J Econom 31(3):307–327
Meinshausen N, Ridgeway G (2006) Quantile regression forests. J Mach Learn Res 7(6)
Petropoulos F, Apiletti D, Assimakopoulos V, Babai MZ, Barrow DK, Taieb SB, Bergmeir C, Bessa RJ, Bijak J, Boylan JE et al (2022) Forecasting: theory and practice. Int J Forecast 38(3):705–871
Hewamalage H, Bergmeir C, Bandara K (2021) Recurrent neural networks for time series forecasting: Current status and future directions. Int J Forecast 37(1):388–427
Kunz M, Birr S, Raslan M, Ma L, Januschowski T (2023) Deep learning based forecasting: a case study from the online fashion industry. In: Forecasting with artificial intelligence: theory and applications, pp 279–311. Springer
Chung J, Gulcehre C, Cho K, Bengio Y (2014) Empirical evaluation of gated recurrent neural networks on sequence modeling. In: NIPS 2014 workshop on deep learning, December 2014
Challu C, Olivares KG, Oreshkin BN, Ramirez FG, Canseco MM, Dubrawski A (2023) Nhits: Neural hierarchical interpolation for time series forecasting. Proceedings of the AAAI conference on artificial intelligence 37:6989–6997
Chen Z, Ma M, Li T, Wang H, Li C (2023) Long sequence time-series forecasting with deep learning: A survey. Inf Fusion 97:101819
Yu H-F, Rao N, Dhillon IS (2016) Temporal regularized matrix factorization for high-dimensional time series prediction. Advances in neural information processing systems 29
Lai G, Chang W-C, Yang Y, Liu H (2018) Modeling long- and short-term temporal patterns with deep neural networks. In: The 41st international ACM SIGIR conference on research & development in information retrieval, pp 95–104
Makridakis S, Spiliotis E, Assimakopoulos V (2018) The m4 competition: Results, findings, conclusion and way forward. Int J Forecast 34(4):802–808
Athanasopoulos G, Hyndman RJ, Song H, Wu DC (2011) The tourism forecasting competition. Int J Forecast 27(3):822–844
Funding
This work was supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea (NRF-2023S1A5A2A03086550).
Author information
Contributions
Jungyoon Song: Conceptualization, Methodology, Visualization, Writing – original draft; Woojin Chang: Writing – review & editing, Supervision; Jae Wook Song: Conceptualization, Writing – review & editing, Validation, Supervision.
Ethics declarations
Ethical Approval
All datasets used in this article are publicly available, and no consent was required for their use.
Conflicts of Interest
The authors have no competing interests to declare that are relevant to the content of this article.
Competing interests
The authors have no financial interests that could have appeared to influence the work reported in this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix A Abbreviations & Symbols
The abbreviations and symbols used in this paper are summarized in Table 9.
Appendix B Proper scoring rules
Based on the work of [19], let \(\mathcal{P}\) be a convex class of probability distributions on the sample space \(\Omega\), and let \(S(P, y)\) denote the score assigned when the forecast \(P \in \mathcal{P}\) is issued and \(y \in \Omega\) materializes. Write

$$S(P, Q) = \int_{\Omega} S(P, y)\,\mathrm{d}Q(y) \tag{B1}$$

for the expected score under \(Q\) when the probabilistic forecast is \(P\). The scoring rule \(S\) is proper relative to \(\mathcal{P}\) if

$$S(Q, Q) \ge S(P, Q) \quad \text{for all } P, Q \in \mathcal{P}. \tag{B2}$$

It is strictly proper relative to \(\mathcal{P}\) if (B2) holds with equality if and only if \(P = Q\).
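As a concrete link to the loss used in this paper: the negative CRPS is proper in the sense of (B2), and the CRPS admits a well-known quantile-function representation as an integral of the pinball (quantile) loss over all quantile levels, which is what makes it a natural training criterion for quantile-function models such as NQF-RNN:

$$\operatorname{CRPS}(F, y) = \int_{-\infty}^{\infty} \bigl(F(x) - \mathbb{1}\{y \le x\}\bigr)^{2}\,\mathrm{d}x = \int_{0}^{1} 2\,\rho_{\alpha}\bigl(y - F^{-1}(\alpha)\bigr)\,\mathrm{d}\alpha, \qquad \rho_{\alpha}(u) = u\bigl(\alpha - \mathbb{1}\{u < 0\}\bigr).$$

Here \(F\) is the forecast CDF, \(F^{-1}\) its quantile function, and \(\rho_\alpha\) the pinball loss at level \(\alpha\); note that the CRPS is negatively oriented (lower is better), whereas (B1)–(B2) are stated for positively oriented scores as in [19].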
Appendix C Detailed preprocessing of datasets
C.1 Synthetic datasets
The synthetic dataset used in this study comprises four distinct time series input variables. The first input variable consists of scaled observations from a one-lagged time series, denoted by \(z_{i,t-1}\). The second input variable serves as a time index, represented by \(t-1\) within the range \([0, \dots , 41]\). The third input variable represents weekly seasonality, calculated as \(t - 1 \;\text {mod}\; 7\), within the range \([0, \dots , 6]\). The fourth input variable is an item-related covariate, represented by an integer identifier that specifies the group to which a particular observation belongs. To standardize these input features, both the second and third variables are normalized using z-score normalization.
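For concreteness, the sketch below shows one way to assemble these four input variables for a single synthetic series of length 42; it is a minimal illustration with hypothetical names (T, zscore, the group identifier), not the paper's exact pipeline.

```python
import numpy as np

T = 42                      # series length, matching the time index range [0, ..., 41]
rng = np.random.default_rng(0)
z = rng.normal(size=T)      # placeholder for a scaled synthetic series z_{i,t}

# Variable 1: scaled one-lagged observation z_{i,t-1} (first step has no lag; pad with 0)
lag1 = np.concatenate(([0.0], z[:-1]))

# Variable 2: time index t-1 in [0, ..., 41]
time_idx = np.arange(T, dtype=float)

# Variable 3: weekly seasonality, (t-1) mod 7 in [0, ..., 6]
weekday = time_idx % 7

# Variable 4: item-related covariate, an integer group identifier (assumed value)
item_id = np.full(T, 3.0)

def zscore(x):
    return (x - x.mean()) / x.std()

# z-score normalization of the time-index and seasonality covariates only
features = np.stack([lag1, zscore(time_idx), zscore(weekday), item_id], axis=1)
print(features.shape)       # (42, 4): T time steps, D = 4 input variables
```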
C.2 Real-world datasets
For the real-world datasets, the input for each series is denoted as \(x_{i} \in \mathbb{R}^{T \times D}\), where each time step \(x_{i,t} \in \mathbb{R}^{D}\) stacks the supplementary time-dependent covariates. The first input variable is the scale-adjusted observation of the one-lagged time series, \(z_{i,t-1}\). Additional input variables include time-related covariates derived from the date information in each dataset, such as hours (ranging from 0 to 23), weekdays (ranging from 0 to 6), months (ranging from 1 to 12), and quarters (ranging from 1 to 4). These covariates are represented by integer values and used as input features.
Another time-related covariate, age, represents the temporal distance from the initial observation within each time series. This variable is adjusted according to the time granularity specific to each dataset.
Additionally, an item-related covariate is included to identify specific series within each dataset. This covariate is also represented by an integer value; for example, in the electricity dataset, it corresponds to the customer index (ranging from 0 to 370) and is linked to the input embedding dimension. The time-related covariates undergo z-score normalization to ensure consistency across varied scales and to enhance model performance. The dataset is generated using a stride equal to the decoder length to avoid overlapping instances in model training, validation, and testing during forecasting.
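A minimal sketch of this preprocessing for an hourly series is given below; the encoder/decoder lengths, the scaling rule, and the variable names are assumptions for illustration, not the paper's exact settings (monthly and quarterly datasets would use month/quarter covariates instead of hour/weekday).

```python
import numpy as np
import pandas as pd

enc_len, dec_len = 168, 24           # assumed encoder/decoder lengths for an hourly series
idx = pd.date_range("2014-01-01", periods=1000, freq="h")
z = pd.Series(np.random.rand(1000), index=idx)

# Time-related covariates derived from the date information, then z-score normalized
cov = pd.DataFrame({
    "hour": idx.hour,                # 0..23
    "weekday": idx.weekday,          # 0..6
    "month": idx.month,              # 1..12
    "age": np.arange(len(idx)),      # temporal distance from the first observation
}, index=idx)
cov = (cov - cov.mean()) / cov.std()

# Scale-adjusted one-lagged observation z_{i,t-1} (a simple per-series scale; the
# paper's exact scaling may differ)
scale = z.mean() + 1.0
lag1 = (z / scale).shift(1).fillna(0.0)

X = np.column_stack([lag1.values, cov.values])

# Non-overlapping instances: the window stride equals the decoder length
windows = [
    (X[s : s + enc_len], z.values[s + enc_len : s + enc_len + dec_len])
    for s in range(0, len(z) - enc_len - dec_len + 1, dec_len)
]
print(len(windows), windows[0][0].shape, windows[0][1].shape)
```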
Appendix D Hyperparameter sets for models
D.1 Synthetic datasets
For the synthetic datasets, the hyperparameters for each model are defined as follows. For the QRF model, the number of estimators is set to 50 or 100 and the maximum depth to 5, 10, or 15. Deep learning-based models share a standardized set of hyperparameters, including a learning rate of 0.005 or 0.01 and item-related embedding dimensions of 1 or 5. The QRNN, SQF-RNN, IQN-RNN, and NQF-RNN models are configured with RNN hidden dimensions of 20 or 60 across three hidden layers. The SQF-RNN model sets the number of knot positions L to 5 or 10. For the NQF-RNN model, the quantile function comprises DNN layers configured as [32, 16] or [8, 4]. The NHITS model uses an aggregation kernel size of [2, 2, 1], [2, 2, 2], or [4, 2, 1] and frequency downsampling of [4, 2, 1], [14, 7, 1], or [28, 14, 1], with three blocks, linear interpolation mode, and Maxpool1d as the pooling mode.
D.2 Real-world datasets
For the real-world datasets, the hyperparameters for each model are as follows. For the QRF model, the number of estimators is set to 50 or 100 and the maximum depth to 5, 10, or 15. Deep learning-based models employ a standard set of hyperparameters, including a learning rate of 0.001 or 0.005 and item-related embedding dimensions of 5 or 20. The QRNN, SQF-RNN, IQN-RNN, and NQF-RNN models have RNN hidden dimensions of 20, 40, or 60 across three hidden layers. The SQF-RNN model sets the number of knot positions L to 5 or 10. For the NQF-RNN model, the quantile function comprises DNN layers configured as [64, 32, 16, 8], [128, 64], [64, 32], or [16, 8]. The NHITS model uses an aggregation kernel size of [2, 2, 1], [2, 2, 2], or [4, 2, 1] and frequency downsampling of [4, 2, 1], [24, 12, 1], or [168, 24, 1], with three blocks, linear interpolation mode, and Maxpool1d as the pooling mode. For each hyperparameter set, the model achieving the best validation performance is selected for the final evaluation.
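For illustration, the NQF-RNN grid for the real-world datasets can be written down as a plain search space; the sketch below enumerates its combinations, and the validate() selection step is hypothetical, standing in for whatever validation routine is used.

```python
from itertools import product

# NQF-RNN search space for the real-world datasets (values copied from the text above)
grid = {
    "learning_rate": [0.001, 0.005],
    "embedding_dim": [5, 20],
    "rnn_hidden_dim": [20, 40, 60],      # three hidden layers in all cases
    "qf_dnn_layers": [[64, 32, 16, 8], [128, 64], [64, 32], [16, 8]],
}

configs = [dict(zip(grid, values)) for values in product(*grid.values())]
print(len(configs))  # 2 * 2 * 3 * 4 = 48 candidate configurations

# Hypothetical selection loop: keep the config with the best validation score
# best = min(configs, key=lambda cfg: validate(cfg))  # validate() is assumed, not shown
```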
Appendix E Visualization of probabilistic forecasting in four real-world datasets
Examples of results for the traffic, M4-hourly, tourism-monthly, and tourism-quarterly datasets are presented in Figs. E1, E2, E3, and E4, respectively.
In panel (a), the solid gray line outside the yellow box represents the conditioning range, which serves as input to the models along with various covariates. Using this input, the models perform multistep-ahead forecasting over the prediction range, depicted by a solid black line. Panels (b) through (j) display the probabilistic forecasting results within the prediction range for the different models. The solid red line denotes the point forecasts, while the dotted red, orange, and yellow lines correspond to the 0.1/0.9, 0.05/0.95, and 0.01/0.99 quantile pairs bounding the prediction intervals, respectively.
Figure E1 suggests that the forecasts across models exhibit comparable patterns but differ in the scale of the \(y\)-axis. Figure E2 shows that, while the time series generally maintain consistent patterns, notable scale differences are also present. Figure E3 indicates that the time series share similar scales but exhibit irregularities. In Fig. E4, the time series predominantly adhere to consistent historical patterns.
In terms of performance, the GARCH and QRF models demonstrate inadequate predictive accuracy. QRNN encounters challenges in forecasting beyond the median range, while DeepAR tends to produce relatively broader prediction intervals compared to other probabilistic time series models. In contrast, other deep learning-based probabilistic time series models exhibit more effective pattern learning and generate reasonable prediction intervals. Notably, NQF-RNN achieves the narrowest prediction intervals across all datasets, effectively capturing the evolution of the original time series.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Song, J., Chang, W. & Song, J.W. NQF-RNN: probabilistic forecasting via neural quantile function-based recurrent neural networks. Appl Intell 55, 183 (2025). https://doi.org/10.1007/s10489-024-06077-7