DOI: 10.1145/3459637.3482315
research-article

AdaRNN: Adaptive Learning and Forecasting of Time Series

Published: 30 October 2021

Abstract

Time series are widely used in real-world applications and are known to be difficult to forecast. Because their statistical properties change over time, their distribution also changes over time, which causes a severe distribution shift problem for existing methods. However, modeling time series from the distribution perspective remains unexplored. In this paper, we term this problem Temporal Covariate Shift (TCS). We propose Adaptive RNNs (AdaRNN) to tackle the TCS problem by building an adaptive model that generalizes well to unseen test data. AdaRNN is sequentially composed of two novel algorithms. First, we propose Temporal Distribution Characterization to better characterize the distribution information in the time series (TS). Second, we propose Temporal Distribution Matching to reduce the distribution mismatch in the TS and thereby learn an adaptive TS model. AdaRNN is a general framework into which flexible distribution distances can be integrated. Experiments on human activity recognition, air quality prediction, and financial analysis show that AdaRNN outperforms the latest methods, improving classification accuracy by 2.6% and reducing RMSE by 9.0%. We also show that the temporal distribution matching algorithm can be extended to the Transformer structure to boost its performance.
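The two ideas named in the abstract can be illustrated with a small NumPy sketch. It is not the paper's implementation: the abstract does not say which distribution distance or splitting procedure is used, so the RBF-kernel maximum mean discrepancy (MMD), the function names `rbf_mmd2`, `most_dissimilar_split`, and `matching_penalty`, the `gamma` parameter, and the synthetic series are all assumptions made for illustration. The sketch picks the boundary that makes two periods most dissimilar (characterization) and computes a penalty that shrinks as the hidden representations of two periods become more alike (matching).

```python
import numpy as np


def rbf_mmd2(x, y, gamma=1.0):
    """Biased estimate of squared MMD between samples x (n, d) and y (m, d)."""
    def kernel(a, b):
        # Pairwise squared Euclidean distances, clipped at 0 for numerical safety.
        sq = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
        return np.exp(-gamma * np.maximum(sq, 0.0))
    return kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean()


def most_dissimilar_split(series, candidates, gamma=1.0):
    """Hypothetical helper: choose the boundary whose two periods differ the most."""
    scores = [rbf_mmd2(series[:c], series[c:], gamma) for c in candidates]
    return candidates[int(np.argmax(scores))]


def matching_penalty(hidden_a, hidden_b, gamma=1.0):
    """Hypothetical helper: sum of per-step distances between hidden states.

    Both inputs have shape (steps, batch, dim); an unweighted sum is assumed.
    """
    return sum(rbf_mmd2(a, b, gamma) for a, b in zip(hidden_a, hidden_b))


# Synthetic series whose mean shifts from 0 to 5 at index 100.
rng = np.random.default_rng(0)
series = np.concatenate([rng.normal(0.0, 1.0, (100, 1)),
                         rng.normal(5.0, 1.0, (100, 1))])
boundary = most_dissimilar_split(series, candidates=[50, 100, 150])
print("chosen boundary:", boundary)
```

In a training loop, a penalty of this kind would typically be added to the prediction loss with a trade-off weight; that weighting is likewise an assumption here, not something the abstract specifies.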

Supplementary Material

MP4 File (adarnn-video-english.mp4)
Video presentation.

Published In

CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
October 2021
4966 pages
ISBN: 9781450384469
DOI: 10.1145/3459637

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. domain adaptation
  2. domain generalization
  3. time series
  4. transfer learning

Qualifiers

  • Research-article

Conference

CIKM '21

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
