AdaRNN: Adaptive Learning and Forecasting of Time Series

ABSTRACT
Time series data are ubiquitous in the real world and are notoriously difficult to forecast. Because their statistical properties change over time, their distributions also shift temporally, which causes a severe distribution shift problem for existing methods. Yet modeling time series from this distributional perspective remains unexplored. In this paper, we term this problem Temporal Covariate Shift (TCS) and propose Adaptive RNNs (AdaRNN) to tackle it by building an adaptive model that generalizes well to unseen test data. AdaRNN is composed of two novel algorithms applied in sequence. First, we propose Temporal Distribution Characterization to better characterize the distribution information in the time series. Second, we propose Temporal Distribution Matching to reduce the distribution mismatch in the time series and learn an adaptive model. AdaRNN is a general framework into which flexible distribution distances can be integrated. Experiments on human activity recognition, air quality prediction, and financial analysis show that AdaRNN outperforms the latest methods by 2.6% in classification accuracy and significantly reduces RMSE by 9.0%. We also show that the temporal distribution matching algorithm can be extended to the Transformer architecture to boost its performance.
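As a concrete illustration of one distribution distance that such a matching framework can plug in, below is a minimal NumPy sketch of the (biased) squared maximum mean discrepancy (MMD) with an RBF kernel, a standard choice for comparing two sets of samples, e.g. hidden representations from two time-series segments. The function names and the fixed bandwidth are illustrative assumptions, not taken from the paper's code.

```python
import numpy as np

def rbf_kernel(x, y, gamma=1.0):
    # Pairwise RBF kernel matrix: k(a, b) = exp(-gamma * ||a - b||^2)
    # for every row a of x and row b of y.
    sq_dists = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

def mmd2(x, y, gamma=1.0):
    # Biased estimate of the squared maximum mean discrepancy between
    # the samples x (n, d) and y (m, d); zero iff the kernel mean
    # embeddings of the two empirical distributions coincide.
    return (rbf_kernel(x, x, gamma).mean()
            + rbf_kernel(y, y, gamma).mean()
            - 2.0 * rbf_kernel(x, y, gamma).mean())
```

In a distribution-matching setup, a term like `mmd2` between representations of different time periods would typically be added to the forecasting loss so that training penalizes temporal distribution mismatch.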