DOI: 10.1145/3459637.3482315
research-article

AdaRNN: Adaptive Learning and Forecasting of Time Series

Published: 30 October 2021

ABSTRACT

Time series have wide applications in the real world and are known to be difficult to forecast. Because their statistical properties change over time, their distribution also changes temporally, which causes a severe distribution shift problem for existing methods. However, modeling time series from the distribution perspective remains unexplored. In this paper, we term this problem Temporal Covariate Shift (TCS). This paper proposes Adaptive RNNs (AdaRNN) to tackle the TCS problem by building an adaptive model that generalizes well on unseen test data. AdaRNN is sequentially composed of two novel algorithms. First, we propose Temporal Distribution Characterization to better characterize the distribution information in the time series (TS). Second, we propose Temporal Distribution Matching to reduce the distribution mismatch in the TS and learn an adaptive TS model. AdaRNN is a general framework into which flexible distribution distances can be integrated. Experiments on human activity recognition, air quality prediction, and financial analysis show that AdaRNN outperforms the latest methods, improving classification accuracy by 2.6% and significantly reducing RMSE by 9.0%. We also show that the temporal distribution matching algorithm can be extended to the Transformer architecture to boost its performance.
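To make the idea of a distribution distance between periods of a series concrete, the following is a minimal sketch, not the AdaRNN algorithm itself. It assumes a synthetic series with a mean shift halfway through and uses a biased squared maximum mean discrepancy (MMD) estimate with an RBF kernel as one possible distance. The function name `rbf_mmd2`, the `gamma` value, and the data are all hypothetical choices made for illustration.

```python
import numpy as np

def rbf_mmd2(x, y, gamma=1.0):
    """Biased estimate of squared MMD between samples x and y (RBF kernel)."""
    # Treat each sample as a row vector so 1-D series also work.
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)

    def kernel(a, b):
        # Pairwise squared Euclidean distances, then the RBF kernel.
        sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-gamma * sq)

    return kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean()

rng = np.random.default_rng(0)
# Assumed toy data: the mean drifts from 0 to 2 halfway through the series.
series = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(2.0, 1.0, 200)])

first, second = series[:200], series[200:]
same_period = rbf_mmd2(first[:100], first[100:])   # two halves of one period
across_periods = rbf_mmd2(first, second)           # before vs. after the shift
print(f"within one period: {same_period:.4f}")
print(f"across periods:    {across_periods:.4f}")
```

In this toy setting the distance across the two periods should be clearly larger than the distance within a single period, which is the kind of mismatch that a distribution-matching objective would try to reduce. Other distances could be swapped in for `rbf_mmd2` without changing the rest of the sketch.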

Supplemental Material

adarnn-video-english.mp4 (MP4, 208.7 MB)

Published in

CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
October 2021, 4966 pages
ISBN: 9781450384469
DOI: 10.1145/3459637

Copyright © 2021 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Overall acceptance rate: 1,861 of 8,427 submissions (22%)
