AutoML Meets Time Series Regression: Design and Analysis of the AutoSeries Challenge

  • Conference paper
Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track (ECML PKDD 2021)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 12979))

Abstract

Analyzing time series better with limited human effort is of interest to academia and industry. Driven by business scenarios, we organized the first Automated Time Series Regression challenge (AutoSeries) for the WSDM Cup 2020. We present its design, analysis, and post-hoc experiments. The code submission requirement precluded any manual intervention by participants, testing the automated machine learning capabilities of solutions across many datasets, under hardware and time limitations. We prepared 10 datasets from diverse application domains (sales, power consumption, air quality, traffic, and parking), featuring missing data, mixed continuous and categorical variables, and various sampling rates. Each dataset was split into a training and a test sequence (which was streamed, allowing models to continuously adapt). The setting of “time series regression” differs from classical forecasting in that covariates at the present time are known. Participants made great strides in tackling this AutoSeries problem, as demonstrated by the jump in performance from the sample submission and by post-hoc comparisons with AutoGluon. Simple yet effective methods were used, based on feature engineering, LightGBM, and random search hyper-parameter tuning, addressing all aspects of the challenge. Our post-hoc analyses revealed that providing additional time did not yield significant improvements. The winners’ code was open-sourced (https://www.4paradigm.com/competition/autoseries2020).
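To make the "time series regression" setting concrete, here is a minimal sketch of how a feature row could be assembled for each time step: the covariates observed at the present step are combined with a few lagged values of the target. This is an illustration only, not the organizers' or the winners' code; the function name `make_regression_rows`, the `n_lags` parameter, and the toy data are assumptions introduced for this example.

```python
from typing import List, Sequence, Tuple


def make_regression_rows(
    covariates: Sequence[Sequence[float]],
    targets: Sequence[float],
    n_lags: int,
) -> Tuple[List[List[float]], List[float]]:
    """Build (features, label) pairs for time series regression.

    The feature row for step t holds the covariates observed at t
    (known at prediction time in this setting) followed by the
    n_lags most recent past targets y[t-1], ..., y[t-n_lags].
    Hypothetical helper, written for illustration only.
    """
    if len(covariates) != len(targets):
        raise ValueError("covariates and targets must have equal length")
    features: List[List[float]] = []
    labels: List[float] = []
    # The first n_lags steps are skipped: they lack a full lag history.
    for t in range(n_lags, len(targets)):
        past = [targets[t - k] for k in range(1, n_lags + 1)]
        features.append(list(covariates[t]) + past)
        labels.append(targets[t])
    return features, labels


# Toy data (made up): one covariate per step and a target series.
x = [[0.0], [1.0], [2.0], [3.0], [4.0]]
y = [10.0, 11.0, 13.0, 16.0, 20.0]
rows, labels = make_regression_rows(x, y, n_lags=2)
```

In classical forecasting, the row for step t would have to omit `covariates[t]`, since present-time covariates would not be available. Rows built this way could then be passed to any tabular regressor, such as a gradient boosting model.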

Notes

  1. https://archive.physionet.org/physiobank/database/santa-fe/

  2. http://www.neural-forecasting-competition.com/

  3. https://www.kaggle.com/c/m5-forecasting-accuracy

  4. https://www.kaggle.com/c/web-traffic-time-series-forecasting

  5. http://automl.chalearn.org, http://autodl.chalearn.org

  6. https://www.automl.ai/competitions/3

  7. https://autodl.lri.fr/competitions/64

  8. In some application domains (not considered in this paper), even future (\( \{ t+1, \cdots , t+t_{max} \} \)) values of the covariates may be considered. An example would be “simultaneous translation” with a small lag.

  9. https://www.kaggle.com/c/web-traffic-time-series-forecasting

  10. https://doc.dataiku.com/dss/latest/time-series/data-formatting.html

  11. https://autodl.lri.fr/

  12. https://hub.docker.com/r/vergilgxw/autotable

  13. https://autodl.lri.fr/competitions/149#results

  14. https://keras-team.github.io/keras-tuner/

References

  1. Alexandrov, A., et al.: GluonTS: probabilistic and neural time series modeling in Python. J. Mach. Learn. Res. 21(116), 1–6 (2020)

  2. Erickson, N., et al.: AutoGluon-tabular: robust and accurate AutoML for structured data (2020)

  3. Hutter, F., Kotthoff, L., Vanschoren, J. (eds.): Automated Machine Learning. Methods, Systems, Challenges. The Springer Series on Challenges in Machine Learning. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-05318-5

  4. Hyndman, R.J., Athanasopoulos, G. (eds.): Forecasting: principles and practice. OTexts (2021). https://otexts.com/fpp3/. Accessed 25 Mar 2021

  5. Jin, H., Song, Q., Hu, X.: Auto-Keras: an efficient neural architecture search system. In: KDD (2019)

  6. Kanter, J.M., Veeramachaneni, K.: Deep feature synthesis: towards automating data science endeavors. In: IEEE International Conference on Data Science and Advanced Analytics, DSAA (2015)

  7. Ke, G., et al.: LightGBM: a highly efficient gradient boosting decision tree. In: Advances in Neural Information Processing Systems (2017)

  8. Lai, G., Chang, W., Yang, Y., Liu, H.: Modeling long- and short-term temporal patterns with deep neural networks. In: SIGIR (2018)

  9. Lim, B., Zohren, S.: Time series forecasting with deep learning: a survey (2020)

  10. Liu, Z., et al.: Towards automated computer vision: analysis of the AutoCV challenges 2019. Pattern Recogn. Lett. 135, 196–203 (2020)

  11. Tan, C.W., Bergmeir, C., Petitjean, F., Webb, G.I.: Time series extrinsic regression. Data Min. Knowl. Disc. 35(3), 1032–1060 (2021). https://doi.org/10.1007/s10618-021-00745-9

  12. Taylor, S.J., Letham, B.: Forecasting at scale. PeerJ Prepr. 5, e3190v2 (2017)

  13. Wang, L., Chen, J., Marathe, M.: DEFSI: deep learning based epidemic forecasting with synthetic information. In: AAAI (2019)

  14. Wang, Z., Yan, W., Oates, T.: Time series classification from scratch with deep neural networks: a strong baseline. In: International Joint Conference on Neural Networks (2017)

  15. Yao, Q., et al.: Taking human out of learning applications: a survey on automated machine learning (2018)

Author information

Corresponding author

Correspondence to Zhen Xu.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Xu, Z., Tu, WW., Guyon, I. (2021). AutoML Meets Time Series Regression: Design and Analysis of the AutoSeries Challenge. In: Dong, Y., Kourtellis, N., Hammer, B., Lozano, J.A. (eds) Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track. ECML PKDD 2021. Lecture Notes in Computer Science, vol 12979. Springer, Cham. https://doi.org/10.1007/978-3-030-86517-7_3

  • DOI: https://doi.org/10.1007/978-3-030-86517-7_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86516-0

  • Online ISBN: 978-3-030-86517-7

  • eBook Packages: Computer Science, Computer Science (R0)
