DOI: 10.1145/3468791.3468806
short-paper

DJEnsemble: a Cost-Based Selection and Allocation of a Disjoint Ensemble of Spatio-temporal Models

Published: 11 August 2021

ABSTRACT

Consider a set of black-box models – each of them independently trained on a different dataset – answering the same predictive spatio-temporal query. Being built in isolation, each model traverses its own life-cycle until it is deployed to production, learning data patterns from different datasets and facing independent hyper-parameter tuning. In order to answer the query, the set of black-box predictors has to be ensembled and allocated to the spatio-temporal query region. However, computing an optimal ensemble is a complex task that involves selecting the appropriate models and defining an effective allocation strategy that maps the models to the query region. In this paper we present DJEnsemble, a cost-based strategy for the automatic selection and allocation of a disjoint ensemble of black-box predictors to answer predictive spatio-temporal queries. We conduct a set of extensive experiments that evaluate DJEnsemble and highlight its efficiency, selecting model ensembles that are almost as efficient as the optimal solution. When compared against the traditional ensemble approach, DJEnsemble achieves up to 4X improvement in execution time and almost 9X improvement in prediction accuracy.
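
To make the idea of a disjoint ensemble concrete, the following is a minimal sketch, not the selection procedure from the paper. It assumes the query region has already been split into tiles and that some cost estimate is available for running each model on each tile. The abstract does not specify the cost function, so `toy_cost`, the tile names, the model names, and the greedy per-tile minimum are all hypothetical stand-ins. The point it illustrates is that each tile is served by exactly one model, so models that never win a tile are not executed at all.

```python
from typing import Callable, Dict, Hashable, Iterable, List


def allocate_disjoint(
    tiles: Iterable[Hashable],
    models: List[str],
    cost: Callable[[str, Hashable], float],
) -> Dict[Hashable, str]:
    """Map each tile to exactly one model: the one with the lowest estimated cost.

    Illustrative greedy stand-in (an assumption), not the paper's algorithm.
    """
    return {tile: min(models, key=lambda m: cost(m, tile)) for tile in tiles}


# Hypothetical toy data: a query region split into four tiles, with made-up
# cost estimates for three black-box models on each tile.
TOY_COST = {
    ("m1", "NW"): 0.2, ("m1", "NE"): 0.9, ("m1", "SW"): 0.5, ("m1", "SE"): 0.7,
    ("m2", "NW"): 0.6, ("m2", "NE"): 0.1, ("m2", "SW"): 0.4, ("m2", "SE"): 0.8,
    ("m3", "NW"): 0.8, ("m3", "NE"): 0.7, ("m3", "SW"): 0.9, ("m3", "SE"): 0.75,
}


def toy_cost(model: str, tile: Hashable) -> float:
    """Look up the made-up cost of running `model` on `tile`."""
    return TOY_COST[(model, tile)]


allocation = allocate_disjoint(["NW", "NE", "SW", "SE"], ["m1", "m2", "m3"], toy_cost)

# Models that won at least one tile; the others are never invoked.
selected = sorted(set(allocation.values()))

# Total estimated cost of the disjoint allocation.
total_cost = sum(toy_cost(model, tile) for tile, model in allocation.items())

print(allocation)
print(selected, total_cost)
```

In this toy setting only `m1` and `m2` are selected, whereas a traditional ensemble would run all three models over the whole region and combine their outputs.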

  • Published in

    SSDBM '21: Proceedings of the 33rd International Conference on Scientific and Statistical Database Management
    July 2021
    275 pages
    ISBN: 9781450384131
    DOI: 10.1145/3468791

    Copyright © 2021 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • short-paper
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate: 56 of 146 submissions, 38%
