DOI: 10.1145/3468791.3468802
short-paper

Automatic Selection of Analytic Platforms with ASAP-DM

Published: 11 August 2021

ABSTRACT

The plethora of available analytic platforms makes it increasingly difficult to select the most appropriate platform for a given data mining task and for datasets with varying characteristics. Novice analysts in particular struggle to keep up with the latest technical developments. In this demo, we present the ASAP-DM framework. ASAP-DM automatically selects a well-performing analytic platform for a given data mining task via an intuitive web interface, which especially supports novice analysts. The takeaways for demo attendees are: (1) a good understanding of the challenges posed by various data mining workloads and dataset characteristics and of their effects on the selection of analytic platforms, (2) useful insights into how ASAP-DM works internally, and (3) how to benefit from ASAP-DM for exploratory data analysis.


  • Published in

    SSDBM '21: Proceedings of the 33rd International Conference on Scientific and Statistical Database Management
    July 2021
    275 pages
    ISBN: 9781450384131
    DOI: 10.1145/3468791

    Copyright © 2021 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Qualifiers

    • short-paper
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate: 56 of 146 submissions, 38%