short-paper

Automatic Selection of Analytic Platforms with ASAP-DM

Authors:
Manuel Fritz, University of Stuttgart, Germany
Gang Shao, SAP SE Shanghai, China
Holger Schwarz, University of Stuttgart, Germany

SSDBM '21: Proceedings of the 33rd International Conference on Scientific and Statistical Database Management, July 2021, Pages 220–225. https://doi.org/10.1145/3468791.3468802

Published: 11 August 2021

ABSTRACT

The plethora of available analytic platforms makes it increasingly difficult to select the most appropriate platform for a given data mining task and for datasets with varying characteristics. Novice analysts in particular struggle to keep up with the latest technical developments. In this demo, we present the ASAP-DM framework. ASAP-DM automatically selects a well-performing analytic platform for a given data mining task via an intuitive web interface, and thus especially supports novice analysts. The take-aways for demo attendees are: (1) a good understanding of the challenges posed by various data mining workloads and dataset characteristics, and of their effects on the selection of analytic platforms, (2) useful insights into how ASAP-DM works internally, and (3) an understanding of how to benefit from ASAP-DM for exploratory data analysis.

Published in
SSDBM '21: Proceedings of the 33rd International Conference on Scientific and Statistical Database Management
July 2021
275 pages
ISBN: 9781450384131
DOI: 10.1145/3468791
Editors:
Qiang Zhu, University of Michigan - Dearborn, USA
Xingquan (Hill) Zhu, Florida Atlantic University, USA
Yicheng Tu, University of South Florida, USA
Zichen (Frank) Xu, Nanchang University, China
Anand Kumar, Amazon Inc., USA
Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
analytic platform
data mining
platform selection
Qualifiers
- short-paper
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 56 of 146 submissions, 38%