DOI: 10.1145/3297662.3365807
research-article

Facilitating and Managing Machine Learning and Data Analysis Tasks in Big Data Environments using Web and Microservice Technologies

Published: 10 January 2020

Abstract

Driven by the great advances of machine learning across a wide range of application areas, the need for machine learning frameworks that are both effective and easily usable by novices has increased dramatically. Furthermore, building machine learning models in big data environments remains a major challenge. In this paper, we tackle these challenges by introducing a new generic framework that facilitates training, testing, managing, storing, and retrieving machine learning models in the context of big data. The framework builds on a powerful big data software stack and a microservice architecture to provide a fully manageable and highly scalable solution. A highly configurable user interface lets users easily train, test, and manage machine learning models. Moreover, the framework automatically indexes models and allows flexible exploration of them in the visual interface. The performance of the new framework is evaluated on state-of-the-art machine learning algorithms: the results show that models are stored and retrieved efficiently and that the framework incurs an acceptably low overhead, demonstrating an efficient approach to facilitating machine learning in big data environments.
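
The store, index, and retrieve cycle mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not describe the framework's API, so `ModelRecord`, `ModelRegistry`, and every field and method name below are hypothetical, and an in-memory dictionary stands in for the big data stack and microservices the paper uses.

```python
import pickle
from dataclasses import dataclass, field


@dataclass
class ModelRecord:
    """Metadata kept alongside a serialized model (all fields are hypothetical)."""
    model_id: int
    algorithm: str
    dataset: str
    metrics: dict
    blob: bytes = field(repr=False)


class ModelRegistry:
    """In-memory stand-in for a model store with a searchable metadata index."""

    def __init__(self):
        self._records = {}
        self._next_id = 1

    def store(self, model, algorithm, dataset, metrics):
        """Serialize a trained model and index it by its metadata."""
        record = ModelRecord(
            self._next_id, algorithm, dataset, dict(metrics), pickle.dumps(model)
        )
        self._records[record.model_id] = record
        self._next_id += 1
        return record.model_id

    def retrieve(self, model_id):
        """Deserialize and return a previously stored model."""
        return pickle.loads(self._records[model_id].blob)

    def search(self, **criteria):
        """Return records whose metadata fields equal all given criteria."""
        return [
            r for r in self._records.values()
            if all(getattr(r, key) == value for key, value in criteria.items())
        ]


registry = ModelRegistry()
# A plain dict stands in for a trained model in this sketch.
model_id = registry.store(
    {"weights": [0.2, 0.8]}, "logistic_regression", "demo", {"accuracy": 0.9}
)
restored = registry.retrieve(model_id)
matches = registry.search(algorithm="logistic_regression")
```

In a microservice setting, each of `store`, `retrieve`, and `search` would plausibly sit behind its own HTTP endpoint, with the dictionary replaced by a database and a distributed file system; that mapping is an assumption, not something the abstract states.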



Published In

MEDES '19: Proceedings of the 11th International Conference on Management of Digital EcoSystems
November 2019
350 pages
ISBN:9781450362382
DOI:10.1145/3297662

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Big Data
  2. Data Analytic
  3. Machine Learning
  4. Microservice
  5. Web-based Applications

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

MEDES '19

Acceptance Rates

MEDES '19 Paper Acceptance Rate 41 of 102 submissions, 40%;
Overall Acceptance Rate 267 of 682 submissions, 39%

Cited By

  • (2023) AI-Enabled Secure Microservices in Edge Computing: Opportunities and Challenges. IEEE Transactions on Services Computing 16:2 (1485-1504). DOI: 10.1109/TSC.2022.3155447. Online publication date: 1-Mar-2023
  • (2021) A meta learning approach for automating model selection in big data environments using microservice and container virtualization technologies. Internet of Things (100432). DOI: 10.1016/j.iot.2021.100432. Online publication date: Aug-2021
  • (2020) CMS: A Continuous Machine-Learning and Serving Platform for Industrial Big Data. Future Internet 12:6 (102). DOI: 10.3390/fi12060102. Online publication date: 10-Jun-2020
  • (2020) A Meta Learning Approach for Automating Model Selection in Big Data Environments using Microservice and Container Virtualization Technologies. Proceedings of the 12th International Conference on Management of Digital EcoSystems (84-91). DOI: 10.1145/3415958.3433072. Online publication date: 2-Nov-2020
  • (2020) Facilitating and Managing Machine Learning and Data Analysis Tasks in Big Data Environments Using Web and Microservice Technologies. Transactions on Large-Scale Data- and Knowledge-Centered Systems XLV (132-171). DOI: 10.1007/978-3-662-62308-4_6. Online publication date: 20-Sep-2020
