DOI: 10.1145/3297662.3365807
research-article

Facilitating and Managing Machine Learning and Data Analysis Tasks in Big Data Environments using Web and Microservice Technologies

Published: 10 January 2020

ABSTRACT

Driven by the rapid advance of machine learning across a wide range of application areas, the need for machine learning frameworks that are both effective and easy for novices to use has increased dramatically. Furthermore, building machine learning models in big data environments remains a great challenge. In this paper, we tackle these challenges by introducing a new generic framework that facilitates training, testing, managing, storing, and retrieving machine learning models in the context of big data. The framework builds on a powerful big data software stack and a microservice architecture to provide a fully manageable and highly scalable solution. A highly configurable user interface gives the user the ability to easily train, test, and manage machine learning models. Moreover, the framework automatically indexes models and allows flexible exploration of them in the visual interface. The performance of the new framework is evaluated on state-of-the-art machine learning algorithms: the results show that model storage and retrieval, together with an acceptably low overhead, make this an efficient approach to facilitating machine learning in big data environments.

            • Published in

              ACM Other conferences
              MEDES '19: Proceedings of the 11th International Conference on Management of Digital EcoSystems
              November 2019
              350 pages
              ISBN:9781450362382
              DOI:10.1145/3297662

              Copyright © 2019 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Qualifiers

              • research-article
              • Research
              • Refereed limited

              Acceptance Rates

              MEDES '19 Paper Acceptance Rate: 41 of 102 submissions, 40%
              Overall Acceptance Rate: 267 of 682 submissions, 39%
