Multimodal AutoML for Image, Text and Tabular Data

Published: 14 August 2022

Abstract

Automated machine learning (AutoML) offers the promise of translating raw data into accurate predictions without significant human effort, expertise, or manual experimentation. In this lecture-style tutorial, we demonstrate the fundamental techniques that power multimodal AutoML. Unlike most AutoML systems, which focus on tabular tasks with categorical and numerical features, we consider supervised learning tasks over various types of data, including tabular features, text, and images, as well as their combinations. Rather than giving technical descriptions of how individual ML models work, we emphasize how to best use models within an overall ML pipeline that takes in raw training data and outputs predictions for test data. A major focus of our tutorial is automatically building and training deep learning models, which are powerful yet cumbersome to manage manually; hardly any educational material describes their successful automation. Each topic covered in the tutorial is accompanied by a hands-on Jupyter notebook that implements best practices (available on GitHub before and after the tutorial). Most of the code is adapted from AutoGluon (https://auto.gluon.ai/), a recent open-source AutoML toolkit that is both state-of-the-art and easy to use.
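The pipeline pattern the abstract describes (raw training data in, predictions for test data out) can be sketched in a few lines. The `ToyPredictor` class below is a hypothetical stand-in written for illustration, not AutoGluon's implementation; it mimics the AutoML convention of inferring the problem type from the label column and then fitting a trivial baseline. With AutoGluon itself, the equivalent workflow is roughly `TabularPredictor(label="label").fit(train_data).predict(test_data)`.

```python
from collections import Counter
from statistics import mean


class ToyPredictor:
    """Minimal AutoML-style predictor: infers the problem type from the
    label column, then fits a trivial baseline (majority class or mean).
    A real system would train and ensemble many models here."""

    def __init__(self, label):
        self.label = label
        self.problem_type = None
        self._prediction = None

    def fit(self, train_rows):
        labels = [row[self.label] for row in train_rows]
        # Heuristic problem-type inference, as AutoML systems do:
        # many distinct numeric labels -> regression, else classification.
        numeric = all(isinstance(y, (int, float)) and not isinstance(y, bool)
                      for y in labels)
        if numeric and len(set(labels)) > 10:
            self.problem_type = "regression"
            self._prediction = mean(labels)
        else:
            self.problem_type = "classification"
            self._prediction = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, test_rows):
        # Baseline: predict the same value for every test row.
        return [self._prediction for _ in test_rows]


# Multimodal rows: a text field, a numeric field, and a label column.
train = [{"text": "good", "price": 10, "label": "pos"},
         {"text": "bad", "price": 5, "label": "neg"},
         {"text": "great", "price": 12, "label": "pos"}]
predictor = ToyPredictor(label="label").fit(train)
print(predictor.problem_type)                            # classification
print(predictor.predict([{"text": "ok", "price": 7}]))   # ['pos']
```

The design point mirrors the tutorial's emphasis: the user supplies raw rows and a label name, and everything else (problem-type detection, model choice, training) happens inside `fit`.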



Published In

KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2022
5033 pages
ISBN:9781450393850
DOI:10.1145/3534678
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. artificial intelligence
  2. autogluon
  3. automated machine learning
  4. automl
  5. bagging
  6. benchmarking
  7. boosting
  8. computer vision
  9. deep learning
  10. ensembling
  11. fine-tuning
  12. forecasting
  13. fusion
  14. hyperparameter tuning
  15. machine learning
  16. multimodal
  17. natural language processing
  18. open source
  19. openml
  20. python
  21. software
  22. stacking
  23. tabular data
  24. transformers

Qualifiers

  • Abstract

Conference

KDD '22
Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%


Cited By

  • (2024) Towards efficient AutoML: a pipeline synthesis approach leveraging pre-trained transformers for multimodal data. Machine Learning 113:9, 7011–7053. DOI: 10.1007/s10994-024-06568-1
  • (2024) Human-In-The-Loop Based Success Rate Prediction for Medical Crowdfunding. Artificial Intelligence Applications and Innovations, 91–104. DOI: 10.1007/978-3-031-63211-2_8
  • (2023) Predicting the Rectal Temperature of Dairy Cows Using Infrared Thermography and Multimodal Machine Learning. Applied Sciences 13:20, 11416. DOI: 10.3390/app132011416
  • (2023) Ensemble Learning Traffic Model for Sofia: A Case Study. Applied Sciences 13:8, 4678. DOI: 10.3390/app13084678
  • (2023) SiWare. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 7115–7118. DOI: 10.24963/ijcai.2023/829
  • (2023) Improving Recommender Systems Through the Automation of Design Decisions. Proceedings of the 17th ACM Conference on Recommender Systems, 1332–1338. DOI: 10.1145/3604915.3608877