Multimodal AutoML for Image, Text and Tabular Data

Published: 14 August 2022

Abstract

Automated machine learning (AutoML) offers the promise of translating raw data into accurate predictions without significant human effort, expertise, or manual experimentation. In this lecture-style tutorial, we demonstrate the fundamental techniques that power multimodal AutoML. Unlike most AutoML systems, which focus on tabular tasks with categorical and numerical features, we consider supervised learning tasks over various types of data, including tabular features, text, and images, as well as their combinations. Rather than giving technical descriptions of how individual ML models work, we emphasize how to best use models within an overall ML pipeline that takes in raw training data and outputs predictions for test data. A major focus of our tutorial is automatically building and training deep learning models, which are powerful yet cumbersome to manage manually; hardly any educational material describes their successful automation. Each topic covered in the tutorial is accompanied by a hands-on Jupyter notebook that implements best practices (available on GitHub before and after the tutorial). Most of the code is adapted from AutoGluon (https://auto.gluon.ai/), a recent open-source AutoML toolkit that is both state-of-the-art and easy to use.
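The pipeline pattern the abstract describes (raw training data in, predictions for test data out) can be sketched in a few lines. The `ToyPredictor` class below is a hypothetical stand-in written for illustration, not AutoGluon's implementation; it mimics the AutoML convention of inferring the problem type from the label column and then fitting a trivial baseline. With AutoGluon itself, the equivalent workflow is roughly `TabularPredictor(label="label").fit(train_data).predict(test_data)`.

```python
from collections import Counter
from statistics import mean


class ToyPredictor:
    """Minimal AutoML-style predictor: infers the problem type from the
    label column, then fits a trivial baseline (majority class or mean).
    A real system would train and ensemble many models here."""

    def __init__(self, label):
        self.label = label
        self.problem_type = None
        self._prediction = None

    def fit(self, train_rows):
        labels = [row[self.label] for row in train_rows]
        # Heuristic problem-type inference, as AutoML systems do:
        # many distinct numeric labels -> regression, else classification.
        numeric = all(isinstance(y, (int, float)) and not isinstance(y, bool)
                      for y in labels)
        if numeric and len(set(labels)) > 10:
            self.problem_type = "regression"
            self._prediction = mean(labels)
        else:
            self.problem_type = "classification"
            self._prediction = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, test_rows):
        # Baseline: predict the same value for every test row.
        return [self._prediction for _ in test_rows]


# Multimodal rows: a text field, a numeric field, and a label column.
train = [{"text": "good", "price": 10, "label": "pos"},
         {"text": "bad", "price": 5, "label": "neg"},
         {"text": "great", "price": 12, "label": "pos"}]
predictor = ToyPredictor(label="label").fit(train)
print(predictor.problem_type)                            # classification
print(predictor.predict([{"text": "ok", "price": 7}]))   # ['pos']
```

The design point mirrors the tutorial's emphasis: the user supplies raw rows and a label name, and everything else (problem-type detection, model choice, training) happens inside `fit`.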



Published In

KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2022
5033 pages
ISBN:9781450393850
DOI:10.1145/3534678
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. artificial intelligence
  2. autogluon
  3. automated machine learning
  4. automl
  5. bagging
  6. benchmarking
  7. boosting
  8. computer vision
  9. deep learning
  10. ensembling
  11. fine-tuning
  12. forecasting
  13. fusion
  14. hyperparameter tuning
  15. machine learning
  16. multimodal
  17. natural language processing
  18. open source
  19. openml
  20. python
  21. software
  22. stacking
  23. tabular data
  24. transformers

Qualifiers

  • Abstract

Conference

KDD '22
Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%


Cited By

  • (2024) Towards efficient AutoML: a pipeline synthesis approach leveraging pre-trained transformers for multimodal data. Machine Learning 113:9, 7011–7053. DOI: 10.1007/s10994-024-06568-1
  • (2024) Human-In-The-Loop Based Success Rate Prediction for Medical Crowdfunding. Artificial Intelligence Applications and Innovations, 91–104. DOI: 10.1007/978-3-031-63211-2_8
  • (2023) Predicting the Rectal Temperature of Dairy Cows Using Infrared Thermography and Multimodal Machine Learning. Applied Sciences 13:20, 11416. DOI: 10.3390/app132011416
  • (2023) Ensemble Learning Traffic Model for Sofia: A Case Study. Applied Sciences 13:8, 4678. DOI: 10.3390/app13084678
  • (2023) SiWare. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 7115–7118. DOI: 10.24963/ijcai.2023/829
  • (2023) Improving Recommender Systems Through the Automation of Design Decisions. Proceedings of the 17th ACM Conference on Recommender Systems, 1332–1338. DOI: 10.1145/3604915.3608877