Abstract
Automated machine learning (AutoML) aims to automatically build an appropriate learning model for a given dataset. Despite recent progress in meta-learning for finding good instantiations of AutoML frameworks, collecting sufficient high-quality meta-data remains difficult and time-consuming. We therefore propose a novel method named Meta-data Augmentation based Search Strategy (MDASS) for AutoML model selection, composed mainly of a Meta-GAN Surrogate model (MetaGAN) and a Self-Adaptive Meta-model (SAM). MetaGAN employs a Generative Adversarial Network as a surrogate model to generate effective meta-data from the limited meta-data available, which alleviates the dilemma of meta-overfitting in meta-learning. Based on the augmented meta-data, SAM self-adaptively builds a multi-objective meta-model that selects algorithms with a proper trade-off between learning performance and computational budget. Furthermore, for new datasets, MDASS combines the promising algorithms with hyperparameter optimization to perform automated model selection under a time constraint. Finally, experiments are conducted on various classification datasets from OpenML with algorithms from scikit-learn. The results show that GANs are promising to incorporate into AutoML and that MDASS outperforms competing approaches under a time budget.
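The performance-versus-budget trade-off that SAM targets can be illustrated with a minimal selection sketch. This is not the paper's actual formulation: the linear runtime penalty, the `weight` parameter, and the candidate algorithm list below are all illustrative assumptions.

```python
# Toy illustration of multi-objective model selection:
# rank candidate algorithms by validation accuracy penalized by
# expected runtime, then keep the top-k candidates for subsequent
# hyperparameter optimization. The scalarized score is a simplistic
# stand-in for the meta-model's multi-objective trade-off.

def select_algorithms(candidates, time_budget, weight=0.1, k=2):
    """candidates: list of (name, accuracy, runtime_seconds) tuples."""
    # Discard algorithms whose expected runtime exceeds the budget.
    feasible = [c for c in candidates if c[2] <= time_budget]
    # Score = accuracy minus a runtime penalty (simple scalarization).
    scored = sorted(feasible,
                    key=lambda c: c[1] - weight * (c[2] / time_budget),
                    reverse=True)
    return [name for name, _, _ in scored[:k]]

# Hypothetical (accuracy, runtime) estimates for four classifiers.
algos = [("random_forest", 0.90, 120.0),
         ("svm", 0.92, 600.0),
         ("logistic_regression", 0.85, 10.0),
         ("gradient_boosting", 0.93, 300.0)]

# svm is dropped (runtime over budget); the rest are ranked by score.
print(select_algorithms(algos, time_budget=400.0))
# → ['random_forest', 'gradient_boosting']
```

In MDASS the shortlist produced by such a trade-off is then handed to a hyperparameter-optimization stage that spends the remaining time budget on the most promising algorithms.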
Notes
- 1.
The supplementary material of MDASS is available at https://github.com/wj-tian/MDASS.
Acknowledgment
This work is supported by the National Natural Science Foundation of China (No. 52073169) and the State Key Program of the National Natural Science Foundation of China (Grant No. 61936001).
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Liu, Y., Tian, W., Li, S. (2021). Meta-data Augmentation Based Search Strategy Through Generative Adversarial Network for AutoML Model Selection. In: Karlapalem, K., et al. Advances in Knowledge Discovery and Data Mining. PAKDD 2021. Lecture Notes in Computer Science(), vol 12714. Springer, Cham. https://doi.org/10.1007/978-3-030-75768-7_25
Print ISBN: 978-3-030-75767-0
Online ISBN: 978-3-030-75768-7