
Meta-data Augmentation Based Search Strategy Through Generative Adversarial Network for AutoML Model Selection

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2021)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12714)

Abstract

Automated machine learning (AutoML) attempts to automatically build an appropriate learning model for a given dataset. Despite recent progress in using meta-learning to find good instantiations for AutoML frameworks, it is still difficult and time-consuming to collect sufficient high-quality meta-data. We therefore propose a novel method named Meta-data Augmentation based Search Strategy (MDASS) for AutoML model selection, which is mainly composed of a Meta-GAN Surrogate model (MetaGAN) and a Self-Adaptive Meta-model (SAM). MetaGAN employs a Generative Adversarial Network as a surrogate model to generate effective meta-data from the limited meta-data available, which alleviates the dilemma of meta-overfitting in meta-learning. Based on the augmented meta-data, SAM self-adaptively builds a multi-objective meta-model that can select algorithms with a proper trade-off between learning performance and computational budget. Furthermore, for new datasets, MDASS combines the promising algorithms with hyperparameter optimization to perform automated model selection under a time constraint. Finally, experiments are conducted on various classification datasets from OpenML with algorithms from scikit-learn. The results show that GANs are promising to incorporate into AutoML and that MDASS performs better than the competing approaches under a time budget.
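The core idea of the abstract, using a GAN to augment scarce meta-data before fitting a meta-model, can be illustrated with a minimal sketch. The code below is not the paper's MetaGAN; it is a hypothetical, simplified GAN over fixed-length meta-data vectors (dataset meta-features plus algorithm performance), written in PyTorch. All names and dimensions (META_DIM, NOISE_DIM, train_step) are assumptions made for illustration.

```python
# Minimal sketch of GAN-based meta-data augmentation (illustrative only;
# not the paper's MetaGAN implementation).
import torch
import torch.nn as nn

META_DIM = 16    # assumed length of a meta-data vector (meta-features + performance)
NOISE_DIM = 8    # assumed latent-noise dimension

# Generator: latent noise -> synthetic meta-data vector
generator = nn.Sequential(
    nn.Linear(NOISE_DIM, 32), nn.ReLU(),
    nn.Linear(32, META_DIM),
)

# Discriminator: meta-data vector -> probability of being real
discriminator = nn.Sequential(
    nn.Linear(META_DIM, 32), nn.ReLU(),
    nn.Linear(32, 1), nn.Sigmoid(),
)

bce = nn.BCELoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)

def train_step(real_meta: torch.Tensor) -> None:
    """One adversarial update on a batch of real meta-data vectors."""
    batch = real_meta.size(0)
    noise = torch.randn(batch, NOISE_DIM)
    fake_meta = generator(noise)

    # Discriminator update: real vectors labelled 1, generated vectors labelled 0
    d_loss = bce(discriminator(real_meta), torch.ones(batch, 1)) + \
             bce(discriminator(fake_meta.detach()), torch.zeros(batch, 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator update: try to make the discriminator predict 1 on fakes
    g_loss = bce(discriminator(fake_meta), torch.ones(batch, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()

# Example usage with a random batch standing in for real meta-data:
# train_step(torch.randn(64, META_DIM))
# After training, generator(torch.randn(n, NOISE_DIM)) yields synthetic
# meta-data vectors that can augment the limited real meta-data before
# fitting a meta-model for algorithm selection.
```

In this sketch the augmented vectors would simply be concatenated with the real meta-data; how the paper filters or weights generated samples, and how SAM consumes them, is described in the full text.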

Notes

  1. The supplementary material of MDASS is available at https://github.com/wj-tian/MDASS.

Acknowledgment

This work is supported by the National Natural Science Foundation of China (No. 52073169) and the State Key Program of the National Natural Science Foundation of China (Grant No. 61936001).

Author information

Corresponding author

Correspondence to Yue Liu.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Liu, Y., Tian, W., Li, S. (2021). Meta-data Augmentation Based Search Strategy Through Generative Adversarial Network for AutoML Model Selection. In: Karlapalem, K., et al. Advances in Knowledge Discovery and Data Mining. PAKDD 2021. Lecture Notes in Computer Science, vol. 12714. Springer, Cham. https://doi.org/10.1007/978-3-030-75768-7_25

  • DOI: https://doi.org/10.1007/978-3-030-75768-7_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-75767-0

  • Online ISBN: 978-3-030-75768-7

  • eBook Packages: Computer Science, Computer Science (R0)
