
Time-Series in Hyper-parameter Initialization of Machine Learning Techniques

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 13113)

Abstract

Initializing the hyper-parameters (HPs) of machine learning (ML) techniques has become an important step in the area of automated ML (AutoML). The main premise of HP initialization is that an HP setting that performs well on a certain dataset will also be suitable for similar datasets. Thus, evaluating the similarity of datasets based on their characteristics, called meta-features (MFs), is one of the basic tasks in meta-learning (MtL), a subfield of AutoML. Several types of MFs have been developed, among which those based on principal component analysis (PCA) are, despite their good descriptive characteristics and relatively easy computation, only marginally utilized. This paper proposes a novel approach to HP initialization that combines dynamic time warping (DTW), a well-known similarity measure for time series, with PCA MFs and requires no further settings. Exhaustive experiments, conducted for the use cases of HP initialization of decision trees and support vector machines, show the potential of the proposed approach and encourage further investigation in this direction.
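
To make the abstract's core idea concrete, the following is a minimal sketch of how DTW can compare PCA-based meta-feature vectors of two datasets, even when the datasets have different numbers of attributes and their vectors therefore have different lengths. This is not the authors' implementation (their code is linked in Note 10 below); the names dtw_distance, m1 and m2 are ours, and a plain quadratic-time DTW is used for clarity.

    import numpy as np

    def dtw_distance(a, b):
        """Classic dynamic-programming DTW between two 1-D sequences.

        DTW needs no padding or truncation, which is why it suits PCA
        meta-feature vectors whose lengths vary with the number of
        dataset attributes.
        """
        n, m = len(a), len(b)
        cost = np.full((n + 1, m + 1), np.inf)
        cost[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                step = abs(a[i - 1] - b[j - 1])            # local distance
                cost[i, j] = step + min(cost[i - 1, j],    # insertion
                                        cost[i, j - 1],    # deletion
                                        cost[i - 1, j - 1])  # match
        return cost[n, m]

    # Hypothetical cumulative explained-variance vectors of two datasets
    # with 5 and 6 attributes, respectively:
    m1 = np.array([0.55, 0.78, 0.90, 0.97, 1.00])
    m2 = np.array([0.40, 0.62, 0.75, 0.85, 0.93, 1.00])
    print(dtw_distance(m1, m2))  # smaller value = more similar datasets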


Notes

  1. Since a meta-model is learned by traditional ML techniques.

  2. According to the vector \(\mathbf{m}^{c} = (\vartheta_1, \vartheta_2, \dots, \vartheta_c)\), \(d = \min\{i \mid 1 \le i \le c,\ \vartheta_i \ge 0.95\}\), where c is the number of attributes in the dataset (see the first code sketch after these notes).

  3. Essentially, a 10-bin histogram of the values of \(\mathbf{m}^{pca}\) from Eq. 1 (see the first code sketch after these notes).

  4. We use all eight MF types described above, as well as their different combinations, as baselines in our experiments.

  5. For MF vectors of the same size (e.g. \(\mathbf{m}^{his}\) and \(\mathbf{m}^{cup}\)), we experimented with well-known vector similarity measures such as Euclidean distance, inner product, cosine similarity and Pearson correlation (see the second code sketch after these notes).

  6. This study uses a simple average as the aggregation function; however, any other aggregation function, such as a weighted average, can be used as well (see the second code sketch after these notes).

  7. http://archive.ics.uci.edu/ml/.

  8. Either all or none of the MFs belonging to a given MF type were present in an MF vector, according to the chosen combination of MF types.

  9. This was done because of the stochastic nature of the HP tuning algorithms used (SMBO, PSO and RS), in order to obtain more accurate statistics about the performance of the compared approaches.

  10. https://github.com/rgmantovani/TimeSeriesHPInitialization.
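
As referenced in Notes 2 and 3, the first sketch below illustrates how the PCA-based meta-features could be computed. It reflects our own reading, not the paper's code: we assume \(\vartheta_i\) in Note 2 denotes the cumulative explained-variance ratio of the first i principal components, and the names pca_meta_features, m_pca and m_his are hypothetical.

    import numpy as np
    from sklearn.decomposition import PCA

    def pca_meta_features(X, threshold=0.95, bins=10):
        """Sketch of the PCA-based MFs from Notes 2 and 3.

        m_pca : cumulative explained-variance ratios, one per component
                (our reading of the vector in Note 2)
        d     : smallest i with m_pca[i-1] >= threshold (Note 2)
        m_his : fixed-size `bins`-bin histogram of m_pca (Note 3)
        """
        m_pca = np.cumsum(PCA().fit(X).explained_variance_ratio_)
        d = int(np.argmax(m_pca >= threshold)) + 1  # 1-based index
        m_his, _ = np.histogram(m_pca, bins=bins, range=(0.0, 1.0))
        return m_pca, d, m_his

    # Example on a hypothetical dataset with 100 examples and 8 attributes:
    X = np.random.rand(100, 8)
    m_pca, d, m_his = pca_meta_features(X)

The histogram of Note 3 turns the variable-length \(\mathbf{m}^{pca}\) vectors into fixed-size vectors, which is what makes the standard similarity measures of Note 5 applicable to them.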
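
The second sketch, referenced in Notes 5 and 6, covers the baseline similarity measures for equal-length MF vectors and the aggregation step; again, the function names are ours and serve illustration only.

    import numpy as np
    from scipy.spatial.distance import cosine, euclidean
    from scipy.stats import pearsonr

    def vector_similarities(u, v):
        """The four measures listed in Note 5 (equal-length vectors only)."""
        return {
            "euclidean_distance": euclidean(u, v),
            "inner_product": float(np.dot(u, v)),
            "cosine_similarity": 1.0 - cosine(u, v),  # SciPy returns a distance
            "pearson_correlation": pearsonr(u, v)[0],
        }

    def aggregate_hp_settings(settings, weights=None):
        """Note 6: aggregate the best HP settings of the most similar
        datasets; a plain average by default, a weighted average when
        `weights` is given."""
        return np.average(np.asarray(settings, dtype=float), axis=0, weights=weights)

Note that the inner product, cosine similarity and Pearson correlation are similarities (higher means more similar), while the Euclidean measure is a distance, so the retrieval of the most similar datasets must rank in the appropriate direction for each measure.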


Acknowledgment

The “Application Domain Specific Highly Reliable IT Solutions” project has been implemented with the support provided by the National Research, Development and Innovation Fund of Hungary, financed under the Thematic Excellence Programme TKP2020-NKA-06 (National Challenges Subprogramme) funding scheme.

Author information


Corresponding author

Correspondence to Tomáš Horváth.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Horváth, T., Mantovani, R.G., de Carvalho, A.C.P.L.F. (2021). Time-Series in Hyper-parameter Initialization of Machine Learning Techniques. In: Yin, H., et al. (eds.) Intelligent Data Engineering and Automated Learning – IDEAL 2021. Lecture Notes in Computer Science, vol. 13113. Springer, Cham. https://doi.org/10.1007/978-3-030-91608-4_25


  • DOI: https://doi.org/10.1007/978-3-030-91608-4_25


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-91607-7

  • Online ISBN: 978-3-030-91608-4

  • eBook Packages: Computer Science (R0)
