Abstract
Meta-learning techniques can help non-expert users with the algorithm selection task. In this work, we investigate the use of different components in an unsupervised meta-learning framework. In such a scheme, the system aims to predict, for a new learning task, the ranking of the candidate clustering algorithms according to previously acquired knowledge.
In the context of unsupervised meta-learning techniques, we analyzed two different sets of meta-features, nine different candidate clustering algorithms and two learning methods as meta-learners.
This analysis showed that the system, using MLP and SVR meta-learners, was able to successfully associate the proposed sets of dataset characteristics with the performance of the new candidate algorithms. In fact, a hypothesis test showed that the correlation between the predicted and ideal rankings was significantly higher than that obtained with the default ranking method. In this sense, we could also validate the use of the proposed sets of meta-features for describing the artificial learning tasks.
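The comparison between predicted, ideal and default rankings can be illustrated with a minimal sketch. The abstract does not name the correlation coefficient or define the default ranking, so the code below assumes Spearman's rank correlation (without ties) and a default ranking built from average ranks over the training tasks. The function names and the four-algorithm example are hypothetical and only serve as an illustration.

```python
from typing import List, Sequence


def spearman(rank_a: Sequence[float], rank_b: Sequence[float]) -> float:
    """Spearman's rank correlation between two rankings (assumes no ties)."""
    n = len(rank_a)
    if n != len(rank_b) or n < 2:
        raise ValueError("rankings must have the same length (at least 2)")
    # Sum of squared rank differences over all candidate algorithms.
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def default_ranking(training_rankings: Sequence[Sequence[float]]) -> List[int]:
    """Assumed baseline: rank algorithms by their mean rank on training tasks."""
    n_algs = len(training_rankings[0])
    n_tasks = len(training_rankings)
    mean_ranks = [
        sum(task[i] for task in training_rankings) / n_tasks
        for i in range(n_algs)
    ]
    # Algorithm with the lowest mean rank gets position 1, and so on.
    order = sorted(range(n_algs), key=lambda i: mean_ranks[i])
    ranks = [0] * n_algs
    for position, alg in enumerate(order, start=1):
        ranks[alg] = position
    return ranks


# Hypothetical example with four candidate algorithms.
ideal = [1, 2, 3, 4]        # ranking from the observed clustering quality
predicted = [2, 1, 3, 4]    # ranking suggested by a meta-learner
baseline = default_ranking([[4, 3, 2, 1], [3, 4, 1, 2], [4, 2, 3, 1]])

print(spearman(predicted, ideal))
print(spearman(baseline, ideal))
```

Under these assumptions, a meta-learner is useful when its correlation with the ideal ranking, averaged over many test tasks, is higher than the correlation achieved by the baseline.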
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Soares, R.G.F., Ludermir, T.B., De Carvalho, F.A.T. (2009). An Analysis of Meta-learning Techniques for Ranking Clustering Algorithms Applied to Artificial Data. In: Alippi, C., Polycarpou, M., Panayiotou, C., Ellinas, G. (eds) Artificial Neural Networks – ICANN 2009. ICANN 2009. Lecture Notes in Computer Science, vol 5768. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04274-4_14
DOI: https://doi.org/10.1007/978-3-642-04274-4_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04273-7
Online ISBN: 978-3-642-04274-4
eBook Packages: Computer Science, Computer Science (R0)