Abstract
This work presents a novel clustering methodology based on instance similarity in two or more attribute layers. It is motivated by multi-view clustering and redescription mining algorithms. In our approach we do not construct descriptions of subsets of instances, and we do not assume conditional independence of the different views. Clusters are merged bottom-up only if the merge reduces an example variability score in all layers. The score is defined as a two-component sum of squared deviates of example similarity values. For a given set of instances, the similarity values are computed by solving an artificially constructed supervised classification problem. The final result is a set of small but coherent clusters. The methodology is illustrated on a real-life discovery task aimed at identifying relevant subgroups of countries with similar trading characteristics with respect to the types of commodities they export.
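The merging rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say what the two components of the score are, so the sketch assumes they are the squared deviates of an example's similarities to members of its own cluster and to all remaining examples. The per-layer similarity matrices are taken as given inputs, and the function names (`squared_deviates`, `variability`, `merge_gain`, `multilayer_cluster`) are hypothetical.

```python
from itertools import combinations


def squared_deviates(values):
    """Sum of squared deviations of the values from their mean (0.0 if empty)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def variability(sim, cluster):
    """Assumed two-component score of one cluster in one layer.

    For every member i, add the squared deviates of its similarities to
    examples inside the cluster and to examples outside the cluster.
    """
    members = set(cluster)
    total = 0.0
    for i in cluster:
        inside = [sim[i][j] for j in cluster]
        outside = [sim[i][j] for j in range(len(sim)) if j not in members]
        total += squared_deviates(inside) + squared_deviates(outside)
    return total


def merge_gain(sim, a, b):
    """Score reduction obtained by merging clusters a and b (positive is better)."""
    return variability(sim, a) + variability(sim, b) - variability(sim, a + b)


def multilayer_cluster(layers):
    """Bottom-up merging; a merge is allowed only if it reduces the score in ALL layers."""
    clusters = [[i] for i in range(len(layers[0]))]
    while True:
        best = None
        for a, b in combinations(clusters, 2):
            # The weakest layer decides whether the merge is acceptable.
            worst_gain = min(merge_gain(sim, a, b) for sim in layers)
            if worst_gain > 0 and (best is None or worst_gain > best[0]):
                best = (worst_gain, a, b)
        if best is None:
            return clusters
        _, a, b = best
        clusters = [c for c in clusters if c is not a and c is not b] + [a + b]
```

Under these assumptions, two layers that agree on the grouping of four examples produce two clusters of two, while two layers that disagree leave every example in its own cluster, since no merge improves both layers at once.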
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Gamberger, D., Mihelčić, M., Lavrač, N. (2014). Multilayer Clustering: A Discovery Experiment on Country Level Trading Data. In: Džeroski, S., Panov, P., Kocev, D., Todorovski, L. (eds) Discovery Science. DS 2014. Lecture Notes in Computer Science, vol 8777. Springer, Cham. https://doi.org/10.1007/978-3-319-11812-3_8
DOI: https://doi.org/10.1007/978-3-319-11812-3_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11811-6
Online ISBN: 978-3-319-11812-3
eBook Packages: Computer Science, Computer Science (R0)