Abstract
The study of evolution has become an important research issue, especially in the last decade, owing to a greater awareness of our world’s volatility. As a consequence, a new paradigm, Change Mining, has emerged to respond more effectively to a class of new problems in Data Mining. In this paper we address the problem of monitoring the evolution of clusters and propose the MClusT framework, which was developed along the lines of this paradigm. MClusT comprises a taxonomy of transitions, a tracking method based on graph theory, and a transition detection algorithm. To demonstrate its feasibility and applicability we present real-world case studies using datasets extracted from Banco de Portugal and the Portuguese Institute of Statistics. We also test our approach on a benchmark dataset from TSDL. The results are encouraging and demonstrate the ability of the MClusT framework to provide an efficient diagnosis of cluster transitions.
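The abstract does not spell out the tracking method, so the following is only a minimal sketch of how cluster transitions might be tracked with a bipartite graph. It assumes that each clustering is a mapping from a cluster label to a set of object identifiers, that an edge links an old cluster to a new one when the shared fraction of the old cluster's members reaches a threshold `tau`, and that transitions are read off the edge pattern. The function names, the weighting, the threshold, and the event labels are illustrative assumptions, not the authors' definitions.

```python
from collections import defaultdict


def transition_edges(old, new, tau=0.5):
    """Build weighted bipartite edges between two clusterings.

    old, new: dict mapping cluster label -> set of object ids.
    Assumed weight: fraction of the old cluster's members that
    appear in the new cluster. Edges below tau are discarded.
    """
    edges = {}
    for a, members_a in old.items():
        if not members_a:
            continue  # an empty cluster cannot contribute edges
        for b, members_b in new.items():
            weight = len(members_a & members_b) / len(members_a)
            if weight >= tau:
                edges[(a, b)] = weight
    return edges


def classify_transitions(old, new, tau=0.5):
    """Label transitions from the edge pattern (hypothetical taxonomy)."""
    edges = transition_edges(old, new, tau)
    targets = defaultdict(list)  # old cluster -> matched new clusters
    sources = defaultdict(list)  # new cluster -> matched old clusters
    for a, b in edges:
        targets[a].append(b)
        sources[b].append(a)

    events = []
    for a in old:
        matched = targets.get(a, [])
        if not matched:
            events.append(("death", a))
        elif len(matched) > 1:
            events.append(("split", a, tuple(sorted(matched))))
    for b in new:
        matched = sources.get(b, [])
        if not matched:
            events.append(("birth", b))
        elif len(matched) > 1:
            events.append(("merge", tuple(sorted(matched)), b))
        elif len(targets[matched[0]]) == 1:
            # one-to-one match: the old cluster carries over
            events.append(("survival", matched[0], b))
    return events


if __name__ == "__main__":
    old = {"A": {1, 2, 3, 4}, "B": {5, 6, 7}, "C": {8, 9}, "D": {10, 11, 12, 13}}
    new = {"X": {1, 2, 3, 4, 5, 6, 7}, "Y": {10, 11}, "Z": {12, 13}, "W": {20, 21}}
    for event in classify_transitions(old, new):
        print(event)
```

In this toy input, clusters A and B are both fully absorbed by X, C has no successor, D divides evenly between Y and Z, and W has no predecessor. The choice of `tau` matters: with the weighting assumed here, a cluster that divides into three or more roughly equal parts would produce no edges at `tau=0.5` and would be reported as a death rather than a split.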
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Oliveira, M., Gama, J. (2010). Bipartite Graphs for Monitoring Clusters Transitions. In: Cohen, P.R., Adams, N.M., Berthold, M.R. (eds) Advances in Intelligent Data Analysis IX. IDA 2010. Lecture Notes in Computer Science, vol 6065. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13062-5_12
DOI: https://doi.org/10.1007/978-3-642-13062-5_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13061-8
Online ISBN: 978-3-642-13062-5
eBook Packages: Computer Science (R0)