Abstract
The goal of multi-objective clustering (MOC) is to decompose a dataset into groups of similar objects while maximizing multiple objectives in parallel. In this paper, we provide a methodology, an architecture, and algorithms that, given a large set of objectives, derive clusters that are interesting with respect to two or more of those objectives. The proposed architecture relies on clustering algorithms that support plug-in fitness functions and on multi-run clustering, in which clustering algorithms are run multiple times, each run maximizing a different subset of objectives captured in a compound fitness function. MOC provides search-engine-like capabilities to users, enabling them to query a large set of clusters with respect to different objectives and thresholds. We evaluate the proposed MOC framework in a case study centered on spatial co-location mining; the goal is to identify regions in the Texas water supply in which high arsenic concentrations are co-located with high concentrations of other chemicals.
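The workflow described in the abstract (plug-in fitness functions, one clustering run per subset of objectives, and threshold queries over the pooled clusters) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the names `compound_fitness`, `toy_clusterer`, `multi_run`, and `query` are hypothetical, the compound fitness is a plain sum of objectives, the data are one-dimensional, and the "clustering algorithm" is a toy that only tries two-way splits.

```python
from itertools import combinations
from typing import Callable, Dict, List, Sequence

Cluster = List[float]                    # toy setting: a cluster is a list of 1-D points
Objective = Callable[[Cluster], float]   # plug-in objective; higher is better


def compound_fitness(objectives: Sequence[Objective]) -> Objective:
    """Combine several objectives into one fitness (plain sum; an assumption)."""
    return lambda cluster: sum(obj(cluster) for obj in objectives)


def toy_clusterer(data: List[float], fitness: Objective) -> List[Cluster]:
    """Hypothetical stand-in for a clustering algorithm with a plug-in fitness.

    Tries every split of the sorted data into two contiguous groups and keeps
    the split with the highest total fitness.
    """
    pts = sorted(data)
    best = max(range(1, len(pts)),
               key=lambda i: fitness(pts[:i]) + fitness(pts[i:]))
    return [pts[:best], pts[best:]]


def multi_run(data: List[float], objectives: Dict[str, Objective],
              subset_size: int = 2) -> List[Cluster]:
    """Run the clusterer once per subset of objectives and pool the clusters."""
    repository: List[Cluster] = []
    for names in combinations(sorted(objectives), subset_size):
        fitness = compound_fitness([objectives[n] for n in names])
        for cluster in toy_clusterer(data, fitness):
            if cluster not in repository:   # skip exact duplicates
                repository.append(cluster)
    return repository


def query(repository: List[Cluster], objectives: Dict[str, Objective],
          thresholds: Dict[str, float]) -> List[Cluster]:
    """Return stored clusters that meet every requested objective threshold."""
    return [c for c in repository
            if all(objectives[name](c) >= t for name, t in thresholds.items())]


# Illustrative objectives and data (made up for this sketch).
objectives: Dict[str, Objective] = {
    "compact": lambda c: -(max(c) - min(c)),   # tighter clusters score higher
    "high_mean": lambda c: sum(c) / len(c),    # e.g. a high average concentration
    "size": lambda c: float(len(c)),           # larger clusters score higher
}
data = [1.0, 1.2, 1.1, 8.0, 8.3, 8.1, 8.2]

repository = multi_run(data, objectives)
hits = query(repository, objectives, {"high_mean": 8.0, "size": 3})
print(hits)
```

In this sketch the repository is built once and can then be queried repeatedly with different objective and threshold combinations, which mirrors the search-engine-like usage the abstract describes. A real system would replace `toy_clusterer` with an actual clustering algorithm and the summed fitness with whatever compound fitness it defines.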
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jiamthapthaksin, R., Eick, C.F., Vilalta, R. (2009). A Framework for Multi-Objective Clustering and Its Application to Co-Location Mining. In: Huang, R., Yang, Q., Pei, J., Gama, J., Meng, X., Li, X. (eds) Advanced Data Mining and Applications. ADMA 2009. Lecture Notes in Computer Science(), vol 5678. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03348-3_20
DOI: https://doi.org/10.1007/978-3-642-03348-3_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03347-6
Online ISBN: 978-3-642-03348-3
eBook Packages: Computer Science, Computer Science (R0)