Kernel-Based Methods to Identify Overlapping Clusters with Linear and Nonlinear Boundaries

Journal of Classification

Abstract

Detecting overlapping structures and identifying non-linearly-separable clusters with complex shapes are two major issues in clustering. This paper presents two kernel-based methods that produce overlapping clusters with both linear and nonlinear boundaries. To improve the separability of input patterns, both methods use the Mercer kernel technique. First, we propose Kernel Overlapping K-means I (KOKMI), a centroid-based method that generalizes kernel K-means to produce nondisjoint clusters with nonlinear separations. Second, we propose Kernel Overlapping K-means II (KOKMII), a medoid-based method that improves on the first in terms of efficiency and complexity. Experiments performed on non-linearly-separable data sets and on real multi-labeled data sets show that the proposed learning methods outperform existing ones.
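For readers unfamiliar with the kernel K-means baseline that KOKMI generalizes, the following is a minimal sketch of standard (disjoint) kernel K-means. It is not the authors' KOKMI or KOKMII algorithm, and it does not produce overlapping clusters. It only illustrates the Mercer kernel idea: squared distances to cluster centroids in feature space can be computed from the kernel matrix alone, without an explicit feature map. The function names, the RBF kernel choice, and the `gamma` parameter are assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Gaussian (RBF) kernel matrix; the kernel choice is an assumption."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def feature_space_distances(K, labels, n_clusters):
    """Squared distance from each point to each cluster centroid in
    feature space, using only kernel values:
    ||phi(x_i) - m_c||^2 = K_ii - (2/|C|) sum_j K_ij + (1/|C|^2) sum_jl K_jl
    """
    n = K.shape[0]
    D = np.full((n, n_clusters), np.inf)  # empty clusters stay at +inf
    diag = np.diag(K)
    for c in range(n_clusters):
        members = labels == c
        size = members.sum()
        if size == 0:
            continue
        cross = K[:, members].sum(axis=1) / size
        within = K[np.ix_(members, members)].sum() / size ** 2
        D[:, c] = diag - 2.0 * cross + within
    return D

def kernel_kmeans(K, n_clusters, n_iter=50, seed=0):
    """Standard disjoint kernel K-means: each point gets exactly one label."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(n_clusters, size=K.shape[0])
    for _ in range(n_iter):
        D = feature_space_distances(K, labels, n_clusters)
        new_labels = D.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

if __name__ == "__main__":
    # Hypothetical toy data: two concentric rings, not linearly separable.
    rng = np.random.default_rng(1)
    angles = rng.uniform(0.0, 2.0 * np.pi, size=100)
    radii = np.repeat([1.0, 4.0], 50)
    X = np.c_[radii * np.cos(angles), radii * np.sin(angles)]
    print(kernel_kmeans(rbf_kernel(X, gamma=0.5), n_clusters=2))
```

With a linear kernel (`K = X @ X.T`), the distances above reduce to ordinary squared Euclidean distances to the cluster means, which gives a quick sanity check. An overlapping variant would replace the single-label assignment step with one that allows a point to belong to several clusters; the details of that step are specific to the paper and are not reproduced here.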

Author information

Corresponding author

Correspondence to Chiheb-Eddine Ben N’Cir.

About this article

Cite this article

N’Cir, CE.B., Essoussi, N. & Limam, M. Kernel-Based Methods to Identify Overlapping Clusters with Linear and Nonlinear Boundaries. J Classif 32, 176–211 (2015). https://doi.org/10.1007/s00357-015-9181-3

  • DOI: https://doi.org/10.1007/s00357-015-9181-3
