Abstract
Support vector clustering (SVC) is an important boundary-based clustering algorithm in multiple applications for its capability of handling arbitrary cluster shapes. However, SVC’s popularity is degraded by its highly intensive time complexity and poor label performance. To overcome such problems, we present a novel efficient and robust convex decomposition based cluster labeling (CDCL) method based on the topological property of dataset. The CDCL decomposes the implicit cluster into convex hulls and each one is comprised by a subset of support vectors (SVs). According to a robust algorithm applied in the nearest neighboring convex hulls, the adjacency matrix of convex hulls is built up for finding the connected components; and the remaining data points would be assigned the label of the nearest convex hull appropriately. The approach's validation is guaranteed by geometric proofs. Time complexity analysis and comparative experiments suggest that CDCL improves both the efficiency and clustering quality significantly.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Xu R, Wunsch D C. Hierarchical clustering. In Clustering, Hoboken: John Wiley&Sons, 2008.
Ben-Hur A, Horn D, Siegelmann H T, Vapnik V N. Support vector clustering. Journal of Machine Learning Research, 2001, 2: 125–137.
Schölkopf B, Platt J C, Shawe-Taylor J C, Smola A J, Williamson R C. Estimating the support of a high-dimensional distribution. Neural Computation, 2001, 13(7): 1443–1472.
Tax D M J, Duin R P W. Support vector domain description. Pattern Recognition Letters, 1999, 20(11–13): 1191–1199.
Burges C J C. A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, 1998, 2(2): 121–167.
Yang J H, Estivill-Castro V, Chalup S K. Support vector clustering through proximity graph modelling. In Proc. the 9th International Conference on Neural Information Processing (ICONIP 2002), Orchid Country Club, Singapore, Nov. 18–22, 2002, pp.898–903.
Lee J, Lee D. An improved cluster labeling method for support vector clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005, 27(3): 461–464.
Lee J, Lee D. Dynamic characterization of cluster structures for robust and inductive support vector clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006, 28(11): 1869–1874.
Lee D, Lee J. Equilibrium-based support vector machine for semisupervised classi¯cation. IEEE Transactions on Neural Networks, 2007, 18(2): 578–583.
Lee S-H, Daniels K M. Cone cluster labeling for support vector clustering. In Proc. the 6th SIAM Conference on Data Mining, Bethesda, Maryland, Apr. 20–22, 2006, pp.484–488.
Varma C M B S, Asharaf S, Murty M N. Rough core vector clustering. In Proc. the 2nd International Conference on Pattern Recognition and Machine Intelligence, Kolkata, India, Dec. 18–22, 2007, pp.304–310.
Hsieh T W, Taur J S, Tao C W, Kung S Y. A kernel-based core growing clustering method. International Journal of Intelligent Systems, 2009, 24(4): 441–458.
Ling P, Zhou C G, Zhou X. Improved support vector clustering. Eng. Applicat. Artificial Intelligence, 2010, 23(4): 552–559.
Ping Y, Zhou Y J, Yang Y X. A novel scheme for accelerating support vector clustering. Computing and Infomatics, 2012, 31(2): in press.
Jung J H, Kim N, Lee J. Dynamic pattern denoising method using multi-basin system with kernels. Pattern Recognition, 2011, 44(8): 1698–1707.
Vapnik V. The Nature of Statistical Learning Theory. New York: Springer-Verlag, 1995.
Jung K H, Lee D, Lee J. Fast support-based clustering method for large-scale problems. Pattern Recognition, 2010, 43(5): 1975–1983.
Lee D, Lee J. Dynamic dissimilarity measure for support-based clustering. IEEE Transactions on Knowledge and Data Engineering, 2010, 22(6): 900–905.
Ben-Hur A, Horn D, Siegelmann H T, Vapnik V. A support vector cluster method. In Proc. the 15th International Conference on Pattern Recognition, Barcelona, Spain, Sep. 3–7, 2000, pp.724–727.
Estivill-Castro V, Lee I. AMOEBA: Hierarchical clustering based on spatial proximity using Delaunay diagram. In Proc. the 9th Int. Symposium on Spatial Data Handling, Beijing, China, Aug. 10–12, 2000, pp.7a.26–7a.41.
Estivill-Castro V, Lee I, Murray A T. Criteria on proximity graphs for boundary extraction and spatial clustering. In Proc. the 5th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD2001), Hong Kong, China, Apr. 16–18, 2001, pp.348–357.
Ban T, Abe S. Spatially chunking support vector clustering algorithm. In Proc. International Joint Conference on Neural Networks, Budapest, Hungary, July 25–29, 2004, pp.413–418.
Puma-Villanueva W J, Bezerra G B, Lima C A M, Zuben F J V. Improving support vector clustering with ensembles. In Proc. International Joint Conference on Neural Networks, Montreal, Quebec, Canada, Jul. 31–Aug. 4, 2005, pp.13–15.
Chiang J H, Hao P Y. A new kernel-based fuzzy clustering approach: Support vector clustering with cell growing. IEEE Transactions on Fuzzy Systems, 2003, 11(4): 518–527.
Wang F, Zhao B, Zhang C S. Linear time maximum margin clustering. IEEE Transactions on Neural Networks, 2010, 21(2): 319–332.
Peng J M, Mukherjee L, Singh V, Schuurmans D, Xu L L. An efficient algorithm for maximal margin clustering. Journal of Global Optimization, 2011, 2: 1–15. doi:10.1007/s10898-011-9691-4.
Lee S H, Daniels K. Gaussian kernel width selection and fast cluster labeling for support vector clustering. University of Massachusetts, Lowell, Technical Report No. 2005–009, 2005.
Lee C H, Yang H C. Construction of supervised and unsupervised learning systems for multilingual text categorization. Expert Systems with Applications, 2009, 36(2): 2400–2410.
Lee D, Jung K H, Lee J. Constructing sparse kernel machines using attractors. IEEE Transactions on Neural Networks, 2009, 20(4): 721–729.
Hao P Y, Chiang J H, Tu Y K. Hierarchically SVM classification based on support vector clustering method and its application to document categorization. Expert Systems with Applications, 2007, 33(3): 627-635.
Hocking T D, Joulin A, Bach F, Vert J P. Clusterpath: An algorithm for clustering using convex fusion penalties. In Proc. the 28th International Conference on Machine Learning (ICML), Bellevue, WA, USA, Jun. 28–Jul. 2, 2011, pp.1–8.
Shamir O, Tishby N. Stability and model selection in k-means clustering. Machine Learning, 2010, 80(2–3): 213–243.
Lee S H, Daniels K. Gaussian kernel width generator for support vector clustering. In Proc. International Conference on Bioinformatics and Its Applications, Fort Lauderdale, Florida, USA, Dec. 16–19, 2004, pp.151–162.
Boyd S, Vandenberghe L. Convex Optimization. Cambridge, 7th edition, New York: Cambridge University Press, 2009.
Li Y H, Maguire L. Selecting critical patterns based on local geometrical and statistical information. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011, 33(6): 1189–1201.
Kpotufe S, von Luxburg U. Pruning nearest neighbor cluster trees. In Proc. the 28th International Conference on Machine Learning (ICML 2011), Bellevue, USA, Jun. 28–Jul. 2, 2011, pp.225–232.
Kim H C, Lee J. Clustering based on gaussian processes. Neural Computation, 2007, 19(11): 3088–3107.
Camastra F, Verri A. A novel kernel method for clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005, 27(5): 801–805.
Frank A, Asuncion A. UCI machine learning repository. http://archive.ics.uci.edu/ml, 2010.
Hubert L, Arabie P. Comparing partitions. Journal of Classification, 1985, 2(1): 193-218.
Wu M, Scholkopf B. A local learning approach for clustering. In Proc. the 20th Annual Conference on Advances Neural Information Processing Systems (NIPS 2007), Vancouver, B.C., Canada, Dec. 4–7, 2006, pp.1529–1536.
Author information
Authors and Affiliations
Corresponding author
Additional information
This work was supported by the National Natural Science Foundation of China under Grant No. 60972077 and partially under Grant No. 70921061, the National Science and Technology Major Program under Grant No. 2010ZX03003-003-01, the Natural Science Foundation of Beijing under Grant No. 9092009, and the Fundamental Research Funds for the Central Universities under Grant No. 2011RC0212.
Electronic Supplementary Material
Below is the link to the electronic supplementary material.
Rights and permissions
About this article
Cite this article
Ping, Y., Tian, YJ., Zhou, YJ. et al. Convex Decomposition Based Cluster Labeling Method for Support Vector Clustering. J. Comput. Sci. Technol. 27, 428–442 (2012). https://doi.org/10.1007/s11390-012-1232-1
Received:
Revised:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11390-012-1232-1