An optimal method for data clustering

Xie, Linsen; Lu, Chengbo; Mei, Ying; Du, Hong; Man, Zhihong

doi:10.1007/s00521-014-1818-3

Extreme Learning Machine and Applications
Published: 04 January 2015

Volume 27, pages 283–289, (2016)
Neural Computing and Applications

Linsen Xie¹, Chengbo Lu¹, Ying Mei¹, Hong Du¹ & Zhihong Man²

Abstract

This work studies an algorithm for optimizing data clustering in feature space. Using the graph Laplacian and the extreme learning machine (ELM) mapping technique, we develop an optimal weight matrix W for feature mapping. The method explicitly maps the original data into an optimal feature space for clustering, which further increases the separability of the data while keeping pattern points in the same cluster close together. The method is easy to implement and gives better clustering results than several popular clustering algorithms, namely k-means on the original data, the kernel clustering method, the spectral clustering method, and ELM k-means, on data that include three real UCI benchmarks (IRIS data, the Wisconsin breast cancer database, and the Wine database).
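The derivation of the optimal weight matrix W is not part of this excerpt, so it is not reproduced here. As a rough illustration of the ELM k-means baseline named in the abstract, the sketch below maps data through a random sigmoid hidden layer and then runs plain k-means on the mapped points. The function names (`elm_feature_map`, `kmeans`), the uniform weight range, the number of hidden nodes, and the toy two-blob data are all assumptions made for this example. None of them come from the paper.

```python
import math
import random


def elm_feature_map(X, n_hidden, rng):
    """Map each row of X through a random sigmoid hidden layer (ELM-style).

    Assumption: weights and biases are drawn uniformly from [-1, 1] and
    are never trained; the paper's optimized W is NOT computed here.
    """
    n_features = len(X[0])
    W = [[rng.uniform(-1.0, 1.0) for _ in range(n_features)]
         for _ in range(n_hidden)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    H = []
    for x in X:
        row = []
        for w_j, b_j in zip(W, b):
            z = sum(w * v for w, v in zip(w_j, x)) + b_j
            row.append(1.0 / (1.0 + math.exp(-z)))  # sigmoid activation
        H.append(row)
    return H


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def kmeans(points, k, rng, n_iter=50):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest center for every point.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members)
                              for col in zip(*members)]
    return labels


rng = random.Random(0)
# Toy data (hypothetical): two Gaussian blobs in the plane.
X = ([[rng.gauss(0, 0.3), rng.gauss(0, 0.3)] for _ in range(20)]
     + [[rng.gauss(5, 0.3), rng.gauss(5, 0.3)] for _ in range(20)])

H = elm_feature_map(X, n_hidden=10, rng=rng)  # 40 points, 10 features
labels = kmeans(H, k=2, rng=rng)              # cluster in feature space
```

Running `kmeans(X, k=2, rng=rng)` on the unmapped points gives the "k-means on the original data" baseline for comparison. The method described in the abstract differs in that W is chosen with the help of the graph Laplacian rather than drawn at random.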

Acknowledgments

This research was supported by the Natural Science Foundation of China under Grant No. 11171137, the Zhejiang Provincial Natural Science Foundation of China under Grant No. LY13A010008, and the Scientific Research Fund of Zhejiang Provincial Education Department under Grant No. 2014.

Author information

Authors and Affiliations

Department of Mathematics, Lishui University, Lishui, 323000, Zhejiang, China
Linsen Xie, Chengbo Lu, Ying Mei & Hong Du
Faculty of Science, Engineering and Technology, Swinburne University of Technology, Melbourne, VIC, 3122, Australia
Zhihong Man

Corresponding author

Correspondence to Chengbo Lu.

About this article

Cite this article

Xie, L., Lu, C., Mei, Y. et al. An optimal method for data clustering. Neural Comput & Applic 27, 283–289 (2016). https://doi.org/10.1007/s00521-014-1818-3

Received: 10 September 2014
Accepted: 22 December 2014
Published: 04 January 2015
Issue Date: February 2016
DOI: https://doi.org/10.1007/s00521-014-1818-3
