An optimal method for data clustering

  • Extreme Learning Machine and Applications
Neural Computing and Applications

Abstract

This work studies an algorithm for optimizing data clustering in feature space. Using the graph Laplacian and the extreme learning machine (ELM) mapping technique, we develop an optimal weight matrix W for feature mapping. The method explicitly maps the original data into an optimal feature space for clustering, which further increases the separability of the original data while keeping pattern points that belong to the same cluster close together. Our method is easy to implement and achieves better clustering results than several popular clustering algorithms, namely k-means on the original data, the kernel clustering method, the spectral clustering method, and ELM k-means, on data sets that include three real-world UCI benchmarks (the IRIS data, the Wisconsin breast cancer database, and the Wine database).
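As a rough illustration of the ELM k-means baseline named in the abstract, the sketch below maps data through a random sigmoid hidden layer and then runs plain k-means in that feature space. It is not the paper's method: the optimal weight matrix W derived from the graph Laplacian is not reproduced here, and the weights are simply drawn at random. The function names `elm_feature_map` and `kmeans`, the hidden layer size, the weight range, and the toy data are all assumptions made for illustration; only the standard library is used so the sketch has no dependencies.

```python
import math
import random


def elm_feature_map(X, n_hidden, rng):
    """Map each row of X through a random sigmoid hidden layer (ELM-style)."""
    n_features = len(X[0])
    # Assumption: input weights and biases are drawn uniformly from [-1, 1]
    # and never trained. The paper instead optimizes the weight matrix W.
    W = [[rng.uniform(-1.0, 1.0) for _ in range(n_features)]
         for _ in range(n_hidden)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    H = []
    for x in X:
        row = []
        for w_j, b_j in zip(W, b):
            z = sum(w * v for w, v in zip(w_j, x)) + b_j
            row.append(1.0 / (1.0 + math.exp(-z)))  # sigmoid activation
        H.append(row)
    return H


def kmeans(points, k, rng, n_iter=100):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels


# Toy data (hypothetical): two tight, well-separated groups in 2-D.
X = [
    [0.0, 0.1], [0.1, 0.0], [-0.1, 0.0], [0.0, -0.1], [0.05, 0.05],
    [4.0, 4.1], [4.1, 4.0], [3.9, 4.0], [4.0, 3.9], [4.05, 4.05],
]

rng = random.Random(0)
H = elm_feature_map(X, n_hidden=10, rng=rng)  # data in ELM feature space
labels = kmeans(H, k=2, rng=rng)              # cluster in that space
print(labels)
```

The sigmoid assumes reasonably scaled inputs; real data such as the UCI benchmarks would normally be standardized before the mapping.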

Acknowledgments

This research was supported by the Natural Science Foundation of China under Grant No. 11171137, the Zhejiang Provincial Natural Science Foundation of China under Grant No. LY13A010008, and the Scientific Research Fund of Zhejiang Provincial Education Department under Grant No. 2014.

Author information

Corresponding author

Correspondence to Chengbo Lu.

About this article

Cite this article

Xie, L., Lu, C., Mei, Y. et al. An optimal method for data clustering. Neural Comput & Applic 27, 283–289 (2016). https://doi.org/10.1007/s00521-014-1818-3

  • DOI: https://doi.org/10.1007/s00521-014-1818-3
