Soft subspace clustering with an improved feature weight self-adjustment mechanism

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

Traditional clustering algorithms often perform poorly on high-dimensional data. Soft subspace clustering, which finds clusters hidden in different subspaces, has become an effective means of dealing with such data. However, most existing soft subspace clustering algorithms contain parameters that are difficult for users to determine in real-world applications. A new soft subspace clustering algorithm named SC-IFWSA is proposed. It uses an improved feature weight self-adjustment mechanism, IFWSA, to adaptively update the weights of all features for each cluster according to the importance of the features to clustering quality, and it does not require users to set any parameter values. In addition, SC-IFWSA overcomes a weakness of the traditional FWSA mechanism, which may fail to calculate feature weights in some particular cases. Experiments on ten data sets, comparing SC-IFWSA with related approaches, demonstrate the effectiveness and feasibility of the proposed method.
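The general idea of soft subspace clustering described above (one weight vector per cluster, used in a weighted distance) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the IFWSA update rule, so `update_weights` below uses a simple hypothetical heuristic (weights inversely related to within-cluster dispersion) and should not be read as the authors' method. The function names `assign` and `update_weights` are likewise invented for this sketch.

```python
import numpy as np

def assign(X, centers, weights):
    """Assign each row of X to the cluster with the smallest weighted squared distance.

    X: (n, d) data, centers: (k, d), weights: (k, d) with each row summing to 1.
    """
    # diff[i, c, j] = X[i, j] - centers[c, j]
    diff = X[:, None, :] - centers[None, :, :]
    # dist[i, c] = sum_j weights[c, j] * diff[i, c, j] ** 2
    dist = np.einsum("ikj,kj->ik", diff ** 2, weights)
    return dist.argmin(axis=1)

def update_weights(X, labels, centers):
    """Hypothetical weight update (NOT the paper's IFWSA rule).

    A feature with small within-cluster dispersion gets a larger weight
    for that cluster. Each cluster's weights are normalized to sum to 1.
    """
    k, d = centers.shape
    weights = np.full((k, d), 1.0 / d)  # uniform weights as the default
    for c in range(k):
        members = X[labels == c]
        if len(members) == 0:
            continue  # empty cluster: keep uniform weights
        disp = ((members - centers[c]) ** 2).mean(axis=0)
        inv = 1.0 / (1.0 + disp)  # bounded, so zero dispersion is safe
        weights[c] = inv / inv.sum()
    return weights
```

In a k-means-style loop, one would alternate `assign`, a center update, and `update_weights` until the labels stop changing. The point of the sketch is only that each cluster carries its own feature weights, so different clusters can emphasize different subspaces.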

Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grant No. 61070062, the Key Project on the Cooperation of Industry and University of Fujian Province of China under Grant No. 2010H6007, and the Key Scientific Research Project of the Higher Education Institutions of Fujian Province of China under Grant No. JK2009006.

Author information

Corresponding author

Correspondence to Gongde Guo.

About this article

Cite this article

Guo, G., Chen, S. & Chen, L. Soft subspace clustering with an improved feature weight self-adjustment mechanism. Int. J. Mach. Learn. & Cyber. 3, 39–49 (2012). https://doi.org/10.1007/s13042-011-0038-8

  • DOI: https://doi.org/10.1007/s13042-011-0038-8
