Soft subspace clustering with an improved feature weight self-adjustment mechanism

Guo, Gongde; Chen, Si; Chen, Lifei

doi:10.1007/s13042-011-0038-8

Original Article
Published: 03 August 2011

Volume 3, pages 39–49, (2012)
International Journal of Machine Learning and Cybernetics

Gongde Guo^1,2,
Si Chen^1,2 &
Lifei Chen^1,2

Abstract

Traditional clustering algorithms often perform poorly on high-dimensional data. Soft subspace clustering, which finds clusters hidden in different subspaces, has become an effective means of dealing with such data. However, most existing soft subspace clustering algorithms contain parameters that are difficult for users to determine in real-world applications. A new soft subspace clustering algorithm named SC-IFWSA is proposed. It uses an improved feature weight self-adjustment mechanism, IFWSA, to adaptively update the weights of all features for each cluster according to the importance of the features to clustering quality, and it does not require users to set any parameter values. In addition, SC-IFWSA overcomes a shortcoming of the traditional FWSA mechanism, which may fail to calculate feature weights in some particular cases. Experimental results on ten data sets, in comparison with related approaches, demonstrate the effectiveness and feasibility of the proposed method.
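To make the idea of per-cluster feature weights concrete, the following is a minimal sketch of a generic soft subspace k-means loop. It is not the SC-IFWSA or IFWSA update rule, which the abstract does not state. The function name `soft_subspace_kmeans`, the inverse-dispersion weighting, and the `eps`, `n_iter`, and `seed` arguments are all assumptions made for illustration only.

```python
import numpy as np


def soft_subspace_kmeans(X, k, n_iter=20, eps=1e-9, seed=0):
    """Illustrative soft subspace k-means (hypothetical, not SC-IFWSA).

    Each cluster keeps its own weight vector over the features, and
    points are assigned using a feature-weighted squared distance.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Start from k distinct data points and uniform feature weights.
    centers = X[rng.choice(n, size=k, replace=False)].astype(float)
    weights = np.full((k, d), 1.0 / d)

    for _ in range(n_iter):
        # Weighted squared distance from every point to every center: (n, k).
        diff2 = (X[:, None, :] - centers[None, :, :]) ** 2
        dist = (diff2 * weights[None, :, :]).sum(axis=2)
        labels = dist.argmin(axis=1)

        for c in range(k):
            members = X[labels == c]
            if len(members) == 0:
                continue  # leave an empty cluster unchanged
            centers[c] = members.mean(axis=0)
            # Assumed rule: features with low within-cluster dispersion
            # get higher weight; eps guards against division by zero.
            dispersion = ((members - centers[c]) ** 2).mean(axis=0)
            inv = 1.0 / (dispersion + eps)
            weights[c] = inv / inv.sum()

    return labels, centers, weights


if __name__ == "__main__":
    demo = np.random.default_rng(42).normal(size=(100, 5))
    labels, centers, weights = soft_subspace_kmeans(demo, k=3)
    print(weights.round(3))  # one row of feature weights per cluster
```

Each row of `weights` sums to 1 and describes the subspace that the corresponding cluster emphasizes. This sketch still needs hand-chosen settings such as `eps`, which is the kind of user-set value the abstract says SC-IFWSA avoids.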

Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grant No. 61070062, the Key Project on the Cooperation of Industry and University of Fujian Province of China under Grant No. 2010H6007, and the Key Scientific Research Project of the Higher Education Institutions of Fujian Province of China under Grant No. JK2009006.

Author information

Authors and Affiliations

School of Mathematics and Computer Science, Fujian Normal University, Fuzhou, China
Gongde Guo, Si Chen & Lifei Chen
Key Laboratory of Network Security and Cryptography, Fujian Normal University, Fuzhou, China
Gongde Guo, Si Chen & Lifei Chen

Corresponding author

Correspondence to Gongde Guo.

About this article

Cite this article

Guo, G., Chen, S. & Chen, L. Soft subspace clustering with an improved feature weight self-adjustment mechanism. Int. J. Mach. Learn. & Cyber. 3, 39–49 (2012). https://doi.org/10.1007/s13042-011-0038-8

Received: 19 April 2011
Accepted: 11 July 2011
Published: 03 August 2011
Issue Date: March 2012
DOI: https://doi.org/10.1007/s13042-011-0038-8
