
A New Algorithm for Discriminative Clustering and Its Maximum Entropy Extension

  • Conference paper
  • Intelligence Science and Big Data Engineering. Big Data and Machine Learning Techniques (IScIDE 2015)
  • Part of the book series: Lecture Notes in Computer Science (volume 9243)

Abstract

Discriminative clustering (DC) integrates subspace selection and clustering into a coherent framework: it alternates between classical Linear Discriminant Analysis (LDA) dimensionality reduction and clustering, and can effectively cluster high-dimensional data. However, it has a complex form and high computational complexity. Recent work shows that DC is equivalent to kernel k-means (KM) with a specific kernel matrix. This new insight offers a chance to simplify the optimization problem of the original DC algorithm, and the Discriminative K-means (DKM) algorithm was proposed on the basis of this equivalence. When the number of data points n is small, DKM is feasible and efficient; however, constructing its kernel matrix requires computing a matrix inverse, which is time consuming when n is large. In this paper, we concentrate on the efficiency of DC. We present a new framework for DC, called Efficient DC (EDC), which combines DKM with the whitening transformation of the regularized total scatter matrix (WRTS) followed by KM clustering (WRTS+KM). When the dimensionality m is small and n far exceeds it (n ≫ m), EDC carries out WRTS+KM on the data, which is more efficient than DKM; when n is small and m ≫ n, EDC carries out DKM, which is the more efficient choice. We also extend EDC to the soft case and propose Efficient Discriminative Maximum Entropy Clustering (EDMEC), an efficient version of maximum-entropy-based DC. Extensive experiments on a collection of benchmark data sets demonstrate the effectiveness of the proposed algorithms.



Acknowledgements

This work is supported by the National Science Foundation of China (No. 61102095), the Science Plan Foundation of the Education Bureau of Shaanxi Province (No. 2010JK835, No. 14JK1661), the Natural Science Basic Research Plan in Shaanxi Province of China (No. 2014JM8307) and The Science and Technology Plan in Shaanxi Province of China (No. 2014KJXX-72).

Author information

Corresponding author: Xiao-bin Zhi.


Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Zhi, X.-b., Fan, J.-l. (2015). A New Algorithm for Discriminative Clustering and Its Maximum Entropy Extension. In: He, X., et al. Intelligence Science and Big Data Engineering. Big Data and Machine Learning Techniques. IScIDE 2015. Lecture Notes in Computer Science, vol 9243. Springer, Cham. https://doi.org/10.1007/978-3-319-23862-3_42

  • DOI: https://doi.org/10.1007/978-3-319-23862-3_42

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23861-6

  • Online ISBN: 978-3-319-23862-3

  • eBook Packages: Computer Science (R0)
