Abstract
Discriminative clustering (DC) effectively integrates subspace selection and clustering into a coherent framework: it iteratively alternates between classical Linear Discriminant Analysis (LDA) dimensionality reduction and clustering. DC can effectively cluster high-dimensional data, but it has a complex form and high computational complexity. Recent work shows that DC is equivalent to kernel k-means (KM) with a specific kernel matrix. This new insight provides an opportunity to simplify the optimization problem in the original DC algorithm. Based on this equivalence, the Discriminative K-means (DKM) algorithm was proposed. When the number of data points (denoted n) is small, DKM is feasible and efficient. However, constructing the kernel matrix in DKM requires computing the inverse of a matrix, which is time consuming when n is large. In this paper, we concentrate on the efficiency of DC. We present a new framework for DC, namely Efficient DC (EDC), which consists of DKM and the whitening transformation of the regularized total scatter matrix (WRTS) followed by KM clustering (WRTS+KM). When m (the number of dimensions) is small and n far exceeds m (n ≫ m), EDC can carry out WRTS+KM on the data, which is more efficient than DKM. When n is small and m far exceeds n (m ≫ n), EDC can carry out DKM, which is more efficient. We also extend EDC to the soft case and propose Efficient Discriminative Maximum Entropy Clustering (EDMEC), an efficient version of maximum-entropy-based DC. Extensive experiments on a collection of benchmark data sets show the effectiveness of the proposed algorithms.
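The size-based dispatch described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the exact DKM kernel construction and the regularization scheme are assumptions, and `edc_cluster`, `lam`, and the branch threshold are hypothetical names chosen for the sketch. When n ≥ m it whitens the data with the inverse square root of a regularized total scatter matrix (an m × m operation) and runs k-means; otherwise it builds an n × n kernel from a regularized Gram-matrix inverse and runs k-means in the kernel's feature-space embedding.

```python
import numpy as np
from sklearn.cluster import KMeans

def edc_cluster(X, k, lam=1.0, random_state=0):
    """Hypothetical sketch of the EDC dispatch between WRTS+KM and DKM."""
    n, m = X.shape
    Xc = X - X.mean(axis=0)  # center the data

    if n >= m:
        # WRTS+KM branch (n >> m): whiten with (S_t + lam*I)^{-1/2}, then k-means.
        # Only an m x m eigendecomposition is needed, cheap when m is small.
        St = Xc.T @ Xc / n                       # total scatter matrix (m x m)
        evals, evecs = np.linalg.eigh(St + lam * np.eye(m))
        W = evecs @ np.diag(evals ** -0.5) @ evecs.T   # (S_t + lam*I)^{-1/2}
        Z = Xc @ W                               # whitened data
        return KMeans(n_clusters=k, n_init=10,
                      random_state=random_state).fit_predict(Z)
    else:
        # DKM branch (m >> n): kernel k-means with a kernel built from an
        # n x n matrix inverse (an assumed DKM-style kernel, for illustration).
        G = Xc @ Xc.T                            # Gram matrix (n x n)
        K = G @ np.linalg.inv(G + n * lam * np.eye(n))
        # Embed the symmetric PSD kernel and cluster in that feature space.
        evals, evecs = np.linalg.eigh(K)
        Phi = evecs * np.sqrt(np.clip(evals, 0.0, None))
        return KMeans(n_clusters=k, n_init=10,
                      random_state=random_state).fit_predict(Phi)
```

The point of the dispatch is that each branch inverts or decomposes only the smaller of the two matrices (m × m versus n × n), so the overall cost scales with min(n, m)³ rather than always with n³ as in plain DKM.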
Acknowledgements
This work is supported by the National Science Foundation of China (No. 61102095), the Science Plan Foundation of the Education Bureau of Shaanxi Province (No. 2010JK835, No. 14JK1661), the Natural Science Basic Research Plan in Shaanxi Province of China (No. 2014JM8307) and The Science and Technology Plan in Shaanxi Province of China (No. 2014KJXX-72).
Copyright information
© 2015 Springer International Publishing Switzerland
Cite this paper
Zhi, X.B., Fan, J.L. (2015). A New Algorithm for Discriminative Clustering and Its Maximum Entropy Extension. In: He, X., et al. (eds.) Intelligence Science and Big Data Engineering. Big Data and Machine Learning Techniques. IScIDE 2015. Lecture Notes in Computer Science, vol. 9243. Springer, Cham. https://doi.org/10.1007/978-3-319-23862-3_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23861-6
Online ISBN: 978-3-319-23862-3