
Accelerating Infinite Ensemble of Clustering by Pivot Features

Published in: Cognitive Computation

Abstract

Infinite ensemble clustering (IEC) combines ensemble clustering with representation learning by fusing infinitely many basic partitions, and shows appealing performance in the unsupervised setting. However, it must solve a linear equation system whose time complexity grows as O(d³), where d is the concatenated dimension of the many basic clustering results. Inspired by the cognitive characteristic of human memory, which attends to pivot features in a more compressed data space, we propose an accelerated version of IEC (AIEC) that extracts pivot features and learns multiple mappings to reconstruct them, so that the linear equation system can be solved in O(dr²) time (r ≪ d). Experimental results on standard image and text datasets show that AIEC greatly reduces the running time of IEC while achieving comparable clustering performance.
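The O(dr²) saving comes from replacing a d×d normal-equation solve with an r×r one applied to d right-hand sides. A minimal NumPy sketch of that idea on toy data (the ridge parameter, the reconstruction-from-pivots formulation, and pivot selection by column variance are illustrative assumptions, not the paper's exact method):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the concatenated basic-partition matrix:
# n samples, d features stacked from many clustering results.
n, d, r = 500, 200, 20
X = rng.standard_normal((n, d))
lam = 1e-3  # ridge regularizer (assumed)

# IEC-style closed form: a d x d linear system, O(d^3) to solve.
W_full = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ X)  # d x d

# AIEC-style: keep r pivot features and learn a mapping that
# reconstructs all d features from the pivots.
pivot = np.argsort(X.var(axis=0))[-r:]
P = X[:, pivot]                        # n x r pivot block

# The normal equations now involve only an r x r Gram matrix; solving
# it against d right-hand sides costs O(r^3 + d*r^2) ~ O(d*r^2), r << d.
G = P.T @ P + lam * np.eye(r)          # r x r
W = np.linalg.solve(G, P.T @ X)        # r x d mapping
X_rec = P @ W                          # reconstruct all d features
```

With d = 200 and r = 20 the small system is 10× narrower, so the cubic solve shrinks by roughly d/r in this sketch; the pivot columns themselves are reconstructed almost exactly, while the quality on the remaining columns depends on how informative the pivots are.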


[Figures 1–10: thumbnails omitted]


Notes

  1. https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html

  2. https://archive.ics.uci.edu/ml/datasets/letter+recognition

  3. http://yann.lecun.com/exdb/mnist/

  4. http://www.cad.zju.edu.cn/home/dengcai/Data/ORL/ORL_32x32.mat

  5. http://www.pascal-network.org/challenges/VOC/voc2007/workshop/index.html

  6. https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html#news20



Funding

This work was partially supported by the Fundamental Research Funds for the Henan Provincial Colleges and Universities at the Henan University of Technology (2016RCJH06), the National Key Research & Development Program (2016YFD0400104-5), the National Basic Research Program of China (2012CB316301), and the National Natural Science Foundation of China (61103138 and 61473236).

Author information


Corresponding author

Correspondence to Xiao-Bo Jin.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

Informed Consent

Informed consent was obtained from all individual participants included in the study.


About this article


Cite this article

Jin, XB., Xie, GS., Huang, K. et al. Accelerating Infinite Ensemble of Clustering by Pivot Features. Cogn Comput 10, 1042–1050 (2018). https://doi.org/10.1007/s12559-018-9583-8

