Accelerated Kmeans Clustering Using Binary Random Projection

Choi, Yukyung; Park, Chaehoon; Kweon, In So

doi:10.1007/978-3-319-16808-1_18

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9004)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

Codebooks have been widely used for image retrieval and image indexing, which are the core elements of mobile visual search. Building a vocabulary tree is carried out offline, because clustering a large amount of training data takes a long time. Recently proposed adaptive vocabulary trees do not require offline training, but suffer from the burden of online computation. The need to cluster large, high-dimensional data has therefore arisen in both offline and online training. In this paper, we present a novel clustering method that reduces the burden of computation without losing accuracy. Feature selection is used to reduce the computational complexity with high-dimensional data, and an ensemble learning model is used to improve the efficiency with a large amount of data. We demonstrate that the proposed method outperforms state-of-the-art approaches in terms of computational complexity on various synthetic and real datasets.
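The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea suggested by the title: project the data with a random 0/1 matrix, run k-means in the reduced space, then recompute the centroids in the original space. It is not the authors' method. The function names, the fixed number of ones per row, the farthest-first initialization, and the toy data are all assumptions made for illustration, and the ensemble step mentioned in the abstract is omitted.

```python
import random

def binary_projection_rows(dim, target_dim, ones_per_row, rng):
    # Assumption: each row of the 0/1 matrix has a fixed number of ones.
    # A row is stored as the list of feature indices whose entry is 1.
    return [rng.sample(range(dim), ones_per_row) for _ in range(target_dim)]

def project(points, rows):
    # Multiplying by a 0/1 matrix reduces to summing the selected features.
    return [[sum(p[j] for j in row) for row in rows] for p in points]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    return [sum(col) / len(points) for col in zip(*points)]

def farthest_first_init(points, k):
    # Deterministic seeding (an illustrative choice): start from the first
    # point, then repeatedly add the point farthest from the chosen centroids.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids

def lloyd_kmeans(points, k, max_iter=50):
    # Standard Lloyd iterations: assign to nearest centroid, then update.
    centroids = farthest_first_init(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean(members)
    return labels

def projected_kmeans(points, k, target_dim, ones_per_row=4, seed=0):
    rng = random.Random(seed)
    rows = binary_projection_rows(len(points[0]), target_dim, ones_per_row, rng)
    labels = lloyd_kmeans(project(points, rows), k)
    # Recover centroids in the original space from the low-dimensional labels.
    centroids = []
    for c in range(k):
        members = [p for p, l in zip(points, labels) if l == c]
        centroids.append(mean(members) if members else None)
    return labels, centroids

# Toy usage: two well-separated 16-dimensional blobs, clustered in 4 dimensions.
data_rng = random.Random(1)
blob_a = [[data_rng.gauss(0.0, 0.5) for _ in range(16)] for _ in range(20)]
blob_b = [[data_rng.gauss(10.0, 0.5) for _ in range(16)] for _ in range(20)]
labels, centroids = projected_kmeans(blob_a + blob_b, k=2, target_dim=4)
```

In this sketch every distance computation inside the k-means loop runs over 4 coordinates instead of 16, which is where a projection-based speed-up would come from. Whether accuracy is preserved depends on the data and on the projection, which the sketch does not evaluate.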

Notes

1. http://personalpages.manchester.ac.uk/mbs/Julia.Handl/generators.html

Acknowledgement

We would like to thank Greg Hamerly and Yudeog Han for their support. This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (No. 2010-0028680).

Author information

Authors and Affiliations

Robotics and Computer Vision Lab., KAIST, Daejeon, Korea
Yukyung Choi, Chaehoon Park & In So Kweon

Corresponding author

Correspondence to In So Kweon.

Editor information

Editors and Affiliations

Technische Universität München, Garching, Bayern, Germany
Daniel Cremers
University of Adelaide, Adelaide, South Australia, Australia
Ian Reid
Keio University, Yokohama, Kanagawa, Japan
Hideo Saito
University of California at Merced, Merced, California, USA
Ming-Hsuan Yang

About this paper

Cite this paper

Choi, Y., Park, C., Kweon, I.S. (2015). Accelerated Kmeans Clustering Using Binary Random Projection. In: Cremers, D., Reid, I., Saito, H., Yang, MH. (eds) Computer Vision -- ACCV 2014. ACCV 2014. Lecture Notes in Computer Science, vol 9004. Springer, Cham. https://doi.org/10.1007/978-3-319-16808-1_18

DOI: https://doi.org/10.1007/978-3-319-16808-1_18
Published: 16 April 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16807-4
Online ISBN: 978-3-319-16808-1
eBook Packages: Computer Science, Computer Science (R0)