Landmark recognition with compact BoW histogram and ensemble ELM

Multimedia Tools and Applications

Abstract

With the rapid development of mobile devices, landmark recognition on mobile platforms has been widely studied in recent years. Because mobile users expect fast responses, an accurate and efficient landmark recognition system is urgently needed. In this paper, we propose a landmark recognition framework that combines a novel discriminative feature selection method with an improved extreme learning machine (ELM) algorithm. A scalable vocabulary tree (SVT) is first used to generate a set of preliminary codewords for landmark images. An efficient codebook learning algorithm, derived from word mutual information and the Visual Rank technique, is then proposed to filter out unimportant codewords. The selected visual words serve as the codebook for image encoding and produce a compact Bag-of-Words (BoW) histogram. The fast ELM algorithm and an ensemble of ELM classifiers are used for landmark recognition. Experiments on the Nanyang Technological University campus landmark database and the Fifteen Scene database illustrate the advantages of the proposed framework.
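
To make the classification stage concrete, the following is a minimal sketch of a basic ELM (random hidden layer, output weights from regularised least squares) combined by simple majority voting. It is not the improved ELM or the exact ensemble scheme of the paper, whose details are not given in the abstract. The function names, the ridge term `reg`, the sigmoid activation, and the toy histogram generator are all illustrative assumptions, and the compact BoW histograms are assumed to be computed already.

```python
import numpy as np


def _hidden(X, W, b):
    # Sigmoid activation of the random hidden layer.
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))


def train_elm(X, y, n_classes, n_hidden, rng, reg=1e-3):
    """Basic ELM: hidden weights are random and fixed; only beta is solved for."""
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights
    b = rng.standard_normal(n_hidden)                # random hidden biases
    H = _hidden(X, W, b)
    T = np.eye(n_classes)[y]                         # one-hot class targets
    # Ridge-regularised least squares (reg is an assumed hyperparameter).
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ T)
    return W, b, beta


def predict_elm(model, X):
    W, b, beta = model
    return np.argmax(_hidden(X, W, b) @ beta, axis=1)


def predict_ensemble(models, X, n_classes):
    """Majority vote over independently initialised ELMs."""
    votes = np.zeros((X.shape[0], n_classes), dtype=int)
    rows = np.arange(X.shape[0])
    for model in models:
        votes[rows, predict_elm(model, X)] += 1
    return np.argmax(votes, axis=1)


def make_toy_histograms(rng, n_per_class=30, n_classes=3, words_per_class=4):
    """Hypothetical stand-in for BoW histograms: each class favours its own words."""
    y = np.repeat(np.arange(n_classes), n_per_class)
    dim = n_classes * words_per_class
    X = 0.05 * rng.random((y.size, dim))
    for c in range(n_classes):
        X[y == c, c * words_per_class:(c + 1) * words_per_class] += 1.0
    return X / X.sum(axis=1, keepdims=True), y       # L1-normalised rows
```

A typical use would train several models with the same data but different random hidden layers, for example `models = [train_elm(X, y, 3, 20, rng) for _ in range(5)]`, and then call `predict_ensemble(models, X_query, 3)`. Because each ELM only requires one linear solve, training an ensemble stays cheap, which is the property that makes the approach attractive for mobile response times.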

Acknowledgements

This work was supported by the National Natural Science Major Foundation of Research Instrumentation of P. R. China under Grant 61427808, the Key Foundation of P. R. China under Grant 61333009, and in part by the National Key Basic Research Program of P. R. China under Grant 2012CB821204. We would like to thank the reviewers and the Editor for their constructive comments and suggestions for improving our paper.

Author information

Corresponding author

Correspondence to Jiuwen Cao.

Cite this article

Cao, J., Chen, T. & Fan, J. Landmark recognition with compact BoW histogram and ensemble ELM. Multimed Tools Appl 75, 2839–2857 (2016). https://doi.org/10.1007/s11042-014-2424-1

  • DOI: https://doi.org/10.1007/s11042-014-2424-1
