Scene recognition with bag of visual nouns and prepositions

Stalbaum, John; Chae, Hee-Won; Song, Jae-Bok

doi:10.1007/s11370-015-0167-0

Original Research Paper
Published: 24 March 2015

Intelligent Service Robotics, Volume 8, pages 115–125 (2015)

Abstract

The loop closure problem is central to topological simultaneous localization and mapping (SLAM): by associating features between distant portions of a trajectory, the odometry error accumulated between two observations can be eliminated and a more consistent map can be built. Bayesian pattern recognition techniques such as bag of visual words (BoVW) have recently shown outstanding results in solving the loop closure problem entirely in image space using simple, inexpensive cameras, without requiring highly accurate metric information, 3D reconstruction, or camera calibration. In this paper, a modified BoVW descriptor that incorporates simple geometric relationships within an image is used with the fast appearance-based mapping (FAB-MAP) algorithm. In direct comparisons with the traditional BoVW descriptor, an improved recall rate is observed with an acceptable increase in computational time. The proposal of a BoVW-compatible descriptor and its use with a well-known BoVW classifier demonstrate that the BoVW metaphor can be generalized, which could pave the way for a wider variety of BoVW descriptors, in the same way that many individual visual feature descriptors exist within the computer vision community.
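
To make the idea of adding "simple geometric relationships" to a BoVW descriptor concrete, the following is a minimal illustrative sketch in Python. It is not the descriptor proposed in the paper, whose details are not given in the abstract. It assumes each image feature has already been quantized to a visual word id, and it uses a hypothetical four-way relation (`left_of`, `right_of`, `above`, `below`) between feature pairs; the function names and the relation set are assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def bovw_histogram(features, vocab_size):
    """Plain BoVW: count how often each visual word occurs in one image.

    `features` is a list of (x, y, word_id) tuples (an assumed input format).
    """
    hist = [0] * vocab_size
    for _x, _y, word in features:
        hist[word] += 1
    return hist


def relation(a, b):
    """Hypothetical coarse relation of point a to point b.

    Uses image coordinates, where y grows downward.
    """
    dx, dy = b[0] - a[0], b[1] - a[1]
    if abs(dx) >= abs(dy):
        return "left_of" if dx > 0 else "right_of"
    return "above" if dy > 0 else "below"


def relation_bag(features):
    """Count (word_a, relation, word_b) triples over all feature pairs.

    This is an assumed way to encode pairwise geometry, not the paper's method.
    """
    bag = Counter()
    for f, g in combinations(features, 2):
        # Put each pair in a canonical order so the key does not depend
        # on the order in which features were detected.
        f, g = sorted((f, g), key=lambda t: (t[2], t[0], t[1]))
        bag[(f[2], relation(f[:2], g[:2]), g[2])] += 1
    return bag


features = [(10, 20, 0), (50, 22, 1), (12, 80, 2)]
hist = bovw_histogram(features, vocab_size=3)
bag = relation_bag(features)
```

In this sketch the relation bag is itself a sparse count vector, so in principle it could be fed to a classifier that expects BoVW-style input. Enumerating all pairs is quadratic in the number of features, which is one reason a descriptor of this kind would cost more to compute than a plain histogram.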

Acknowledgments

This research was supported by the MOTIE under the Industrial Foundation Technology Development Program supervised by KEIT (No. 10051155) and by the Basic Science Research Program through the NRF funded by MSIP (No. 2007-0056094).

Author information

Authors and Affiliations

Department of Mechanical Engineering, Korea University, 5, Anam-dong, Sungbuk-ku, Seoul, 136-713, South Korea
John Stalbaum, Hee-Won Chae & Jae-Bok Song

Corresponding author

Correspondence to Jae-Bok Song.

About this article

Cite this article

Stalbaum, J., Chae, HW. & Song, JB. Scene recognition with bag of visual nouns and prepositions. Intel Serv Robotics 8, 115–125 (2015). https://doi.org/10.1007/s11370-015-0167-0

Received: 16 February 2015
Accepted: 08 March 2015
Published: 24 March 2015
Issue Date: April 2015
DOI: https://doi.org/10.1007/s11370-015-0167-0
