Transportation Object Detection with Bag of Visual Words Model by PLSA and MLP

Song, Hyun Chul; Choi, Kwang Nam

doi:10.1007/s11036-018-1075-2

Published: 11 July 2018

Volume 23, pages 1103–1110, (2018)

Mobile Networks and Applications

Hyun Chul Song¹ &
Kwang Nam Choi¹

Abstract

Visual big data is an important research topic because of its diverse applications. In this paper, a new visual detection method for transportation objects is proposed, based on probabilistic latent semantic analysis (PLSA) of visual data. Detection integrates three steps: first, building the co-occurrence matrix of the images, which are vectorized using the bag of visual words (BoVW) framework; second, calculating the histograms of the visual words of each class; and finally, applying the visual words to the test images. A multilayer perceptron (MLP) is used as the classifier in our system. The visual words are extracted by sampling patches from the current image. A new neural network topology for the BoVW model is proposed, and the learning rate is managed by reducing it at specific iterations. PLSA is compared with the MLP on the Caltech-256 dataset, using classes that include cars, motorbikes, and horses. The experimental results show that the MLP outperforms current methods in predicting transportation objects and properly approximates the transportation detection function from the extracted local features. The proposed method yields about 4.4% higher accuracy than conventional PLSA for all classes.
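
The BoVW steps named in the abstract (patch sampling, visual words, per-image histograms) and the learning-rate reduction at specific iterations can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the patch descriptor, the codebook construction, or the schedule, so raw pixel patches, a plain k-means codebook, and the `milestones` and `factor` values below are all assumptions, and every function name is hypothetical.

```python
import numpy as np


def sample_patches(image, patch_size=8, stride=8):
    """Densely sample square patches and flatten each into a descriptor.

    Assumption: raw grayscale intensities stand in for the local features.
    """
    h, w = image.shape
    patches = [
        image[r:r + patch_size, c:c + patch_size].ravel()
        for r in range(0, h - patch_size + 1, stride)
        for c in range(0, w - patch_size + 1, stride)
    ]
    return np.asarray(patches, dtype=np.float64)


def assign_words(descriptors, codebook):
    """Return the index of the nearest visual word for each descriptor."""
    diff = descriptors[:, None, :] - codebook[None, :, :]
    return (diff ** 2).sum(axis=2).argmin(axis=1)


def build_codebook(descriptors, n_words, n_iter=10, seed=0):
    """Plain k-means (an assumption); each centroid is one visual word."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(descriptors), size=n_words, replace=False)
    codebook = descriptors[start].copy()
    for _ in range(n_iter):
        labels = assign_words(descriptors, codebook)
        for k in range(n_words):
            members = descriptors[labels == k]
            if len(members):  # keep the old centroid if a cluster is empty
                codebook[k] = members.mean(axis=0)
    return codebook


def bovw_histogram(image, codebook):
    """Normalized visual-word histogram of one image."""
    words = assign_words(sample_patches(image), codebook)
    counts = np.bincount(words, minlength=len(codebook)).astype(np.float64)
    return counts / counts.sum()


def step_decay(base_lr, iteration, milestones=(1000, 2000), factor=0.1):
    """Multiply the learning rate by `factor` at each milestone reached.

    The milestone iterations and the factor are hypothetical values.
    """
    passed = sum(iteration >= m for m in milestones)
    return base_lr * factor ** passed


# Usage on random stand-in images (not Caltech-256 data).
rng = np.random.default_rng(0)
images = rng.random((4, 32, 32))
descriptors = np.vstack([sample_patches(im) for im in images])
codebook = build_codebook(descriptors, n_words=5)
histograms = np.vstack([bovw_histogram(im, codebook) for im in images])
```

Stacking the per-image word counts row by row gives an image-by-word matrix of the kind PLSA factorizes, while the same histograms can serve as fixed-length input vectors for an MLP classifier trained with a schedule such as `step_decay`.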

Acknowledgements

This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (NRF-2010-0025512).

Author information

Authors and Affiliations

School of Computer Science and Engineering, Chung-Ang University, Chung-Ang, Korea
Hyun Chul Song & Kwang Nam Choi

Corresponding author

Correspondence to Kwang Nam Choi.

About this article

Cite this article

Song, H.C., Choi, K.N. Transportation Object Detection with Bag of Visual Words Model by PLSA and MLP. Mobile Netw Appl 23, 1103–1110 (2018). https://doi.org/10.1007/s11036-018-1075-2

Published: 11 July 2018
Issue Date: August 2018
DOI: https://doi.org/10.1007/s11036-018-1075-2
