Abstract
Visual big data is a significant research topic because of its diverse applications. In this paper, a new visual detection method for transportation objects is proposed, based on probabilistic latent semantic analysis (PLSA) of visual data. We detect distinctiveness by integrating three steps: first, building the co-occurrence matrix of the images, which are vectorized using the bag of visual words (BoVW) framework; then, calculating the histograms of the visual words for each class; and finally, representing the test images with the visual words. A multilayer perceptron (MLP) is used as the classification method in our system. The visual words are extracted by sampling patches from the current image. A new neural network topology for the BoVW model is proposed, and the learning rate is managed by reducing it at specific iterations. PLSA is compared with the MLP on the Caltech-256 dataset. The classes used include cars, motorbikes, and horses. The experimental results show that the MLP outperforms current methods in predicting transportation objects and properly approximates the transportation detection function from the extracted local features. The proposed method yields about 4.4% higher accuracy than conventional PLSA for all classes.
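The BoVW histogram step and the stepwise learning-rate reduction described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the codebook (assumed to be given, for example from k-means clustering), the nearest-center assignment rule, and all numeric values (`base_lr`, `milestones`, `factor`) are hypothetical choices made for illustration only.

```python
from bisect import bisect_right


def bovw_histogram(descriptors, codebook):
    """Return a normalized visual-word histogram for one image.

    descriptors: feature vectors sampled from image patches.
    codebook: visual-word centers (assumed given, e.g. from k-means).
    """
    counts = [0] * len(codebook)
    for d in descriptors:
        # Assign the descriptor to its nearest visual word
        # (squared Euclidean distance; an assumed assignment rule).
        dists = [sum((a - b) ** 2 for a, b in zip(d, c)) for c in codebook]
        counts[dists.index(min(dists))] += 1
    total = sum(counts)
    # Normalize so images with different patch counts are comparable.
    return [c / total for c in counts] if total else [0.0] * len(codebook)


def step_learning_rate(iteration, base_lr=0.1, milestones=(1000, 2000), factor=0.1):
    """Multiply the learning rate by `factor` at each milestone iteration.

    All default values are hypothetical, not taken from the paper.
    """
    return base_lr * factor ** bisect_right(milestones, iteration)


if __name__ == "__main__":
    # Toy example: four 2-D descriptors and a two-word codebook.
    toy_descriptors = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.1]]
    toy_codebook = [[0.0, 0.0], [1.0, 1.0]]
    print(bovw_histogram(toy_descriptors, toy_codebook))
    print([step_learning_rate(i) for i in (0, 1000, 2500)])
```

In a pipeline of this kind, the resulting histogram would serve as the fixed-length input vector to the classifier, and the schedule would be queried once per training iteration.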
Acknowledgements
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education, Science and Technology (NRF-2010-0025512).
Cite this article
Song, H.C., Choi, K.N. Transportation Object Detection with Bag of Visual Words Model by PLSA and MLP. Mobile Netw Appl 23, 1103–1110 (2018). https://doi.org/10.1007/s11036-018-1075-2
DOI: https://doi.org/10.1007/s11036-018-1075-2