Abstract
The main objective of this paper is to design a system that generates a visual dictionary based on a multimodal, nonparametric Bayesian model and multilayered probabilistic latent semantic analysis (pLSA), referred to as BM-MpLSA. Advances in technology and the enthusiasm of sports fans have created a demand for automatic action recognition in live sports video feeds. A fundamental requirement for such a model is a visual dictionary for each sports domain. The proposed multimodal nonparametric model introduces two novel co-occurrence matrices: one for the image feature vectors and one for the textual entities. These matrices provide a basic scaling parameter for the unobserved random variables, and the model is an extension of multilayered pLSA-based visual dictionary creation. This paper concentrates on creating a visual dictionary for basketball. The feature vector extracted from the sports event images combines SIFT with the MPEG-7 dominant color, color layout, scalable color, and edge histogram descriptors. After these vector values are quantized and analyzed, the visual vocabulary is created by integrating them into a domain-specific visual ontology for semantic understanding. The accuracy of the approach is evaluated with respect to the action depicted in each image.
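As a rough illustration of the quantization and latent-topic steps mentioned in the abstract, the sketch below builds an image-by-visual-word co-occurrence matrix and fits plain single-layer pLSA to it with EM. It is not the BM-MpLSA model itself: the random vectors standing in for SIFT descriptors, the random codebook (in practice a clustering step such as k-means would produce it), the matrix sizes, and the function names `quantize` and `plsa` are all assumptions made for the example.

```python
import numpy as np


def quantize(descriptors, codebook):
    """Map each descriptor to its nearest codeword and return a count histogram."""
    # Squared Euclidean distance between every descriptor and every codeword.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)
    return np.bincount(words, minlength=len(codebook))


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit single-layer pLSA by EM on an (n_docs, n_words) count matrix.

    Returns P(z|d) with shape (n_docs, n_topics) and
    P(w|z) with shape (n_topics, n_words).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    eps = 1e-12  # guards against division by zero

    # Random row-normalized initialization of both conditional distributions.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]        # (docs, topics, words)
        post = joint / (joint.sum(axis=1, keepdims=True) + eps)

        # M-step: re-estimate both distributions from expected counts.
        weighted = counts[:, None, :] * post                  # (docs, topics, words)
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps

    return p_z_d, p_w_z


# Synthetic stand-in data (assumption): 6 images, 40 descriptors each,
# 16-dimensional descriptors, and a codebook of 8 visual words.
rng = np.random.default_rng(42)
codebook = rng.normal(size=(8, 16))
images = [rng.normal(size=(40, 16)) for _ in range(6)]

# Co-occurrence matrix: rows are images, columns are visual words.
counts = np.stack([quantize(desc, codebook) for desc in images])

p_z_d, p_w_z = plsa(counts, n_topics=3)
print(counts.shape, p_z_d.shape, p_w_z.shape)
```

Each row of `p_z_d` is a topic mixture for one image, and each row of `p_w_z` is a distribution over visual words for one latent topic. A multimodal variant would build a second co-occurrence matrix over textual entities in the same way.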
Cite this article
Nagarajan, G., Minu, R.I. & Jayanthila Devi, A. Optimal Nonparametric Bayesian Model-Based Multimodal BoVW Creation Using Multilayer pLSA. Circuits Syst Signal Process 39, 1123–1132 (2020). https://doi.org/10.1007/s00034-019-01307-7
DOI: https://doi.org/10.1007/s00034-019-01307-7