Abstract
Video analytics refers to the process of automatically analysing a video for spatial and temporal events. Effective video analytics requires the exploitation of genre-related information, leading to knowledge-based video analysis. The proposed model implements an ontology for inferring genre-specific information, namely GenSpecVidOnt (Genre Specific Video Ontology), based on a combined multimodal feature set and ontology inference. The features collectively used for detecting the video genre are dominant colour, aural and visual quality, camera status, shot boundaries and redundancy factor. The proposed ontology is one of a kind in that it bridges the semantic gap prevailing in current video analytics by exploiting genre-related information of videos. The architecture of the ontology has been verified and validated, demonstrating its efficacy. Quantitative evaluation of the proposed multimodal genre detection phase shows that the model achieves higher precision, recall, F-score and accuracy values of 73.16%, 75.68%, 72.06% and 71.43%, respectively, than prominent machine learning based classifiers. The results further substantiate the relevance of the multimodal feature set identified in the proposed model for accurate genre detection, along with the superiority of the model in extracting domain-related information for knowledge-based video analytics based on ontology inferences.
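The four evaluation measures quoted above can be illustrated with a minimal sketch. The abstract does not state how the scores were averaged across genres, so macro averaging (an unweighted mean over classes) is an assumption here. It is at least consistent with the reported figures: an F-score below both precision and recall cannot arise from a single harmonic mean of the two, but it can when per-class scores are averaged. The genre labels and predictions below are hypothetical and are not taken from the study.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Macro-averaged precision, recall, F-score and overall accuracy.

    Assumption: unweighted per-class averaging; the paper's abstract
    does not specify the averaging scheme it used.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted genre gains a false positive
            fn[truth] += 1  # true genre gains a false negative

    precisions, recalls, fscores = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        p = tp[label] / p_den if p_den else 0.0
        r = tp[label] / r_den if r_den else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        fscores.append(f)

    n = len(labels)
    return {
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f_score": sum(fscores) / n,
        "accuracy": sum(tp.values()) / len(y_true),
    }


# Hypothetical genre labels, for illustration only.
y_true = ["sports", "sports", "news", "news", "cartoon", "cartoon"]
y_pred = ["sports", "news", "news", "news", "cartoon", "sports"]
scores = macro_metrics(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

On this toy input the macro F-score (about 0.656) falls below both the macro precision (about 0.722) and the macro recall (about 0.667), the same ordering seen in the reported percentages.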
Data availability
The datasets generated and analysed during the current study are available from the corresponding author on reasonable request.
Funding
The authors did not receive support from any organization for the submitted work.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Consent for publication
We give our consent for this article to be published in the Springer Journal of Multimedia Tools and Applications.
About this article
Cite this article
Sreeja, M.U., Kovoor, B.C. GenSpecVidOnt: a reference ontology for knowledge based video analytics with multimodal genre detection. Multimed Tools Appl 82, 35815–35852 (2023). https://doi.org/10.1007/s11042-023-15040-6
DOI: https://doi.org/10.1007/s11042-023-15040-6