GenSpecVidOnt: a reference ontology for knowledge based video analytics with multimodal genre detection

Sreeja, M. U.; Kovoor, Binsu C.

doi:10.1007/s11042-023-15040-6

Published: 11 March 2023

Volume 82, pages 35815–35852 (2023)

Multimedia Tools and Applications

Abstract

Video analytics is the process of automatically analysing a video for spatial and temporal events. Effective video analytics requires exploiting genre-related information, which leads to knowledge-based video analysis. The proposed model implements an ontology for inferring genre-specific information, GenSpecVidOnt (Genre Specific Video Ontology), based on a combined multimodal feature set and ontology inference. The features used to detect the video genre are dominant colour, aural and visual quality, camera status, shot boundaries and redundancy factor. The proposed ontology is one of a kind in that it bridges the semantic gap prevailing in current video analytics by exploiting the genre-related information of videos. The architecture of the ontology has been verified and validated to demonstrate its efficacy. Quantitative evaluation of the proposed multimodal genre detection phase shows that the model achieves precision, recall, F-score and accuracy of 73.16%, 75.68%, 72.06% and 71.43% respectively, higher than those of prominent machine-learning-based classifiers. The results further substantiate the relevance of the multimodal feature set for accurate genre detection, as well as the superiority of the model in extracting domain-related information for knowledge-based video analytics through ontology inference.
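
The abstract reports four evaluation figures without stating how they are aggregated over genres. A minimal sketch of one common convention, macro-averaging over a multi-class confusion matrix, is given below. The averaging scheme is an assumption, not something the abstract confirms, and the function name `macro_metrics` and the example matrix are hypothetical. Macro-averaging is at least consistent with the reported F-score (72.06%) lying below both precision and recall: a harmonic mean of the two aggregate values would fall between them, whereas an average of per-genre F-scores need not.

```python
from typing import Dict, List


def macro_metrics(confusion: List[List[int]]) -> Dict[str, float]:
    """Macro-averaged precision, recall and F-score, plus overall accuracy.

    confusion[i][j] is the number of videos whose true genre is i and
    whose predicted genre is j (a square matrix, one row per genre).
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))

    precisions, recalls, f_scores = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        # A genre that is never predicted (or never present) scores 0.
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f_scores.append(f)

    return {
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f_score": sum(f_scores) / n,  # mean of per-genre F-scores
        "accuracy": correct / total if total else 0.0,
    }


if __name__ == "__main__":
    # Invented 3-genre confusion matrix, for illustration only;
    # it is not data from the paper.
    example = [
        [5, 1, 0],
        [2, 3, 1],
        [0, 1, 7],
    ]
    for name, value in macro_metrics(example).items():
        print(f"{name}: {value:.4f}")
```

Under micro-averaging, by contrast, precision, recall and F-score would all equal the accuracy in a single-label setting, which the reported figures do not.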

Data availability

The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.

Funding

The authors did not receive support from any organization for the submitted work.

Author information

Authors and Affiliations

Division of Information Technology, Cochin University of Science and Technology, Kochi, Kerala, India
M. U. Sreeja & Binsu C. Kovoor

Corresponding author

Correspondence to M. U. Sreeja.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Consent for publication

We give our consent for this article to be published in the Springer Journal of Multimedia Tools and Applications.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sreeja, M.U., Kovoor, B.C. GenSpecVidOnt: a reference ontology for knowledge based video analytics with multimodal genre detection. Multimed Tools Appl 82, 35815–35852 (2023). https://doi.org/10.1007/s11042-023-15040-6

Received: 10 November 2021
Revised: 06 October 2022
Accepted: 27 February 2023
Published: 11 March 2023
Issue Date: September 2023
DOI: https://doi.org/10.1007/s11042-023-15040-6
