An automatic video annotation framework based on two level keyframe extraction mechanism

Aote, Shailendra S.; Potnurwar, Archana

doi:10.1007/s11042-018-6826-3

Published: 07 November 2018

Volume 78, pages 14465–14484, (2019)

Multimedia Tools and Applications

Shailendra S. Aote¹ &
Archana Potnurwar²


Abstract

The rapid growth of audio, video and other digital data on the internet underscores the importance of video annotation techniques. This paper develops a hybrid algorithm for automatic video annotation (VA). The algorithm aims to improve performance and precision while reducing the time required to obtain annotations. The overall process comprises efficient shot detection, followed by two-level keyframe extraction and a saliency-based residual approach for feature extraction. For every stage of VA (shot detection, keyframe extraction and feature extraction), factors that improve performance are addressed. Shot detection combines color histogram difference (CHD) and edge change ratio (ECR), as these are two of the most promising techniques for the task. A new idea is proposed to fine-tune keyframe extraction by extracting keyframes at two levels. At the first level, the first frame of each shot is taken as a keyframe. To remove redundancy, the second level then finds an optimal set of keyframes using the fuzzy c-means clustering technique. Color and texture features are used for feature extraction. The video annotation process is divided into two stages, training and testing. A weight vector is learned in the training stage. Based on this vector, a similarity array is calculated in the testing phase, which is then used to find the corrected annotations. The proposed method is compared with OMG-SSL and MMT-MGO and gives better results on the TRECVID dataset. The significance of using the weight vector is also shown experimentally.
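The two-level keyframe idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes frames are H x W x 3 uint8 NumPy arrays, uses only a color histogram difference for shot detection (the edge change ratio is omitted), and applies a plain fuzzy c-means over the histograms of the first frames of the shots. All function names, the bin count, the threshold and the cluster count are hypothetical choices made for illustration.

```python
import numpy as np

def color_histogram(frame, bins=8):
    """Concatenated per-channel histogram of a frame, normalized to sum to 1."""
    hists = [np.histogram(frame[..., c], bins=bins, range=(0, 256))[0]
             for c in range(frame.shape[-1])]
    h = np.concatenate(hists).astype(float)
    return h / h.sum()

def detect_shot_starts(frames, threshold=0.5):
    """Return indices where the L1 histogram difference to the previous
    frame exceeds the (assumed) threshold; index 0 always starts a shot."""
    hists = [color_histogram(f) for f in frames]
    starts = [0]
    for i in range(1, len(hists)):
        if np.abs(hists[i] - hists[i - 1]).sum() > threshold:
            starts.append(i)
    return starts

def fuzzy_c_means(X, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means: returns cluster centers and the membership matrix U."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)          # each row sums to 1
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Distances of every sample to every center (small offset avoids 0)
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

def select_keyframes(frames, n_clusters=2, threshold=0.5):
    """Level one: first frame of each shot. Level two: keep, per fuzzy
    cluster, the candidate with the highest membership."""
    starts = detect_shot_starts(frames, threshold)
    feats = np.array([color_histogram(frames[i]) for i in starts])
    c = min(n_clusters, len(starts))
    _, U = fuzzy_c_means(feats, c)
    picks = {starts[int(np.argmax(U[:, k]))] for k in range(c)}
    return sorted(picks)
```

Calling `select_keyframes(frames)` on a list of frames returns the indices of the retained keyframes. Candidates from visually similar shots fall into the same cluster, so only one of them survives the second level, which is the redundancy removal the abstract refers to.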



Author information

Authors and Affiliations

Department of CSE, Shri Ramdeobaba College of Engineering and Management, Nagpur, India
Shailendra S. Aote
Department of IT, Priyadarshini Institute of Engineering & Technology, Nagpur, India
Archana Potnurwar

Corresponding author

Correspondence to Shailendra S. Aote.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Aote, S.S., Potnurwar, A. An automatic video annotation framework based on two level keyframe extraction mechanism. Multimed Tools Appl 78, 14465–14484 (2019). https://doi.org/10.1007/s11042-018-6826-3

Received: 25 May 2018
Revised: 22 October 2018
Accepted: 24 October 2018
Published: 07 November 2018
Issue Date: 15 June 2019
DOI: https://doi.org/10.1007/s11042-018-6826-3
