Wildlife video key-frame extraction based on novelty detection in semantic context

Multimedia Tools and Applications

Abstract

There is growing evidence that visual saliency can be better modeled using top-down mechanisms that incorporate object semantics. This suggests a new direction for image and video analysis, where semantics extraction can be used to improve video summarization, indexing and retrieval. This paper presents a framework that models semantic context for key-frame extraction. The semantic context of video frames is extracted and its sequential changes are monitored, so that significant novelties can be located using a one-class classifier. Working with wildlife video frames, the framework performs image segmentation, feature extraction and matching of image blocks, and then constructs a co-occurrence matrix of semantic labels to represent the semantic context within the scene. Experiments show that our approach, which uses high-level semantic modeling, achieves better key-frame extraction than counterparts that use low-level features.
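
The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes each frame has already been segmented into a grid of blocks carrying semantic labels, and it substitutes a simple distance-to-centroid threshold for the one-class classifier, which the abstract does not specify. The label set, the `slack` factor, the `n_train` parameter, and all function and class names are hypothetical.

```python
from itertools import product
from math import sqrt

# Hypothetical label set; the paper's actual semantic labels are not given here.
LABELS = ["sky", "grass", "water", "animal"]


def cooccurrence(grid, labels=LABELS):
    """Flattened, normalised co-occurrence matrix of labels in adjacent blocks."""
    idx = {name: i for i, name in enumerate(labels)}
    n = len(labels)
    counts = [[0] * n for _ in range(n)]
    rows, cols = len(grid), len(grid[0])
    for r, c in product(range(rows), range(cols)):
        for dr, dc in ((0, 1), (1, 0)):  # right and down neighbours only
            rr, cc = r + dr, c + dc
            if rr < rows and cc < cols:
                a, b = idx[grid[r][c]], idx[grid[rr][cc]]
                counts[a][b] += 1  # count the pair in both orders so the
                counts[b][a] += 1  # matrix is symmetric
    total = sum(sum(row) for row in counts) or 1
    return [v / total for row in counts for v in row]


class CentroidNoveltyDetector:
    """Stand-in one-class model: novel if far from the training centroid."""

    def fit(self, vectors, slack=1.5):
        n = len(vectors)
        self.centre = [sum(col) / n for col in zip(*vectors)]
        # Assumed rule: threshold is `slack` times the largest training distance.
        self.threshold = slack * max(self._dist(v) for v in vectors)
        return self

    def _dist(self, v):
        return sqrt(sum((x - m) ** 2 for x, m in zip(v, self.centre)))

    def is_novel(self, v):
        return self._dist(v) > self.threshold


def key_frames(frames, n_train=3):
    """Indices of frames whose semantic context is novel w.r.t. the first n_train."""
    vecs = [cooccurrence(f) for f in frames]
    detector = CentroidNoveltyDetector().fit(vecs[:n_train])
    return [i for i, v in enumerate(vecs) if i >= n_train and detector.is_novel(v)]


# Toy 2x3 label grids standing in for segmented, labelled frames.
plain = [["sky", "sky", "sky"], ["grass", "grass", "grass"]]
drift = [["sky", "sky", "grass"], ["grass", "grass", "grass"]]
event = [["sky", "sky", "sky"], ["grass", "animal", "grass"]]
frames = [plain, drift, plain, plain, event, plain]

print(key_frames(frames))  # [4]
```

In this toy sequence only the frame in which an "animal" block appears is flagged, because its label co-occurrences differ from those seen in the first three frames by more than the assumed threshold. A real system would likely update the model as the video progresses rather than train once on the opening frames.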

Notes

  1. http://www.youtube.com/watch?v=UGDEnpCgGOI

  2. http://www.youtube.com/watch?v=zIBX6GunjIE

  3. http://www.youtube.com/watch?v=O4Q9_DmElbI&NR=1&feature=fvwp

Author information

Corresponding author

Correspondence to Jeremiah D. Deng.

About this article

Cite this article

Yong, SP., Deng, J.D. & Purvis, M.K. Wildlife video key-frame extraction based on novelty detection in semantic context. Multimed Tools Appl 62, 359–376 (2013). https://doi.org/10.1007/s11042-011-0902-2

  • DOI: https://doi.org/10.1007/s11042-011-0902-2
