Movie Scene Recognition Using Panoramic Frame and Representative Feature Patches

  • Regular Paper
Journal of Computer Science and Technology

Abstract

Recognizing scene information in images or videos, such as locating objects and answering “Where am I?”, has attracted much attention in the computer vision research field. Many existing scene recognition methods focus on static images and cannot achieve satisfactory results on videos, which contain more complex scene features than images. In this paper, we propose a robust movie scene recognition approach based on panoramic frames and representative feature patches. More specifically, the movie is first efficiently segmented into video shots and scenes. Second, we introduce a novel key-frame extraction method using panoramic frames, and a local feature extraction process is applied to obtain the representative feature patches (RFPs) in each video shot. Third, a Latent Dirichlet Allocation (LDA) based recognition model is trained to recognize the scene within each individual video scene clip. The correlations between video clips are considered to enhance the recognition performance. When our proposed approach is applied to recognize the scenes in realistic movies, the experimental results show that it achieves satisfactory performance.
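
To make the first stage of the pipeline (segmenting the movie into shots) concrete, here is a minimal sketch that assumes a simple gray-level histogram difference between consecutive frames. This is a generic thresholding technique chosen for illustration, not the shot boundary detector used in the paper. The function names, the bin count, and the threshold of 0.5 are hypothetical, and each frame is represented as a non-empty flat list of 8-bit gray values rather than decoded video.

```python
from typing import List, Sequence, Tuple


def gray_histogram(frame: Sequence[int], bins: int = 8) -> List[float]:
    """Normalized histogram of 8-bit gray values (frame must be non-empty)."""
    counts = [0] * bins
    for pixel in frame:
        counts[min(pixel * bins // 256, bins - 1)] += 1
    total = len(frame)
    return [c / total for c in counts]


def histogram_distance(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Half the L1 distance between two normalized histograms, in [0, 1]."""
    return 0.5 * sum(abs(a - b) for a, b in zip(h1, h2))


def detect_shot_boundaries(
    frames: Sequence[Sequence[int]], threshold: float = 0.5
) -> List[int]:
    """Return every index i at which frame i is assumed to start a new shot."""
    hists = [gray_histogram(f) for f in frames]
    return [
        i
        for i in range(1, len(hists))
        if histogram_distance(hists[i - 1], hists[i]) > threshold
    ]


def split_into_shots(
    frames: Sequence[Sequence[int]], threshold: float = 0.5
) -> List[Tuple[int, int]]:
    """Return shots as half-open (start, end) frame index ranges."""
    boundaries = detect_shot_boundaries(frames, threshold)
    starts = [0] + boundaries
    ends = boundaries + [len(frames)]
    return list(zip(starts, ends))


# Toy input (assumption): three dark frames followed by two bright frames.
dark = [10, 20, 30, 25]
bright = [200, 220, 240, 230]
frames = [dark, dark, dark, bright, bright]
shots = split_into_shots(frames)
print(shots)  # [(0, 3), (3, 5)]
```

A hard threshold like this only catches abrupt cuts. Gradual transitions such as fades and dissolves, and the later grouping of shots into scenes, would need additional logic that this sketch does not attempt.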

Author information

Corresponding author

Correspondence to Hua-Dong Ma.

Additional information

The research is supported by the National Funds for Distinguished Young Scientists of China under Grant No. 60925010, the Specialized Research Fund for the Doctoral Program of Higher Education of China under Grant No. 20120005130002, the Cosponsored Project of Beijing Committee of Education, the Funds for Creative Research Groups of China under Grant No. 61121001, and the Program for Changjiang Scholars and Innovative Research Team in University of China under Grant No. IRT1049.

Electronic supplementary material

ESM 1 (DOC 28 kb)

About this article

Cite this article

Gao, GY., Ma, HD. Movie Scene Recognition Using Panoramic Frame and Representative Feature Patches. J. Comput. Sci. Technol. 29, 155–164 (2014). https://doi.org/10.1007/s11390-014-1418-9

  • DOI: https://doi.org/10.1007/s11390-014-1418-9
