Movie Scene Recognition Using Panoramic Frame and Representative Feature Patches

Gao, Guang-Yu; Ma, Hua-Dong

doi:10.1007/s11390-014-1418-9

Regular Paper
Published: 10 January 2014

Volume 29, pages 155–164, (2014)
Journal of Computer Science and Technology

Guang-Yu Gao¹,² & Hua-Dong Ma¹

Abstract

Recognizing scene information in images or videos, such as locating objects and answering “Where am I?”, has attracted much attention in computer vision research. Many existing scene recognition methods focus on static images and cannot achieve satisfactory results on videos, which contain more complex scene features than images. In this paper, we propose a robust movie scene recognition approach based on panoramic frames and representative feature patches. First, the movie is efficiently segmented into video shots and scenes. Second, we introduce a novel key-frame extraction method using panoramic frames, and a local feature extraction process is applied to obtain the representative feature patches (RFPs) in each video shot. Third, a Latent Dirichlet Allocation (LDA) based recognition model is trained to recognize the scene within each individual video scene clip. The correlations between video clips are considered to enhance the recognition performance. When the proposed approach is applied to recognize scenes in realistic movies, the experimental results show that it achieves satisfactory performance.
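
The first step named in the abstract, segmenting a movie into shots, can be illustrated with a minimal sketch. The abstract does not say how the authors detect shot boundaries, so the code below assumes a common baseline instead: a boundary is declared wherever the gray-level histograms of two consecutive frames differ by more than a threshold. The function names, the 8-bin histogram, and the 0.5 threshold are all hypothetical choices made for illustration, not values taken from the paper.

```python
from typing import List, Sequence

# Assumption: a frame is a flat sequence of gray levels in 0..255.
Frame = Sequence[int]


def gray_histogram(frame: Frame, bins: int = 8) -> List[float]:
    """Return a normalized gray-level histogram with equal-width bins."""
    counts = [0] * bins
    for value in frame:
        counts[min(value * bins // 256, bins - 1)] += 1
    total = float(len(frame)) or 1.0  # avoid division by zero on empty frames
    return [c / total for c in counts]


def histogram_distance(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Half the L1 distance; lies in [0, 1] for normalized histograms."""
    return 0.5 * sum(abs(a - b) for a, b in zip(h1, h2))


def detect_shot_boundaries(
    frames: Sequence[Frame], threshold: float = 0.5, bins: int = 8
) -> List[int]:
    """Return each index i at which a new shot is assumed to start."""
    hists = [gray_histogram(f, bins) for f in frames]
    return [
        i
        for i in range(1, len(hists))
        if histogram_distance(hists[i - 1], hists[i]) > threshold
    ]


def split_into_shots(
    frames: Sequence[Frame], threshold: float = 0.5, bins: int = 8
) -> List[List[int]]:
    """Group frame indices into shots using the detected boundaries."""
    if not frames:
        return []
    boundaries = detect_shot_boundaries(frames, threshold, bins)
    starts = [0] + boundaries
    ends = boundaries + [len(frames)]
    return [list(range(s, e)) for s, e in zip(starts, ends)]
```

In a full pipeline of the kind the abstract outlines, each resulting shot would then be passed on to key-frame extraction and feature patch selection. A fixed threshold only catches abrupt cuts; gradual transitions such as fades would need a different test.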

Author information

Authors and Affiliations

Beijing Key Laboratory of Intelligent Telecommunications Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing, 100876, China
Guang-Yu Gao & Hua-Dong Ma
School of Software, Beijing Institute of Technology, Beijing, 100081, China
Guang-Yu Gao

Corresponding author

Correspondence to Hua-Dong Ma.

Additional information

The research is supported by the National Funds for Distinguished Young Scientists of China under Grant No. 60925010, the Specialized Research Fund for the Doctoral Program of Higher Education of China under Grant No. 20120005130002, the Cosponsored Project of Beijing Committee of Education, the Funds for Creative Research Groups of China under Grant No. 61121001, and the Program for Changjiang Scholars and Innovative Research Team in University of China under Grant No. IRT1049.

Electronic supplementary material

ESM 1 (DOC, 28 kb)

About this article

Cite this article

Gao, GY., Ma, HD. Movie Scene Recognition Using Panoramic Frame and Representative Feature Patches. J. Comput. Sci. Technol. 29, 155–164 (2014). https://doi.org/10.1007/s11390-014-1418-9

Received: 27 December 2012
Revised: 26 November 2013
Published: 10 January 2014
Issue Date: January 2014
DOI: https://doi.org/10.1007/s11390-014-1418-9
