DOI: 10.1145/3019612.3019641
research-article

Convolutions through time for multi-label movie genre classification

Published: 03 April 2017

ABSTRACT

In this paper, we explore the suitability of employing Convolutional Neural Networks (ConvNets) for multi-label movie trailer genre classification. Assigning genres to movies is a particularly challenging task because genre is an immaterial feature that is not physically present in a movie frame, so off-the-shelf image detection models cannot be easily adapted to this context. Moreover, multi-label classification is more challenging than single-label classification because one instance can be assigned to multiple classes at once. We propose a novel classification method that encapsulates an ultra-deep ConvNet with residual connections. Our approach extracts temporal information from image-based features prior to performing the mapping of trailers to genres. We compare our novel approach with the current state-of-the-art techniques for movie classification, which make use of well-known image descriptors and low-level handcrafted features. Results show that our method significantly outperforms the state of the art on this task, improving the classification accuracy for all genres.
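To make the idea of extracting temporal information from image-based features more concrete, the following is a minimal sketch and not the authors' architecture. It assumes each trailer is already reduced to a sequence of per-frame feature vectors, slides a small kernel along the time axis, max-pools over time, and scores every genre with an independent sigmoid so that several genres can be active at once. The function names, the toy weights, and the 0.5 threshold are all hypothetical choices made for illustration; the ultra-deep residual ConvNet mentioned in the abstract is not modelled here.

```python
import math


def conv_through_time(frames, kernel, bias=0.0):
    """Valid 1D convolution along the time axis, followed by a ReLU.

    frames: list of T per-frame feature vectors, each of length D.
    kernel: list of K weight vectors of length D (one per time offset).
    Returns a list of T - K + 1 activations.
    """
    num_frames, width = len(frames), len(kernel)
    activations = []
    for t in range(num_frames - width + 1):
        total = bias
        for k in range(width):
            total += sum(w * x for w, x in zip(kernel[k], frames[t + k]))
        activations.append(max(0.0, total))
    return activations


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_genres(frames, kernels, genre_weights, genre_biases, threshold=0.5):
    """Multi-label prediction: one independent sigmoid per genre."""
    # Max-pool each temporal feature map down to a single value.
    pooled = [max(conv_through_time(frames, kernel)) for kernel in kernels]
    probs = [
        sigmoid(sum(w * p for w, p in zip(weights, pooled)) + b)
        for weights, b in zip(genre_weights, genre_biases)
    ]
    # Each genre is thresholded on its own, so several can be True at once.
    labels = [p >= threshold for p in probs]
    return probs, labels


# Toy input (assumed values): 3 frames with 2 features each, one kernel of width 2.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
kernels = [[[1.0, 0.0], [0.0, 1.0]]]
genre_weights = [[1.0], [0.5], [-1.0]]  # three hypothetical genres
genre_biases = [0.0, 0.0, 0.0]

probs, labels = predict_genres(frames, kernels, genre_weights, genre_biases)
print(labels)  # [True, True, False]
```

Using a sigmoid per genre rather than a softmax over genres is what allows a single trailer to receive more than one label in this sketch.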


      • Published in

        SAC '17: Proceedings of the Symposium on Applied Computing
        April 2017
        2004 pages
        ISBN: 9781450344869
        DOI: 10.1145/3019612

        Copyright © 2017 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Acceptance Rates

        Overall Acceptance Rate: 1,650 of 6,669 submissions, 25%
