DOI: 10.1145/3123266.3130141
tutorial

Deep Learning for Intelligent Video Analysis

Published: 23 October 2017

ABSTRACT

Analyzing videos has been one of the fundamental problems of computer vision and multimedia content analysis for decades. The task is very challenging because video is an information-intensive medium with large variations and complexities. Thanks to the recent development of deep learning techniques, researchers in both the computer vision and multimedia communities are now able to boost the performance of video analysis significantly and to initiate new research directions for analyzing video content. This tutorial will present recent advances under the umbrella of video understanding. It starts from a unified deep learning toolkit, the Microsoft Cognitive Toolkit (CNTK), which supports popular model types such as convolutional nets and recurrent networks, moves on to the fundamental challenges of video representation learning and video classification and recognition, and ends with the emerging area of video and language.


      Published in

        MM '17: Proceedings of the 25th ACM international conference on Multimedia
        October 2017
        2028 pages
        ISBN:9781450349062
        DOI:10.1145/3123266

        Copyright © 2017 Owner/Author

        Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Acceptance Rates

        MM '17 Paper Acceptance Rate: 189 of 684 submissions, 28%
        Overall Acceptance Rate: 995 of 4,171 submissions, 24%
