
From Text to Video: Exploiting Mid-Level Semantics for Large-Scale Video Classification



Abstract:

Automatically classifying large-scale video data is an urgent yet challenging task. To bridge the semantic gap between low-level features and high-level video semantics, we propose a method to represent videos with their mid-level semantics. Inspired by the problem of text classification, we regard the visual objects in videos as the words in documents, and adapt the TF-IDF word weighting method to encode videos by visual objects. We also extend the proposed method according to the characteristics of videos. We integrate the proposed semantic encoding method with the popular two-stream CNN model for video classification. Experiments are conducted on two large-scale video datasets, CCV and ActivityNet. The experimental results validate the effectiveness of our method.
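The objects-as-words analogy can be sketched as follows. This is a minimal illustration of standard TF-IDF applied to per-video object labels, not the authors' implementation; the video IDs, detection lists, and function name are hypothetical, and in the paper the labels would come from an object detector.

```python
import math
from collections import Counter

# Hypothetical detections: each video is a "document" whose "words" are
# the object labels detected in its frames.
videos = {
    "v1": ["person", "dog", "ball", "person", "person"],
    "v2": ["car", "person", "road", "car"],
    "v3": ["dog", "grass", "ball", "dog"],
}

def tfidf_encode(videos):
    """Encode each video as a TF-IDF vector over detected objects,
    treating objects as words and videos as documents."""
    n = len(videos)
    # Document frequency: in how many videos each object appears.
    df = Counter()
    for labels in videos.values():
        df.update(set(labels))
    vocab = sorted(df)
    encodings = {}
    for vid, labels in videos.items():
        tf = Counter(labels)
        total = len(labels)
        # TF-IDF: term frequency weighted by inverse document frequency.
        encodings[vid] = [(tf[w] / total) * math.log(n / df[w]) for w in vocab]
    return vocab, encodings

vocab, enc = tfidf_encode(videos)
```

Objects that appear in every video receive zero weight (log of 1), while objects distinctive to a few videos are emphasized, mirroring how TF-IDF downweights common words in text classification.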
Date of Conference: 20-24 August 2018
Date Added to IEEE Xplore: 29 November 2018
Print on Demand (PoD) ISSN: 1051-4651
Conference Location: Beijing, China
