
Describing Video With Attention-Based Bidirectional LSTM


Abstract:

Video captioning has been attracting broad research attention in the multimedia community. However, most existing approaches rely heavily on static visual information or capture only local temporal knowledge (e.g., within 16 frames), and thus can hardly describe motions accurately from a global view. In this paper, we propose a novel video captioning framework that integrates bidirectional long short-term memory (BiLSTM) and a soft attention mechanism to generate better global representations for videos and to enhance the recognition of lasting motions. To generate video captions, we exploit another long short-term memory network as a decoder to fully explore global contextual information. The benefits of our proposed method are twofold: 1) the BiLSTM structure comprehensively preserves global temporal and visual information and 2) the soft attention mechanism enables the language decoder to recognize and focus on principal targets within complex content. We verify the effectiveness of the proposed video captioning framework on two widely used benchmarks, namely the Microsoft Video Description corpus (MSVD) and MSR-Video to Text (MSR-VTT), and the experimental results demonstrate the superiority of the proposed approach over several state-of-the-art methods.
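For readers who want a concrete picture of the encoder-attention-decoder pipeline the abstract outlines, below is a minimal PyTorch sketch. It is not the authors' implementation: the class and parameter names (AttentiveBiLSTMCaptioner, feat_dim, hid_dim, etc.), the feature dimensions, and the additive (Bahdanau-style) form of the soft attention are all illustrative assumptions chosen to match the description of a BiLSTM encoder, soft attention over frames, and an LSTM language decoder.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentiveBiLSTMCaptioner(nn.Module):
    """Illustrative sketch: BiLSTM video encoder + soft attention + LSTM decoder.
    All names and dimensions are assumptions, not the paper's exact settings."""

    def __init__(self, feat_dim=2048, hid_dim=512, vocab_size=10000, emb_dim=300):
        super().__init__()
        # Bidirectional LSTM over per-frame CNN features: captures global
        # temporal context in both directions, as the abstract describes.
        self.encoder = nn.LSTM(feat_dim, hid_dim, batch_first=True,
                               bidirectional=True)
        # Additive (soft) attention over the encoder states.
        self.attn_enc = nn.Linear(2 * hid_dim, hid_dim, bias=False)
        self.attn_dec = nn.Linear(hid_dim, hid_dim, bias=False)
        self.attn_v = nn.Linear(hid_dim, 1, bias=False)
        # Word embedding and a second LSTM acting as the language decoder.
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.decoder = nn.LSTMCell(emb_dim + 2 * hid_dim, hid_dim)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, feats, captions):
        # feats: (B, T, feat_dim) frame features; captions: (B, L) token ids
        # used for teacher forcing during training.
        enc, _ = self.encoder(feats)                   # (B, T, 2*hid_dim)
        B, L = captions.shape
        h = feats.new_zeros(B, self.decoder.hidden_size)
        c = torch.zeros_like(h)
        keys = self.attn_enc(enc)                      # precompute (B, T, hid_dim)
        logits = []
        for t in range(L):
            # Soft attention: weight every frame given the decoder state h,
            # so the decoder can focus on the principal content at each step.
            scores = self.attn_v(torch.tanh(keys + self.attn_dec(h).unsqueeze(1)))
            alpha = F.softmax(scores, dim=1)           # (B, T, 1)
            ctx = (alpha * enc).sum(dim=1)             # (B, 2*hid_dim)
            x = torch.cat([self.embed(captions[:, t]), ctx], dim=-1)
            h, c = self.decoder(x, (h, c))
            logits.append(self.out(h))
        return torch.stack(logits, dim=1)              # (B, L, vocab_size)

# Usage with random tensors standing in for real CNN frame features:
model = AttentiveBiLSTMCaptioner()
feats = torch.randn(2, 30, 2048)         # 2 clips, 30 frames each
caps = torch.randint(0, 10000, (2, 12))  # teacher-forced caption tokens
print(model(feats, caps).shape)          # torch.Size([2, 12, 10000])
```

One design choice worth noting in this sketch: the attention keys are precomputed once per clip, so each decoding step only re-projects the current decoder state, which keeps per-step attention cost linear in the number of frames.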
Published in: IEEE Transactions on Cybernetics (Volume: 49, Issue: 7, July 2019)
Page(s): 2631 - 2641
Date of Publication: 25 May 2018

PubMed ID: 29993730
