Stacked Multimodal Attention Network for Context-Aware Video Captioning | IEEE Journals & Magazine | IEEE Xplore