Video Question Generation via Semantic Rich Cross-Modal Self-Attention Networks Learning | IEEE Conference Publication | IEEE Xplore