Towards coherent natural language description of video streams | IEEE Conference Publication