Scene Extraction for Video Clips Based on the Relation of Text, Pointing Region and Temporal Duration of User Comments | IEEE Conference Publication | IEEE Xplore