HTML: Hybrid Temporal-scale Multimodal Learning Framework for Referring Video Object Segmentation | IEEE Conference Publication | IEEE Xplore