WINNER: Weakly-supervised hIerarchical decompositioN and aligNment for spatio-tEmporal video gRounding | IEEE Conference Publication | IEEE Xplore