Interactive video segmentation aims to segment objects in videos using user-provided information about object location. It allows different objects to be segmented from the same scene and has many applications, such as video editing and scene understanding. While the major challenges in automatic video segmentation are temporal coherence and occlusion, interactive segmentation models must also handle unseen objects. This work proposes an interactive video segmentation strategy based on seed competition and user-drawn scribbles. Our proposal starts with a seed oversampling strategy and iteratively computes the optimum-path forest for the seed set, keeping only the most relevant trees. Our Interactive Video Segmentation by Dynamic and Iterative Spanning Forest (iVSDISF) extends the Interactive Dynamic and Iterative Spanning Forest to videos, avoiding object leakage by dynamically creating trees at critical image positions. The proposed method is highly competitive with the state of the art, achieving the second-highest IoU among all studied methods and the best IoU among those that do not compute optical flow on SegTrackv2.
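To make the notion of seed competition concrete, the following is a minimal sketch of a single optimum-path forest computation on one grayscale frame. It is not the iVSDISF implementation: the function name `seed_competition`, the 4-neighbor adjacency, the absolute intensity difference as arc weight, and the maximum-arc path cost are all illustrative assumptions, and the seed oversampling, iterative tree removal, and temporal extension described above are omitted.

```python
import heapq


def seed_competition(image, seeds):
    """Label every pixel with the seed that reaches it at the lowest path cost.

    image: 2D list of grayscale intensities (hypothetical input format).
    seeds: dict mapping (row, col) -> label, e.g. taken from user scribbles.
    The path cost is the largest intensity difference along the path
    (an assumed cost function, chosen for illustration only).
    """
    rows, cols = len(image), len(image[0])
    inf = float("inf")
    cost = [[inf] * cols for _ in range(rows)]
    label = [[None] * cols for _ in range(rows)]
    heap = []

    # Every seed is the root of its own tree, with zero cost.
    for (r, c), seed_label in seeds.items():
        cost[r][c] = 0
        label[r][c] = seed_label
        heapq.heappush(heap, (0, r, c))

    # Dijkstra-style propagation: trees grow in order of increasing cost.
    while heap:
        current, r, c = heapq.heappop(heap)
        if current > cost[r][c]:
            continue  # stale queue entry, a cheaper path was already found
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                arc = abs(image[nr][nc] - image[r][c])
                candidate = max(current, arc)
                if candidate < cost[nr][nc]:
                    # The neighbor is conquered by the tree of (r, c).
                    cost[nr][nc] = candidate
                    label[nr][nc] = label[r][c]
                    heapq.heappush(heap, (candidate, nr, nc))
    return label, cost


# Toy frame: a dark region on the left and a bright region on the right.
frame = [
    [10, 10, 90, 90],
    [10, 12, 88, 90],
]
scribbles = {(0, 0): "object", (0, 3): "background"}
labels, costs = seed_competition(frame, scribbles)
```

In this toy frame, each seed conquers the region of similar intensity around it, because crossing the dark-to-bright boundary costs far more than any path inside a region. A strong boundary therefore stops a tree from leaking, provided a competing seed exists on the other side.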
The authors thank the Pontifícia Universidade Católica de Minas Gerais – PUC-Minas, Coordenação de Aperfeiçoamento de Pessoal de Nível Superior – CAPES – (Grant PROAP 88887.842889/2023-00 – PUC/MG, Grant STIC-AMSUD 88887.878869/2023-00, Grant PDPG 88887.708960/2022-00 – PUC/MG - Informática, and Finance Code 001), the Conselho Nacional de Desenvolvimento Científico e Tecnológico – CNPq (Grants 407242/2021-0, 306573/2022-9, 442950/2023-3 and 304711/2023-3), Fundação de Amparo à Pesquisa do Estado de Minas Gerais – FAPEMIG (Grant APQ-01079-23, Grant APQ-05058-23 and PCE-00417-24) and Fundação de Amparo à Pesquisa do Estado de São Paulo – FAPESP (Grant 2023/14427-8).
