Abstract
We present GvSeg, a general video segmentation framework that addresses four different video segmentation tasks (i.e., instance, semantic, panoptic, and exemplar-guided) while maintaining an identical architectural design. There is currently a trend towards general video segmentation solutions that apply across multiple tasks, which streamlines research and simplifies deployment. However, current designs are highly homogenized, keeping every element uniform across tasks; this can overlook the inherent diversity among tasks and lead to suboptimal performance. To tackle this, GvSeg: i) provides a holistic disentanglement and modeling of segment targets, thoroughly examining them from the perspectives of appearance, position, and shape, and, on this basis, ii) reformulates the query initialization, matching, and sampling strategies in alignment with task-specific requirements. These architecture-agnostic innovations enable GvSeg to effectively address each task by accommodating the specific properties that characterize it. Extensive experiments on seven gold-standard benchmark datasets demonstrate that GvSeg surpasses all existing specialized/general solutions by a significant margin on four different video segmentation tasks.
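For context on the query matching that the abstract says GvSeg reformulates: query-based segmenters in the DETR/Mask2Former lineage assign each predicted query to a ground-truth segment via Hungarian matching over a combined classification and mask cost. The sketch below is the generic baseline formulation, not GvSeg's task-oriented variant; the function names, the brute-force assignment (exact only for small query counts), and the cost weights `w_cls`/`w_mask` are illustrative assumptions.

```python
from itertools import permutations

def hungarian_match(cost):
    """Exact minimum-cost one-to-one assignment of queries to targets.

    cost[q][t] is the cost of assigning query q to target t.
    Brute force over permutations: fine for small toy sizes; real
    implementations use the O(n^3) Hungarian algorithm.
    """
    num_targets = len(cost[0])
    best, best_perm = float("inf"), None
    for perm in permutations(range(len(cost)), num_targets):
        c = sum(cost[q][t] for t, q in enumerate(perm))
        if c < best:
            best, best_perm = c, perm
    return [(q, t) for t, q in enumerate(best_perm)]

def match_cost(cls_prob, pred_mask, tgt_label, tgt_mask, w_cls=1.0, w_mask=1.0):
    """Pairwise cost: (1 - probability of the target class) plus a soft-Dice mask cost.

    cls_prob:  per-class probabilities for one query
    pred_mask: soft mask values at sampled points
    tgt_mask:  binary ground-truth mask at the same points
    """
    inter = sum(p * g for p, g in zip(pred_mask, tgt_mask))
    dice = 1 - (2 * inter + 1) / (sum(pred_mask) + sum(tgt_mask) + 1)
    return w_cls * (1 - cls_prob[tgt_label]) + w_mask * dice
```

For example, two queries whose masks and class scores each align with one of two targets produce a cost matrix whose optimal assignment pairs query 0 with target 0 and query 1 with target 1. Task-oriented variants (as the abstract describes) adjust how such costs weigh appearance, position, and shape cues per task.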
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Chen, M., Li, L., Wang, W., Quan, R., Yang, Y. (2025). General and Task-Oriented Video Segmentation. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15065. Springer, Cham. https://doi.org/10.1007/978-3-031-72667-5_5
DOI: https://doi.org/10.1007/978-3-031-72667-5_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72666-8
Online ISBN: 978-3-031-72667-5
eBook Packages: Computer Science (R0)