FusionSeg: Motion Segmentation by Jointly Exploiting Frames and Events

Wang, Lin; Liu, Zhe; Zhang, Yi; Yang, Shaowu; Shi, Dianxi; Zhang, Yongjun

doi:10.1007/978-3-031-20868-3_20

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13631)

Included in the following conference series:

Pacific Rim International Conference on Artificial Intelligence

Abstract

Segmentation of independently moving objects is an important stage in scene comprehension tasks such as tracking and recognition. Frame-based cameras used in dynamic scenes suffer from motion blur and exposure artifacts due to their sampling principle. In contrast, event-based cameras sample visual information based on scene dynamics and offer advantages such as microsecond temporal resolution and high dynamic range. Inspired by the complementarity of frame-based and event-based cameras, we propose a cross-domain motion segmentation method, named FusionSeg, that fuses visual signals from frames and events to improve motion segmentation performance. To address motion segmentation in multi-object scenarios, we use an identification mechanism to embed multiple objects into the same feature space. In addition, to solve the feature matching and propagation problem, we design a long- and short-term temporal-spatial attention mechanism. FusionSeg is evaluated on public datasets and outperforms the state of the art by 4.7% in terms of detection rate. Experiments also demonstrate the method's robustness in situations with varying motion patterns and numbers of moving objects.



Acknowledgements

This work was partially supported by the National Natural Science Foundation of China (No. 91948303).

Author information

Authors and Affiliations

College of Computer, National University of Defense Technology, Changsha, China
Lin Wang, Zhe Liu & Shaowu Yang
Artificial Intelligence Research Center, National Innovation Institute of Defense Technology, Beijing, China
Yi Zhang, Dianxi Shi & Yongjun Zhang

Corresponding author

Correspondence to Yongjun Zhang.

Editor information

Editors and Affiliations

CSIRO Australian e-Health Research Centre, Brisbane, QLD, Australia
Sankalp Khanna
Shanghai Jiao Tong University, Shanghai, China
Jian Cao
University of Tasmania, Hobart, TAS, Australia
Quan Bai
University of Technology Sydney, Sydney, NSW, Australia
Guandong Xu

About this paper

Cite this paper

Wang, L., Liu, Z., Zhang, Y., Yang, S., Shi, D., Zhang, Y. (2022). FusionSeg: Motion Segmentation by Jointly Exploiting Frames and Events. In: Khanna, S., Cao, J., Bai, Q., Xu, G. (eds) PRICAI 2022: Trends in Artificial Intelligence. PRICAI 2022. Lecture Notes in Computer Science, vol 13631. Springer, Cham. https://doi.org/10.1007/978-3-031-20868-3_20

DOI: https://doi.org/10.1007/978-3-031-20868-3_20
Published: 04 November 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20867-6
Online ISBN: 978-3-031-20868-3
eBook Packages: Computer Science, Computer Science (R0)
