Abstract
To model complex motion between video frames more accurately, many video frame interpolation methods introduce event cameras to obtain additional high-speed motion information. These methods use deep artificial neural networks (ANNs) to process RGB images and event streams. However, ANN-based methods cannot effectively exploit the sparse and asynchronous nature of event streams, leading to unnecessary computation and increased energy consumption. As an alternative, spiking neural networks (SNNs) naturally process asynchronous, sparse event streams, reduce computational complexity, and achieve lower energy consumption when deployed on neuromorphic hardware. In this paper, we propose Spike-EFI, a lightweight, fully spiking neural network for event-based video frame interpolation. The network uses Leaky Integrate-and-Fire (LIF) neurons to learn from a sensor fusion of RGB frames and event streams. To our knowledge, this is the first attempt to achieve video frame interpolation within the neuromorphic computing paradigm. We trained and evaluated our network on a public dataset, and the experimental results show that the proposed method performs comparably to prior ANN-based methods. Benefiting from the event-triggered computing paradigm of SNNs, our method also achieves lower computational power consumption than ANN-based methods.
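For readers unfamiliar with the LIF dynamics the abstract refers to, the following is a minimal sketch of one discrete-time Leaky Integrate-and-Fire update step in PyTorch. The decay constant, threshold value, and hard-reset rule are illustrative assumptions for a generic deep-SNN LIF neuron, not the exact formulation used in Spike-EFI.

    import torch

    def lif_step(v, x, decay=0.9, v_th=1.0):
        """One discrete-time step of a Leaky Integrate-and-Fire neuron.

        v: membrane potential carried over from the previous step
        x: weighted input current at this step (e.g. the output of a
           conv layer applied to an event-stream representation)
        Returns the binary spike output and the updated potential.
        """
        v = decay * v + x                 # leaky integration of input
        spike = (v >= v_th).float()       # fire when threshold is crossed
        v = v * (1.0 - spike)             # hard reset after a spike
        return spike, v

    # Toy usage: a constant sub-threshold input accumulates over time
    # until the neuron fires, then the potential resets.
    v = torch.zeros(1)
    for t in range(5):
        s, v = lif_step(v, torch.tensor([0.4]))
        print(t, s.item(), round(v.item(), 3))

Because the spike output is binary and most positions in an event stream are silent, downstream layers in such a network only need to accumulate weights where spikes occur, which is the source of the energy savings the abstract claims on neuromorphic hardware.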
Cite this paper
Wu, D.S., Ma, D. (2024). Spike-EFI: Spiking Neural Network for Event-Based Video Frame Interpolation. In: Yan, W.Q., Nguyen, M., Nand, P., Li, X. (eds.) Image and Video Technology. PSIVT 2023. Lecture Notes in Computer Science, vol. 14403. Springer, Singapore. https://doi.org/10.1007/978-981-97-0376-0_24