Abstract
To model complex motion between video frames more accurately, many video frame interpolation methods introduce event cameras to obtain additional high-speed motion information. These methods use deep artificial neural networks (ANNs) to process RGB images and event streams. However, ANN-based methods cannot effectively exploit the sparse and asynchronous nature of event streams, leading to unnecessary computation and increased energy consumption. As an alternative, spiking neural networks (SNNs) naturally process asynchronous, sparse event streams, reduce computational complexity, and achieve lower energy consumption when deployed on neuromorphic hardware. In this paper, we propose Spike-EFI, a lightweight, fully spiking neural network for event-based video frame interpolation. The network uses Leaky Integrate-and-Fire (LIF) neurons to learn from a sensor fusion of RGB frames and event streams. To our knowledge, this is the first attempt to achieve video frame interpolation within the neuromorphic computing paradigm. We trained and evaluated our network on a public dataset, and the experimental results show that the proposed method performs comparably to prior ANN-based methods. Benefiting from the event-triggered computing paradigm of SNNs, our method also achieves lower computational power consumption than ANN-based methods.
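For readers unfamiliar with the LIF dynamics the abstract refers to, the following is a minimal sketch of one discrete-time Leaky Integrate-and-Fire update step in PyTorch. The decay constant, threshold value, and hard-reset rule are illustrative assumptions for a generic deep-SNN LIF neuron, not the exact formulation used in Spike-EFI.

    import torch

    def lif_step(v, x, decay=0.9, v_th=1.0):
        """One discrete-time step of a Leaky Integrate-and-Fire neuron.

        v: membrane potential carried over from the previous step
        x: weighted input current at this step (e.g. the output of a
           conv layer applied to an event-stream representation)
        Returns the binary spike output and the updated potential.
        """
        v = decay * v + x                 # leaky integration of input
        spike = (v >= v_th).float()       # fire when threshold is crossed
        v = v * (1.0 - spike)             # hard reset after a spike
        return spike, v

    # Toy usage: a constant sub-threshold input accumulates over time
    # until the neuron fires, then the potential resets.
    v = torch.zeros(1)
    for t in range(5):
        s, v = lif_step(v, torch.tensor([0.4]))
        print(t, s.item(), round(v.item(), 3))

Because the spike output is binary and most positions in an event stream are silent, downstream layers in such a network only need to accumulate weights where spikes occur, which is the source of the energy savings the abstract claims on neuromorphic hardware.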
Cite this paper
Wu, D.S., Ma, D. (2024). Spike-EFI: Spiking Neural Network for Event-Based Video Frame Interpolation. In: Yan, W.Q., Nguyen, M., Nand, P., Li, X. (eds.) Image and Video Technology. PSIVT 2023. Lecture Notes in Computer Science, vol. 14403. Springer, Singapore. https://doi.org/10.1007/978-981-97-0376-0_24