
FARSE-CNN: Fully Asynchronous, Recurrent and Sparse Event-Based CNN

  • Conference paper
  • First Online:
Computer Vision – ECCV 2024 (ECCV 2024)

Abstract

Event cameras are neuromorphic image sensors that respond to per-pixel brightness changes, producing a stream of asynchronous and spatially sparse events. Currently, the most successful algorithms for event cameras convert batches of events into dense image-like representations that are synchronously processed by deep learning models from frame-based computer vision. These methods discard the inherent properties of events, leading to high latency and computational costs. Following a recent line of work, we propose a model for efficient asynchronous event processing that exploits sparsity. We design the Fully Asynchronous, Recurrent and Sparse Event-Based CNN (FARSE-CNN), a novel multi-layered architecture which combines the mechanisms of recurrent and convolutional neural networks. To build efficient deep networks, we propose compression modules that enable learning hierarchical features both in space and time. We theoretically derive the complexity of all components in our architecture, and experimentally validate our method on tasks for object recognition, object detection and gesture recognition. FARSE-CNN achieves similar or better performance than the state of the art among asynchronous methods, with low computational complexity and without relying on a fixed-length history of events. Our code is released at https://github.com/AIRLab-POLIMI/farse-cnn.
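To make the abstract's core idea concrete, the sketch below illustrates event-driven sparse recurrent processing in general terms: hidden states are kept only for pixels that have actually fired, and each incoming event updates only its own site, so compute scales with event count rather than frame size. This is an illustrative toy, not the authors' FARSE-CNN implementation; the class name, weights, and input features (polarity and inter-event time) are assumptions chosen for the example.

```python
import numpy as np

class SparseEventRNN:
    """Toy asynchronous recurrent layer over a 2D pixel grid.

    Hidden states live in a dict keyed by (x, y), so memory and compute
    scale with the number of *active* pixels, not with the full frame.
    """

    def __init__(self, hidden_size, seed=0):
        self.hidden_size = hidden_size
        self.states = {}        # (x, y) -> hidden state vector
        self.last_t = {}        # (x, y) -> timestamp of previous event there
        rng = np.random.default_rng(seed)
        # Input features per event: (polarity, time since last event at pixel).
        self.w_in = rng.standard_normal((hidden_size, 2)) * 0.1
        self.w_rec = rng.standard_normal((hidden_size, hidden_size)) * 0.1

    def step(self, x, y, t, p):
        """Process one event asynchronously; only pixel (x, y) is touched."""
        key = (x, y)
        h_prev = self.states.get(key, np.zeros(self.hidden_size))
        dt = t - self.last_t.get(key, t)   # 0.0 for a pixel's first event
        u = np.array([float(p), dt])
        h = np.tanh(self.w_in @ u + self.w_rec @ h_prev)
        self.states[key] = h
        self.last_t[key] = t
        return h

# Feed a short stream of (x, y, t, polarity) events one at a time.
layer = SparseEventRNN(hidden_size=8)
events = [(3, 5, 0.00, 1), (3, 5, 0.01, -1), (7, 2, 0.02, 1)]
outputs = [layer.step(*ev) for ev in events]
print(len(layer.states))  # only 2 pixels ever became active
```

The dict-of-states pattern is what distinguishes this from dense frame processing: an idle pixel costs nothing, matching the sparsity argument made in the abstract.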



Acknowledgements

This paper is supported by the FAIR (Future Artificial Intelligence Research) project, funded by the NextGenerationEU program within the PNRR-PE-AI scheme (M4C2, investment 1.3, line on Artificial Intelligence).

Author information

Corresponding author

Correspondence to Riccardo Santambrogio.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 751 KB)


Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Santambrogio, R., Cannici, M., Matteucci, M. (2025). FARSE-CNN: Fully Asynchronous, Recurrent and Sparse Event-Based CNN. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15112. Springer, Cham. https://doi.org/10.1007/978-3-031-72949-2_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-72948-5

  • Online ISBN: 978-3-031-72949-2

  • eBook Packages: Computer Science, Computer Science (R0)
