Abstract
Brain-inspired Spiking Neural Networks (SNNs) offer bio-plausibility and low-power advantages over Artificial Neural Networks (ANNs). However, applications of SNNs are currently limited to simple classification tasks due to their poor performance. In this work, we focus on bridging the performance gap between ANNs and SNNs on object detection. Our design revolves around the network architecture and the spiking neuron. First, overly complex module design causes spike degradation when the YOLO series is converted to its spiking counterpart. We design a SpikeYOLO architecture to solve this problem by simplifying the vanilla YOLO and incorporating meta SNN blocks. Second, object detection is more sensitive to the quantization error introduced when spiking neurons convert membrane potentials into binary spikes. To address this challenge, we design a new spiking neuron that activates integer values during training while maintaining spike-driven computing during inference by extending virtual timesteps. The proposed method is validated on both static and neuromorphic object detection datasets. On the static COCO dataset, we obtain 66.2% mAP@50 and 48.9% mAP@50:95, which are +15.0% and +18.7% higher than the prior state-of-the-art SNN, respectively. On the neuromorphic Gen1 dataset, we achieve 67.2% mAP@50, which is +2.5% greater than an ANN with an equivalent architecture, with a 5.7× improvement in energy efficiency. Code: https://github.com/BICLab/SpikeYOLO.
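The training/inference asymmetry described above can be illustrated with a minimal sketch: during training the neuron emits a clipped integer quantization of its membrane potential (reducing quantization error relative to 0/1 spikes), and at inference that integer is expanded into binary spikes over virtual timesteps so all transmitted values remain 0 or 1. The function names, the maximum integer value `d_max`, and the expansion scheme here are illustrative assumptions, not the paper's exact formulation.

```python
def integer_activation_train(membrane: float, d_max: int) -> int:
    """Training-time activation (illustrative): quantize the membrane
    potential to an integer in [0, d_max] via round-and-clip, instead
    of thresholding it to a single binary spike."""
    return max(0, min(d_max, round(membrane)))


def expand_to_spikes(value: int, d_max: int) -> list:
    """Inference-time expansion (illustrative): an integer activation D
    is emitted as D binary spikes spread over d_max virtual timesteps,
    so the network stays spike-driven (only 0/1 values are transmitted)."""
    return [1] * value + [0] * (d_max - value)
```

For example, a membrane potential of 2.7 with `d_max = 4` quantizes to the integer 3 during training, and at inference that 3 becomes the spike train `[1, 1, 1, 0]` over four virtual timesteps; the total spike count preserves the integer value.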
X. Luo, M. Yao—Equal contribution.
Acknowledgements
This work was partially supported by National Distinguished Young Scholars (62325603), the National Natural Science Foundation of China (62236009, U22A20103, 62441606), the Beijing Natural Science Foundation for Distinguished Young Scholars (JQ21015), the China Postdoctoral Science Foundation (GZB20240824, 2024M753497), and the CAAI-MindSpore Open Fund, developed on the OpenI Community.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Luo, X., Yao, M., Chou, Y., Xu, B., Li, G. (2025). Integer-Valued Training and Spike-Driven Inference Spiking Neural Network for High-Performance and Energy-Efficient Object Detection. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15090. Springer, Cham. https://doi.org/10.1007/978-3-031-73411-3_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-73410-6
Online ISBN: 978-3-031-73411-3