GFENet: group-wise feature-enhanced network for steering angle prediction by fusing events and images

Applied Intelligence

Abstract

Existing end-to-end networks for steering angle prediction usually take images from standard cameras as input. However, standard cameras are susceptible to poor lighting conditions and motion blur, which hinders the training of an accurate and robust end-to-end network. In contrast, biologically inspired event cameras overcome these shortcomings through their unique working principle and offer significant advantages such as high temporal resolution, high dynamic range and low power consumption. Nevertheless, event cameras generate considerable noise and cannot provide texture information in static regions. The two types of cameras are therefore complementary to some extent. To explore the benefits of fusing information from both in autonomous driving tasks, we propose GFENet, an attention-based two-stream encoder-decoder architecture for steering angle prediction that combines events and images. First, asynchronous and sparse events are converted into synchronous and dense event frames. Then, event frames and the corresponding image frames are fed into two symmetric encoders to extract features. Next, we introduce a Group-Wise Feature-Enhanced (GFE) module that refines features and suppresses noise to guide the fusion of features from the two modalities at different levels. Finally, the fused features are passed through a simple decoder to predict the steering angle. Experimental results on the DDD20 and EventScape datasets show that GFENet outperforms the state-of-the-art image-event fusion method.
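The first step of the pipeline, turning sparse asynchronous events into a dense frame, can be illustrated with a minimal sketch. The abstract does not say which event representation GFENet uses, so the code below assumes one common choice: a two-channel count histogram that accumulates the events of a single time window per pixel and per polarity. The helper name `events_to_frame` and the toy event stream are hypothetical and serve only as illustration.

```python
import numpy as np


def events_to_frame(x, y, p, height, width):
    """Accumulate events into a dense 2-channel count frame.

    Hypothetical helper, not the paper's implementation.
    x, y : integer pixel coordinates of each event
    p    : polarity of each event (1 = brightness increase, 0 = decrease)
    """
    frame = np.zeros((2, height, width), dtype=np.float32)
    # np.add.at accumulates without buffering, so a pixel that fires
    # several times in the window is counted once per event.
    np.add.at(frame, (p, y, x), 1.0)
    return frame


# Toy event stream covering one time window (assumed values).
x = np.array([0, 1, 1, 2])
y = np.array([0, 0, 0, 1])
p = np.array([1, 1, 1, 0])

frame = events_to_frame(x, y, p, height=2, width=3)
print(frame.shape)  # channel 0: negative events, channel 1: positive events
```

A frame built this way has a fixed shape regardless of how many events arrive, which is what allows it to be fed to a convolutional encoder alongside the corresponding image frame.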

Availability of data

The DDD20 dataset used in this paper is available at https://sites.google.com/view/davis-driving-dataset-2020/home. The EventScape dataset used in this paper is available at https://rpg.ifi.uzh.ch/RAMNet.html.

Funding

This work is supported by the Hubei Province Science and Technology Major Project (2021AA010).

Author information

Contributions

Duowen Chen designed the network architecture, carried out the implementation, performed the experiments and wrote the manuscript. Jianlang Hu reviewed the manuscript and made revisions. Chi Guo was in charge of overall direction and planning.

Corresponding author

Correspondence to Chi Guo.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chen, DW., Guo, C. & Hu, JL. GFENet: group-wise feature-enhanced network for steering angle prediction by fusing events and images. Appl Intell 55, 198 (2025). https://doi.org/10.1007/s10489-024-06019-3

  • DOI: https://doi.org/10.1007/s10489-024-06019-3