DOI: 10.1145/3522784.3522785

Exploring Cross-fusion and Curriculum Learning for Multi-modal Human Detection on Drones

Published: 23 June 2022

Abstract

In applications ranging from warehouse management to search and rescue, drones will need to operate in the vicinity of human agents. In such situations, robust and fail-safe human detection by drones must be provided. However, human detection systems used on drones are currently based on single imaging cameras, although a growing number of works investigate more robust detection schemes via sensor fusion. In the drone context, the fusion of standard RGB and event-based cameras has emerged, while in the automotive context, the fusion of RGB cameras with radar has been proposed for utmost robustness to environmental conditions. In this paper, our aim is to initiate the investigation of RGB, event-based camera, and radar fusion. First, we acquire a novel dataset for the task of people detection in an indoor, industrial setting by mounting the sensor-fusion suite on a drone. Then, we propose a baseline convolutional neural network (CNN) architecture augmented with cross-fusion highways for sensor fusion and people detection. To train the network, we propose a novel multi-modal curriculum learning procedure and demonstrate that our method (termed SAUL) greatly enhances the robustness of the system towards hard RGB failures (in terms of peak F1 score) and provides a significant gain in detection performance (in terms of peak F1 score) compared to the BlackIn procedure previously proposed for cross-fusion network training. Finally, we report the performance of our system through precision-recall curve analysis and perform additional ablation studies to shed light on the key aspects of our system.
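The detection gains above are reported as peak F1 scores taken over the precision-recall curve. As a minimal illustrative sketch (not the authors' code; the per-detection confidence scores and ground-truth labels below are hypothetical), peak F1 can be computed by sweeping a confidence threshold and keeping the best harmonic mean of precision and recall:

```python
import numpy as np

def peak_f1(scores, labels, num_thresholds=100):
    """Sweep detection thresholds and return the peak F1 score.

    scores: per-detection confidence values in [0, 1]
    labels: 1 for a true person detection, 0 for a false one
    """
    best = 0.0
    positives = labels.sum()  # total ground-truth people
    for t in np.linspace(0.0, 1.0, num_thresholds):
        pred = scores >= t          # detections kept at this threshold
        tp = np.sum(pred & (labels == 1))
        fp = np.sum(pred & (labels == 0))
        fn = positives - tp
        if tp == 0:
            continue                # F1 undefined / zero, skip
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        best = max(best, 2 * precision * recall / (precision + recall))
    return best
```

Each threshold yields one (precision, recall) operating point; the peak F1 is simply the best point on that curve, which makes it a threshold-free summary suitable for comparing training procedures such as SAUL and BlackIn.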



Published In

DroneSE and RAPIDO: System Engineering for constrained embedded systems
January 2022
58 pages
ISBN:9781450395663
DOI:10.1145/3522784

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  • curriculum learning
  • deep learning
  • drones
  • people detection
  • sensor fusion

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

DroneSE and RAPIDO '22

Acceptance Rates

Overall Acceptance Rate 14 of 28 submissions, 50%

