DOI: 10.1145/3522784.3522785

Exploring Cross-fusion and Curriculum Learning for Multi-modal Human Detection on Drones

Published: 23 June 2022

Abstract

In applications ranging from warehouse management to search and rescue, drones will need to operate in the vicinity of human agents. In such situations, robust and fail-safe human detection by drones must be provided. However, human detection systems used on drones are currently based on single imaging cameras, although a growing number of works investigate more robust detection schemes via sensor fusion. In the drone context, the fusion of standard RGB and event-based cameras has emerged, while in the automotive context, the fusion of RGB cameras with radar has been proposed for utmost robustness to environmental conditions. In this paper, our aim is to initiate the investigation of RGB, event-based camera, and radar fusion. First, we acquire a novel dataset for the task of people detection in an indoor, industrial setting by mounting the sensor-fusion suite on a drone. Then, we propose a baseline convolutional neural network (CNN) architecture augmented with cross-fusion highways for sensor fusion and people detection. To train the network, we propose a novel multi-modal curriculum learning procedure and demonstrate that our method (termed SAUL) greatly enhances the robustness of the system towards hard RGB failures (in terms of peak F1 score) and provides a significant gain in detection performance (in terms of peak F1 score) compared to the BlackIn procedure previously proposed for cross-fusion network training. Finally, we report the performance of our system through precision-recall curve analysis and perform additional ablation studies to shed light on the key aspects of our system.
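The detection gains above are reported as peak F1 scores taken over the precision-recall curve. As a minimal illustrative sketch (not the authors' code; the per-detection confidence scores and ground-truth labels below are hypothetical), peak F1 can be computed by sweeping a confidence threshold and keeping the best harmonic mean of precision and recall:

```python
import numpy as np

def peak_f1(scores, labels, num_thresholds=100):
    """Sweep detection thresholds and return the peak F1 score.

    scores: per-detection confidence values in [0, 1]
    labels: 1 for a true person detection, 0 for a false one
    """
    best = 0.0
    positives = labels.sum()  # total ground-truth people
    for t in np.linspace(0.0, 1.0, num_thresholds):
        pred = scores >= t          # detections kept at this threshold
        tp = np.sum(pred & (labels == 1))
        fp = np.sum(pred & (labels == 0))
        fn = positives - tp
        if tp == 0:
            continue                # F1 undefined / zero, skip
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        best = max(best, 2 * precision * recall / (precision + recall))
    return best
```

Each threshold yields one (precision, recall) operating point; the peak F1 is simply the best point on that curve, which makes it a threshold-free summary suitable for comparing training procedures such as SAUL and BlackIn.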



Published In

DroneSE and RAPIDO: System Engineering for constrained embedded systems
January 2022
58 pages
ISBN:9781450395663
DOI:10.1145/3522784

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  • curriculum learning
  • deep learning
  • drones
  • people detection
  • sensor fusion

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

DroneSE and RAPIDO '22

Acceptance Rates

Overall Acceptance Rate 14 of 28 submissions, 50%

