Abstract
Real-world advanced robotics applications cannot be conceived without onboard visual perception. By perception we refer not only to image acquisition but, more importantly, to the extraction of the information required to carry out the robotic task. This paper presents the computer vision system developed by the team of the University of Catania for the Mohamed Bin Zayed International Robotics Challenge 2020. The two challenges required: 1) developing a team of drones to grasp a ball attached to another flying vehicle and to pierce a set of randomly placed balloons, and 2) building a wall using a mobile manipulator and a flying vehicle. Several aspects were taken into account to obtain a real-time and robust system, both crucial features in demanding settings such as those posed by the challenges. Experimental results achieved in the real-world setting are reported.
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Battiato, S. et al. (2023). The UNICT-TEAM Vision Modules for the Mohamed Bin Zayed International Robotics Challenge 2020. In: Gupta, D., Bhurchandi, K., Murala, S., Raman, B., Kumar, S. (eds) Computer Vision and Image Processing. CVIP 2022. Communications in Computer and Information Science, vol 1776. Springer, Cham. https://doi.org/10.1007/978-3-031-31407-0_53
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-31406-3
Online ISBN: 978-3-031-31407-0