Score to Learn: A Comparative Analysis of Scoring Functions for Active Learning in Robotics

Grigoletto, Riccardo; Maiettini, Elisa; Natale, Lorenzo

doi:10.1007/978-3-030-87156-7_5

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12899)

Included in the following conference series:

International Conference on Computer Vision Systems

Abstract

Accurately detecting objects in unconstrained settings is crucial for robotic agents, such as humanoids, that operate in ever-changing environments. Current deep-learning-based methods achieve remarkable performance on this task on general-purpose benchmarks and are therefore appealing for robotics. However, their high accuracy comes at the price of computationally expensive off-line training and extensive human labeling. These aspects make their adoption in robotics challenging, since they prevent rapid model adaptation and re-training to novel tasks and conditions. Nonetheless, robots, and especially humanoids, are embodied in the surrounding environment and therefore have access to streams of data from their sensors that, albeit without supervision, might contain information about the objects of interest. The Weakly-supervised Learning (WSL) framework offers a set of tools to tackle these problems in general-purpose Computer Vision. In this work, we investigate the adoption of these tools in the robotics domain, which is still at a preliminary stage. We build on previous work, studying the impact of different so-called scoring functions, which are at the core of WSL methods, on Pascal VOC, a general-purpose dataset, and on a prototypical robotic setting, i.e. the iCubWorld-Transformations dataset.
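To make the notion of a scoring function concrete, the following is a minimal sketch, not the chapter's implementation. It assumes three uncertainty measures that are standard in the active learning literature (least confidence, margin, and entropy); the abstract does not state which functions the chapter actually compares. The helper `split_by_score`, its threshold, and the sample identifiers are hypothetical names introduced only for illustration.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple


def least_confidence(probs: Sequence[float]) -> float:
    """Return 1 - max(probs): close to 0 when one class clearly dominates."""
    return 1.0 - max(probs)


def margin_uncertainty(probs: Sequence[float]) -> float:
    """Return 1 - (top1 - top2): close to 1 when the two best classes tie.

    Requires at least two class probabilities.
    """
    top1, top2 = sorted(probs, reverse=True)[:2]
    return 1.0 - (top1 - top2)


def entropy(probs: Sequence[float]) -> float:
    """Return the Shannon entropy (in nats) of the class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def split_by_score(
    predictions: Dict[str, Sequence[float]],
    score_fn: Callable[[Sequence[float]], float],
    threshold: float,
) -> Tuple[List[str], List[str]]:
    """Split sample ids into (ask_human, pseudo_label) lists.

    Assumption for illustration: samples whose score reaches the threshold
    are sent to a human annotator, the rest keep the model's own prediction.
    """
    ask_human: List[str] = []
    pseudo_label: List[str] = []
    for sample_id, probs in predictions.items():
        if score_fn(probs) >= threshold:
            ask_human.append(sample_id)
        else:
            pseudo_label.append(sample_id)
    return ask_human, pseudo_label


# Hypothetical per-detection class probabilities from an object detector.
example_predictions = {
    "frame_001": [0.95, 0.03, 0.02],  # confident prediction
    "frame_002": [0.40, 0.35, 0.25],  # ambiguous prediction
}
ask_human, pseudo_label = split_by_score(
    example_predictions, least_confidence, threshold=0.4
)
print("ask human:", ask_human)
print("pseudo-label:", pseudo_label)
```

Swapping `least_confidence` for `margin_uncertainty` or `entropy` changes which samples cross the threshold, which is the kind of effect a comparison of scoring functions would measure; the threshold would need to be re-tuned for each function because their value ranges differ.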

Notes

1. https://github.com/RiccardoGrigoletto/SSM-Pytorch
2. The models have been trained on a single GPU Nvidia TESLA K40 and Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10 GHz.

Author information

Authors and Affiliations

Humanoid Sensing and Perception, Istituto Italiano di Tecnologia, Genoa, Italy
Riccardo Grigoletto, Elisa Maiettini & Lorenzo Natale

Corresponding author

Correspondence to Riccardo Grigoletto.

Editor information

Editors and Affiliations

TU Wien, Vienna, Austria
Markus Vincze
University of Technology Sydney, Sydney, Australia
Timothy Patten
University of California San Diego, La Jolla, CA, USA
Henrik I Christensen
Technical University of Denmark, Kongens Lyngby, Denmark
Lazaros Nalpantidis
Hong Kong University of Science and Technology, Hong Kong, China
Ming Liu

About this paper

Cite this paper

Grigoletto, R., Maiettini, E., Natale, L. (2021). Score to Learn: A Comparative Analysis of Scoring Functions for Active Learning in Robotics. In: Vincze, M., Patten, T., Christensen, H.I., Nalpantidis, L., Liu, M. (eds) Computer Vision Systems. ICVS 2021. Lecture Notes in Computer Science, vol 12899. Springer, Cham. https://doi.org/10.1007/978-3-030-87156-7_5

DOI: https://doi.org/10.1007/978-3-030-87156-7_5
Published: 19 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87155-0
Online ISBN: 978-3-030-87156-7
eBook Packages: Computer Science; Computer Science (R0)
