Abstract
Accurately detecting objects in unconstrained settings is crucial for robotic agents, such as humanoids, that operate in ever-changing environments. Current deep-learning-based methods achieve remarkable performance on this task on general-purpose benchmarks and are therefore appealing for robotics. However, their high accuracy comes at the price of computationally expensive off-line training and extensive human labeling. These aspects make their adoption in robotics challenging, since they prevent rapid model adaptation and re-training for novel tasks and conditions. Nonetheless, robots, and especially humanoids, are embodied in the surrounding environment and thus have access to streams of data from their sensors that, even without supervision, may contain information about the objects of interest. The Weakly-supervised Learning (WSL) framework offers a set of tools to tackle these problems in general-purpose Computer Vision. In this work, we investigate the adoption of these tools in the robotics domain, which is still at a preliminary stage. We build on previous work, studying the impact of different so-called scoring functions, which are at the core of WSL methods, on Pascal VOC, a general-purpose dataset, and on a prototypical robotic setting, i.e., the iCubWorld-Transformations dataset.
Notes
The models were trained on a single NVIDIA Tesla K40 GPU and an Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10 GHz.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Grigoletto, R., Maiettini, E., Natale, L. (2021). Score to Learn: A Comparative Analysis of Scoring Functions for Active Learning in Robotics. In: Vincze, M., Patten, T., Christensen, H.I., Nalpantidis, L., Liu, M. (eds) Computer Vision Systems. ICVS 2021. Lecture Notes in Computer Science, vol 12899. Springer, Cham. https://doi.org/10.1007/978-3-030-87156-7_5
DOI: https://doi.org/10.1007/978-3-030-87156-7_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87155-0
Online ISBN: 978-3-030-87156-7
eBook Packages: Computer Science, Computer Science (R0)