Abstract
Robots collaborating with humans in realistic environments need to detect the tools that can be used and manipulated. However, no available dataset or study addresses this challenge in real settings. In this paper, we fill this gap with a dataset for detecting farming, gardening, office, stonemasonry, vehicle, woodworking, and workshop tools. The scenes in our dataset are snapshots of sophisticated environments, with or without humans using the tools. These scenes introduce several challenges for object detection, including the small scale of the tools, their articulated nature, occlusion, and inter-class similarity. Moreover, we train and compare several state-of-the-art deep object detectors (Faster R-CNN, Cascade R-CNN, YOLOv3, RetinaNet, RepPoints, and FreeAnchor) on our dataset. We observe that the detectors struggle especially with small-scale tools and with tools that are visually similar to parts of other tools. In addition, we provide a novel, practical safety use case: a deep network that checks whether a human worker is wearing a safety helmet, mask, glasses, and gloves. With the dataset, the code, and the trained models, our work provides a basis for further research on tools and their use in robotics applications.
Notes
1. Some of these objects would more accurately be called equipment. However, since they provide similar functionality (being used by a human or a robot while performing a task), we use the term tool for all such objects, for the sake of simplicity.
References
Abelha, P., Guerin, F.: Learning how a tool affords by simulating 3D models from the web. In: IROS (2017)
Cai, Z., Vasconcelos, N.: Cascade R-CNN: delving into high quality object detection. arXiv:1712.00726 (2017)
Calli, B., Walsman, A., Singh, A., Srinivasa, S., Abbeel, P., Dollar, A.M.: Benchmarking in manipulation research: the YCB object and model set and benchmarking protocols. arXiv:1502.03143 (2015)
Cao, Z., Hidalgo, G., Simon, T., Wei, S.E., Sheikh, Y.: OpenPose: realtime multi-person 2D pose estimation using part affinity fields. arXiv:1812.08008 (2018)
Chen, K., Wang, J., Pang, J., et al.: MMDetection: open MMLab detection toolbox and benchmark. arXiv:1906.07155 (2019)
Damen, D., et al.: Scaling egocentric vision: the dataset. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11208, pp. 753–771. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01225-0_44
Dehban, A., Jamone, L., Kampff, A.R., Santos-Victor, J.: A moderately large size dataset to learn visual affordances of objects and tools using iCub humanoid robot. In: ECCV Workshop on Action and Anticipation for Visual Learning (2016)
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: CVPR (2009)
Duan, K., Bai, S., Xie, L., Qi, H., Huang, Q., Tian, Q.: CenterNet: keypoint triplets for object detection. In: ICCV (2019)
Dutta, A., Gupta, A., Zisserman, A.: VGG image annotator (VIA), version 2.0.5. http://www.robots.ox.ac.uk/vgg/software/via/ (2016). Accessed 27 Feb 2019
Everingham, M., Van Gool, L., Williams, C.K., Winn, J., Zisserman, A.: The PASCAL visual object classes (VOC) challenge. IJCV 88(2), 303–338 (2010)
Girshick, R.: Fast R-CNN. In: ICCV (2015)
Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: CVPR (2014)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR (2016)
Hinterstoisser, S., et al.: Model based training, detection and pose estimation of texture-less 3D objects in heavily cluttered scenes. In: ACCV (2012)
Kemp, C.C., Edsinger, A.: Robot manipulation of human tools: autonomous detection and control of task relevant features. In: ICDL (2006)
Li, K., Zhao, X., Bian, J., Tan, M.: Automatic safety helmet wearing detection. arXiv:1802.00264 (2018)
Lin, T.Y., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. In: ICCV (2017)
Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
Liu, W., et al.: SSD: single shot multibox detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 21–37. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46448-0_2
Mar, T., Natale, L., Tikhanoff, V.: A framework for fast, autonomous and reliable tool incorporation on iCub. Front. Robot. AI 5, 98 (2018)
Mar, T., Tikhanoff, V., Metta, G., Natale, L.: Multi-model approach based on 3D functional features for tool affordance learning in robotics. In: Humanoids (2015)
Mar, T., Tikhanoff, V., Natale, L.: What can I do with this tool? Self-supervised learning of tool affordances from their 3-D geometry. TCDS 10(3), 595–610 (2018)
Myers, A., Teo, C.L., Fermüller, C., Aloimonos, Y.: Affordance detection of tool parts from geometric features. In: ICRA (2015)
Nath, N.D., Behzadan, A.H., Paal, S.G.: Deep learning for site safety: real-time detection of personal protective equipment. Autom. Constr. 112, 103085 (2020)
Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: CVPR (2016)
Redmon, J., Farhadi, A.: YOLOv3: an incremental improvement. arXiv:1804.02767 (2018)
Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: NIPS (2015)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556 (2014)
Sun, M., Bradski, G., Xu, B.-X., Savarese, S.: Depth-encoded Hough voting for joint object detection and shape recovery. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010. LNCS, vol. 6315, pp. 658–671. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15555-0_48
Vaskevicius, N., Pathak, K., Ichim, A., Birk, A.: The jacobs robotics approach to object recognition and localization in the context of the ICRA’11 solutions in perception challenge. In: ICRA (2012)
Wu, J., Cai, N., Chen, W., Wang, H., Wang, G.: Automatic detection of hardhats worn by construction personnel: a deep learning approach and benchmark dataset. Autom. Constr. 106, 102894 (2019)
Xie, S., Girshick, R., Dollár, P., Tu, Z., He, K.: Aggregated residual transformations for deep neural networks. In: CVPR (2017)
Yang, Z., Liu, S., Hu, H., Wang, L., Lin, S.: RepPoints: point set representation for object detection. In: ICCV (2019)
Zhang, X., Wan, F., Liu, C., Ji, R., Ye, Q.: FreeAnchor: learning to match anchors for visual object detection. In: NeurIPS (2019)
Acknowledgment
This work was supported by the Scientific and Technological Research Council of Turkey (TÜBİTAK) through the project "CIRAK: Compliant robot manipulator support for montage workers in factories" (project no. 117E002). The numerical calculations reported in this paper were partially performed at TÜBİTAK ULAKBIM, High Performance and Grid Computing Center (TRUBA resources). We would like to thank Erfan Khalaji for his contributions to an earlier version of this work.
© 2020 Springer Nature Switzerland AG
Cite this paper
Kurnaz, F.C., Hocaoğlu, B., Yılmaz, M.K., Sülo, İ., Kalkan, S. (2020). ALET (Automated Labeling of Equipment and Tools): A Dataset for Tool Detection and Human Worker Safety Detection. In: Bartoli, A., Fusiello, A. (eds) Computer Vision – ECCV 2020 Workshops. ECCV 2020. Lecture Notes in Computer Science, vol 12538. Springer, Cham. https://doi.org/10.1007/978-3-030-66823-5_22
DOI: https://doi.org/10.1007/978-3-030-66823-5_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-66822-8
Online ISBN: 978-3-030-66823-5