Abstract
We propose an evaluation framework that emulates poor image exposure conditions, low-range image sensors, lossy compression, and noise types common in robot vision. We present a rigorous evaluation of the robustness of several high-level image recognition models and investigate their performance under distinct image distortions. On one hand, the F1 score shows that most CNN models are only slightly affected by mild misexposure, strong compression, and Poisson noise. On the other hand, precision and accuracy drop sharply under extreme misexposure, impulse noise, or signal-dependent noise. The proposed framework yields a detailed evaluation over a variety of traditional image distortions, typically found in robotics and automated systems pipelines, and provides insights and guidance for further development. We propose a pipeline-based approach that mitigates the adverse effects of image distortions by including an image pre-processing step intended to estimate the proper exposure and reduce noise artifacts. Moreover, we explore the impact of image distortions on segmentation, a task that plays a primary role in autonomous navigation, obstacle avoidance, object picking, and other robotics tasks.
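To make the distortion families concrete, the following is a minimal NumPy sketch (not the authors' released framework; see the Code Availability section for that) of how misexposure, Poisson noise, and impulse noise can be emulated on a floating-point image in [0, 1]. The function names and parameter values here are illustrative assumptions, not part of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def adjust_exposure(img, gamma):
    """Simulate over-/under-exposure via gamma correction.

    For images in [0, 1], gamma > 1 darkens (underexposure)
    and gamma < 1 brightens (overexposure).
    """
    return np.clip(img, 0.0, 1.0) ** gamma

def add_poisson_noise(img, peak=30.0):
    """Signal-dependent (Poisson) noise, as produced by photon-counting
    sensors; smaller `peak` means a noisier, lower-light image."""
    return np.clip(rng.poisson(img * peak) / peak, 0.0, 1.0)

def add_impulse_noise(img, amount=0.05):
    """Salt-and-pepper (impulse) noise: a random fraction `amount`
    of pixels is forced to pure black (0) or pure white (1)."""
    out = img.copy()
    mask = rng.random(img.shape) < amount
    out[mask] = rng.integers(0, 2, size=mask.sum()).astype(img.dtype)
    return out

# Distort a mid-gray test image with each corruption type.
img = np.full((64, 64), 0.5)
dark = adjust_exposure(img, gamma=3.0)   # underexposed copy
noisy = add_poisson_noise(img)           # sensor-noise copy
corrupted = add_impulse_noise(img)       # impulse-noise copy
```

An evaluation harness in this style would sweep each corruption's severity parameter (gamma, peak, amount) and record per-severity F1, precision, and accuracy of the recognition model under test.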
Availability of Data and Material
The ImageNet dataset is available at http://www.image-net.org. Trained models for Mask-RCNN are available at https://github.com/matterport/Mask_RCNN. Trained models for object recognition are available at https://github.com/keras-team/keras.
Code Availability
The code to reproduce the contributions of this paper is available at GitHub https://git.io/JUgIz.
Acknowledgements
We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan X Pascal GPU used for this research.
Funding
This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.
Contributions
Cristiano Rafael Steffens, Lucas Ricardo Vieira Messias, Paulo Lilles Jorge Drews-Jr, and Silvia Silva da Costa Botelho all contributed equally to the work.
Ethics declarations
Conflict of Interests
The authors declare that they have no conflict of interest.
Consent to Publish
This research involved no human subjects.
Consent to Participate
This research required no study-specific approval by an ethics committee, as it involves neither humans nor animals.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Steffens, C.R., Messias, L.R.V., Drews-Jr, P.J.L. et al. On Robustness of Robotic and Autonomous Systems Perception. J Intell Robot Syst 101, 61 (2021). https://doi.org/10.1007/s10846-021-01334-0