Open Questions in Testing of Learned Computer Vision Functions for Automated Driving

  • Conference paper

In: Computer Safety, Reliability, and Security (SAFECOMP 2019)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 11699)

Abstract

Vision is an important sensing modality in automated driving. Deep learning-based approaches have gained popularity for computer vision (CV) tasks such as semantic segmentation and object detection. However, the black-box nature of deep neural networks (DNNs) is a challenge for practical software verification. With this paper, we want to initiate a discussion in the academic community about research questions concerning the software testing of DNNs for safety-critical CV tasks. To this end, we provide an overview of related work from several domains, including software testing, machine learning, and computer vision, and derive a set of open research questions to start a discussion between the fields.


Notes

  1.

    Many verification techniques have so far only been applied to image classification. While this simpler CV task is not relevant for our application, the corresponding methods are good starting points for further study.

  2.

    We discuss first steps in this direction in the context of synthetic data in Sec. 2.3.

  3.

    With the discriminative model we could only dismiss generated irrelevant test inputs, while with a generative model, we could directly generate relevant tests.
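The distinction in note 3 can be sketched in code. The following is a toy illustration only: the "test inputs" are numbers and relevance is a hard-coded range, whereas in the paper's setting inputs would be images and both models would be learned. The names `is_relevant` and `sample_relevant` are hypothetical stand-ins for a discriminative and a generative model, respectively.

```python
import random

# Hypothetical discriminative model: can only judge a candidate after the fact.
def is_relevant(x):
    return 0.4 <= x <= 0.6

# Hypothetical generative model: produces relevant test inputs directly.
def sample_relevant():
    return random.uniform(0.4, 0.6)

random.seed(0)

# Discriminative route: generate broadly, then dismiss irrelevant candidates.
candidates = [random.random() for _ in range(1000)]
filtered = [x for x in candidates if is_relevant(x)]

# Generative route: every sample is usable as a relevant test input.
generated = [sample_relevant() for _ in range(len(filtered))]

print(len(filtered), "of", len(candidates), "random candidates were relevant")
```

The point of the sketch is the wasted effort in the first route: most randomly generated candidates are dismissed, while the generative model spends all of its samples on relevant tests.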


Author information

Corresponding author: Christoph Gladisch.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Woehrle, M., Gladisch, C., Heinzemann, C. (2019). Open Questions in Testing of Learned Computer Vision Functions for Automated Driving. In: Romanovsky, A., Troubitsyna, E., Gashi, I., Schoitsch, E., Bitsch, F. (eds) Computer Safety, Reliability, and Security. SAFECOMP 2019. Lecture Notes in Computer Science, vol 11699. Springer, Cham. https://doi.org/10.1007/978-3-030-26250-1_27

  • DOI: https://doi.org/10.1007/978-3-030-26250-1_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-26249-5

  • Online ISBN: 978-3-030-26250-1
