Towards Automated Testing and Robustification by Semantic Adversarial Data Generation

  • Conference paper
Computer Vision – ECCV 2020 (ECCV 2020)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12347)

Abstract

Widespread application of computer vision systems in real-world tasks is currently hindered by their unexpected behavior on unseen examples. This stems from the limitations of empirical testing on finite test sets and the lack of systematic methods to identify the breaking points of a trained model. In this work, we propose semantic adversarial editing, a method to synthesize plausible but difficult data points on which our target model breaks down. We achieve this with a differentiable object synthesizer that can change an object's appearance while retaining its pose. Constrained adversarial optimization of object appearance through this synthesizer produces rare or difficult versions of an object that fool the target object detector. Experiments show that our approach effectively synthesizes difficult test data, dropping the performance of the YoloV3 detector by more than 20 mAP points by changing the appearance of a single object, and discovering failure modes of the model. The generated semantic adversarial data can also be used to robustify the detector through data augmentation, consistently improving its performance on both standard and out-of-dataset-distribution test sets, across three different datasets.
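The core idea described above can be sketched as a toy example: adversarially optimize an appearance code through a differentiable synthesizer so that a detector's confidence drops, while a norm constraint keeps the edit plausible and the pose untouched. The linear "synthesizer" and logistic "detector" below are illustrative stand-ins, not the paper's models; all names and parameters here are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# Toy stand-ins for the paper's components:
# - "synthesizer": a fixed linear map from appearance code z to object features
# - "detector": a logistic scorer returning a detection confidence
W = rng.standard_normal((8, 4))   # synthesizer weights
w = rng.standard_normal(8)        # detector weights
b = 0.0

def detect_conf(z):
    """Detection confidence for an object rendered from appearance code z."""
    return sigmoid(w @ (W @ z) + b)

def semantic_attack(z0, eps=0.5, lr=0.1, steps=200):
    """Projected gradient descent on the appearance code z to minimize
    detector confidence, constrained to an eps-ball around the original
    appearance (pose is untouched by construction: only z changes)."""
    z = z0.copy()
    for _ in range(steps):
        p = sigmoid(w @ (W @ z) + b)
        # d(conf)/dz = p * (1 - p) * W^T w; step downhill to reduce confidence
        grad = p * (1.0 - p) * (W.T @ w)
        z = z - lr * grad
        # project back onto the plausibility constraint ||z - z0|| <= eps
        delta = z - z0
        n = np.linalg.norm(delta)
        if n > eps:
            z = z0 + delta * (eps / n)
    return z

z0 = rng.standard_normal(4)
z_adv = semantic_attack(z0)
print(detect_conf(z0), "->", detect_conf(z_adv))
```

In the paper's setting, the synthesizer is a learned generative model and the detector is YoloV3; the same loop structure applies, with the gradient obtained by backpropagating the detection loss through both networks.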



Author information

Correspondence to Rakshith Shetty.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 13429 KB)


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Shetty, R., Fritz, M., Schiele, B. (2020). Towards Automated Testing and Robustification by Semantic Adversarial Data Generation. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12347. Springer, Cham. https://doi.org/10.1007/978-3-030-58536-5_29

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-58536-5_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58535-8

  • Online ISBN: 978-3-030-58536-5

  • eBook Packages: Computer Science (R0)
