Abstract
Machine learning requires data, but acquiring and labeling real-world data is challenging, expensive, and time-consuming. More importantly, it is nearly impossible to alter real data post-acquisition (e.g., to change the illumination of a room), making it very difficult to measure how specific properties of the data affect performance. In this paper, we present AI Playground (AIP), an open-source, Unreal Engine-based tool for generating and labeling virtual image data. With AIP, it is trivial to capture the same image under different conditions (e.g., fidelity or lighting) and with different ground truths (e.g., depth or surface normal values). AIP is easily extendable and can be used with or without code. To validate our proposed tool, we generated eight datasets that are identical except for their lighting and fidelity settings. We then trained deep neural networks to predict (1) depth values, (2) surface normals, or (3) object labels and assessed each network's intra- and cross-dataset performance. Among other insights, we verified that sensitivity to different settings is problem-dependent. We confirmed the findings of other studies that segmentation models are very sensitive to fidelity, but we also found that they are just as sensitive to lighting. In contrast, depth and normal estimation models seem to be less sensitive to fidelity or lighting and more sensitive to the structure of the image. Finally, we tested our trained depth-estimation networks on two real-world datasets and obtained results comparable to training on real data alone, confirming that our virtual environments are realistic enough for real-world tasks.
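To make the data-ablation workflow concrete, the following minimal Python sketch shows the kind of capture loop AIP automates: the same camera poses are rendered under every lighting/fidelity combination, and each frame is stored with several per-pixel ground truths. The `scene` handle and all of its methods (`set_lighting`, `set_fidelity`, `move_camera`, `render`), as well as the preset names and the 2 × 4 condition grid, are hypothetical placeholders rather than AIP's actual interface; see our GitHub page for the real API.

```python
from itertools import product
from pathlib import Path

# Illustrative presets only: one plausible 2 x 4 grid yielding eight datasets.
LIGHTING = ("day", "night")                    # assumed lighting presets
FIDELITY = ("low", "medium", "high", "epic")   # assumed fidelity presets
GROUND_TRUTHS = ("rgb", "depth", "normal", "segmentation")

def capture_ablation_datasets(scene, camera_poses, out_dir="datasets"):
    """Render every camera pose under every (lighting, fidelity) condition,
    saving an RGB image plus per-pixel ground truths for each frame.

    `scene` is a hypothetical handle to the running virtual environment;
    all of its methods below are illustrative placeholders, not AIP's API.
    """
    for lighting, fidelity in product(LIGHTING, FIDELITY):
        scene.set_lighting(lighting)            # placeholder setter
        scene.set_fidelity(fidelity)            # placeholder setter
        for i, pose in enumerate(camera_poses):
            scene.move_camera(pose)             # identical poses across conditions
            for gt in GROUND_TRUTHS:
                frame = scene.render(mode=gt)   # same view, different ground truth
                path = Path(out_dir, f"{lighting}_{fidelity}", gt)
                path.mkdir(parents=True, exist_ok=True)
                frame.save(path / f"{i:05d}.png")
```

Because every condition shares the same camera poses and scene content, any change in a trained network's performance across the resulting datasets can be attributed to the varied setting alone, which is exactly what a data ablation requires.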
Notes
1. Photo-manipulation techniques can be used to alter images, but they are either non-specific in their effect (e.g., reducing brightness) or introduce unwanted artifacts. They also require significant human effort.
2. Source code, documentation, images, and high-definition figures are available on our GitHub page: https://git.io/JJkhQ.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Mousavi, M., Khanal, A., Estrada, R. (2020). AI Playground: Unreal Engine-Based Data Ablation Tool for Deep Learning. In: Bebis, G., et al. (eds.) Advances in Visual Computing. ISVC 2020. Lecture Notes in Computer Science, vol. 12510. Springer, Cham. https://doi.org/10.1007/978-3-030-64559-5_41
Print ISBN: 978-3-030-64558-8
Online ISBN: 978-3-030-64559-5