
Densely Annotated Photorealistic Virtual Dataset Generation for Abnormal Event Detection

  • Conference paper
  • In: Pattern Recognition. ICPR International Workshops and Challenges (ICPR 2021)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12664)


Abstract

Many timely computer vision problems, such as crowd event detection, individual and crowd activity recognition, person detection and re-identification, tracking, pose estimation, and segmentation, require pixel-level annotations. Producing these annotations demands significant manual effort and raises privacy concerns, since these problems intrinsically require detailed identifying information about individuals. To fill this gap in the field and address these issues, we introduce and make publicly available a photorealistic, synthetically generated dataset with detailed dense annotations. We also publish the tool we developed to generate it, which allows users not only to use our dataset but to expand upon it by building their own densely annotated videos for many other computer vision problems. We demonstrate the usefulness of the dataset with experiments on unsupervised crowd anomaly detection across varied scenarios, environments, lighting, and weather conditions. The dataset and its annotations also support numerous other computer vision problems, such as pose estimation, person detection, segmentation, re-identification and tracking, individual and crowd activity recognition, and abnormal event detection. We release the dataset as is, along with the source code and tool used to generate it, so that it can be modified and new data can be created. To our knowledge, there is currently no other photorealistic, densely annotated, synthetically generated dataset for abnormal crowd event detection, nor one that offers this flexibility by allowing the creation of new annotated data for many other computer vision problems. Dataset and source code available: https://github.com/RicoMontulet/GTA5Event.

Funded under the H2020 project MindSpaces, grant number 825079.
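To make concrete what dense per-person annotations enable, the sketch below scores frames for anomalous crowd motion using only person boxes and track IDs. This is a minimal illustrative baseline in Python, not the authors' method, and the annotation schema it assumes (a per-frame list of {"id", "box"} records) is hypothetical; consult the linked repository for the dataset's actual annotation format.

    # Minimal sketch: an unsupervised crowd anomaly score computed from
    # dense per-frame person annotations. The schema used here (per-frame
    # lists of {"id", "box"} dicts) and the z-score baseline are assumptions
    # for illustration, not the dataset's real format or the paper's method.
    import numpy as np

    def box_centers(boxes):
        # (N, 2) array of centers from [x1, y1, x2, y2] boxes.
        b = np.asarray(boxes, dtype=float).reshape(-1, 4)
        return np.stack([(b[:, 0] + b[:, 2]) / 2.0,
                         (b[:, 1] + b[:, 3]) / 2.0], axis=1)

    def frame_speeds(prev_frame, cur_frame):
        # Per-person displacement between consecutive frames, matched by track ID.
        prev = {p["id"]: c for p, c in
                zip(prev_frame, box_centers([p["box"] for p in prev_frame]))}
        speeds = []
        for p, c in zip(cur_frame, box_centers([p["box"] for p in cur_frame])):
            if p["id"] in prev:
                speeds.append(np.linalg.norm(c - prev[p["id"]]))
        return np.asarray(speeds)

    def anomaly_scores(frames, eps=1e-8):
        # Score each frame transition by how far its mean person speed
        # deviates from the sequence's overall motion statistics.
        means = []
        for a, b in zip(frames, frames[1:]):
            s = frame_speeds(a, b)
            means.append(float(s.mean()) if s.size else 0.0)
        means = np.asarray(means)
        return np.abs(means - means.mean()) / (means.std() + eps)

    # Tiny synthetic example: three people walk steadily; person 0 then
    # breaks into a sudden run, which should stand out in the scores.
    frames = [[{"id": i, "box": [10.0 * i + 2.0 * t, 5.0,
                                 10.0 * i + 2.0 * t + 4.0, 15.0]}
               for i in range(3)] for t in range(10)]
    for t in range(7, 10):
        frames[t][0]["box"] = [100.0 + 30.0 * (t - 6), 5.0,
                               104.0 + 30.0 * (t - 6), 15.0]

    print(np.round(anomaly_scores(frames), 2))

A real pipeline would read the dataset's annotation files rather than the synthetic frames above, and could draw on the richer signals the dataset also provides, such as poses and segmentation masks.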




Author information

Correspondence to Alexia Briassouli.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Montulet, R., Briassouli, A. (2021). Densely Annotated Photorealistic Virtual Dataset Generation for Abnormal Event Detection. In: Del Bimbo, A., et al. (eds.) Pattern Recognition. ICPR International Workshops and Challenges. ICPR 2021. Lecture Notes in Computer Science, vol. 12664. Springer, Cham. https://doi.org/10.1007/978-3-030-68799-1_1


  • DOI: https://doi.org/10.1007/978-3-030-68799-1_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-68798-4

  • Online ISBN: 978-3-030-68799-1

  • eBook Packages: Computer Science; Computer Science (R0)
