
Densely Annotated Photorealistic Virtual Dataset Generation for Abnormal Event Detection

  • Conference paper
  • In: Pattern Recognition. ICPR International Workshops and Challenges (ICPR 2021)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12664)


Abstract

Many timely computer vision problems, such as crowd event detection, individual and crowd activity recognition, person detection and re-identification, tracking, pose estimation, and segmentation, require pixel-level annotations. Producing these annotations demands significant manual effort and raises privacy concerns, since these problems intrinsically require detailed identifying information about individuals. To fill this gap in the field and address these issues, we introduce and make publicly available a photorealistic, synthetically generated dataset with detailed dense annotations. We also publish the tool we developed to generate it, which allows users not only to use our dataset but to expand upon it by building their own densely annotated videos for many other computer vision problems. We demonstrate the usefulness of the dataset with experiments on unsupervised crowd anomaly detection across varied scenarios, environments, lighting, and weather conditions. The dataset and its annotations also support numerous other computer vision problems, such as pose estimation, person detection, segmentation, re-identification and tracking, individual and crowd activity recognition, and abnormal event detection. We release the dataset as is, along with the source code and tool used to generate it, so that it can be modified and new data can be created. To our knowledge, there is currently no other photorealistic, densely annotated, synthetically generated dataset for abnormal crowd event detection, nor one that offers this flexibility by allowing the creation of new annotated data for many other computer vision problems. Dataset and source code available: https://github.com/RicoMontulet/GTA5Event.

Funded under the H2020 project MindSpaces, grant number 825079.
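To make concrete what dense per-person annotations enable, the sketch below scores frames for anomalous crowd motion using only person boxes and track IDs. This is a minimal illustrative baseline in Python, not the authors' method, and the annotation schema it assumes (a per-frame list of {"id", "box"} records) is hypothetical; consult the linked repository for the dataset's actual annotation format.

    # Minimal sketch: an unsupervised crowd anomaly score computed from
    # dense per-frame person annotations. The schema used here (per-frame
    # lists of {"id", "box"} dicts) and the z-score baseline are assumptions
    # for illustration, not the dataset's real format or the paper's method.
    import numpy as np

    def box_centers(boxes):
        # (N, 2) array of centers from [x1, y1, x2, y2] boxes.
        b = np.asarray(boxes, dtype=float).reshape(-1, 4)
        return np.stack([(b[:, 0] + b[:, 2]) / 2.0,
                         (b[:, 1] + b[:, 3]) / 2.0], axis=1)

    def frame_speeds(prev_frame, cur_frame):
        # Per-person displacement between consecutive frames, matched by track ID.
        prev = {p["id"]: c for p, c in
                zip(prev_frame, box_centers([p["box"] for p in prev_frame]))}
        speeds = []
        for p, c in zip(cur_frame, box_centers([p["box"] for p in cur_frame])):
            if p["id"] in prev:
                speeds.append(np.linalg.norm(c - prev[p["id"]]))
        return np.asarray(speeds)

    def anomaly_scores(frames, eps=1e-8):
        # Score each frame transition by how far its mean person speed
        # deviates from the sequence's overall motion statistics.
        means = []
        for a, b in zip(frames, frames[1:]):
            s = frame_speeds(a, b)
            means.append(float(s.mean()) if s.size else 0.0)
        means = np.asarray(means)
        return np.abs(means - means.mean()) / (means.std() + eps)

    # Tiny synthetic example: three people walk steadily; person 0 then
    # breaks into a sudden run, which should stand out in the scores.
    frames = [[{"id": i, "box": [10.0 * i + 2.0 * t, 5.0,
                                 10.0 * i + 2.0 * t + 4.0, 15.0]}
               for i in range(3)] for t in range(10)]
    for t in range(7, 10):
        frames[t][0]["box"] = [100.0 + 30.0 * (t - 6), 5.0,
                               104.0 + 30.0 * (t - 6), 15.0]

    print(np.round(anomaly_scores(frames), 2))

A real pipeline would read the dataset's annotation files rather than the synthetic frames above, and could draw on the richer signals the dataset also provides, such as poses and segmentation masks.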




Author information

Correspondence to Alexia Briassouli.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Montulet, R., Briassouli, A. (2021). Densely Annotated Photorealistic Virtual Dataset Generation for Abnormal Event Detection. In: Del Bimbo, A., et al. (eds.) Pattern Recognition. ICPR International Workshops and Challenges. ICPR 2021. Lecture Notes in Computer Science, vol. 12664. Springer, Cham. https://doi.org/10.1007/978-3-030-68799-1_1


  • DOI: https://doi.org/10.1007/978-3-030-68799-1_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-68798-4

  • Online ISBN: 978-3-030-68799-1

  • eBook Packages: Computer Science; Computer Science (R0)
