Abstract
We propose a new benchmark for analyzing causality in traffic accident videos by decomposing an accident into a pair of events, cause and effect. We collect videos containing traffic accident scenes and annotate the cause and effect events of each accident with their temporal intervals and semantic labels; such annotations are not available in existing datasets for the accident anticipation task. Our dataset has two advantages over existing ones, both of which facilitate practical research on causality analysis. First, the decomposition of an accident into cause and effect events provides atomic cues for reasoning in a complex environment and planning future actions. Second, predicting the cause and effect of an accident makes a system more interpretable to humans, which mitigates the ambiguity of legal liability among the agents involved in the accident. Using the proposed dataset, we analyze accidents by localizing the temporal intervals of their causes and effects and by classifying the semantic labels of the accidents. The dataset and the implementations of the baseline models are available in the code repository (https://github.com/tackgeun/CausalityInTrafficAccident).
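To make the annotation scheme concrete, the following is a minimal sketch of how one accident could be represented as a cause event and an effect event, each with a temporal interval and a semantic label, and how a predicted interval could be scored against the annotation. The class names, field names, example labels, and timestamps are hypothetical and are not taken from the released dataset. Temporal intersection over union (IoU) is used here because it is a common measure in temporal localization; the abstract does not state which metric the benchmark uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One annotated event: a temporal interval in seconds plus a semantic label."""
    start: float
    end: float
    label: str


@dataclass(frozen=True)
class AccidentAnnotation:
    """An accident decomposed into a cause event and an effect event."""
    cause: Event
    effect: Event


def temporal_iou(a: Event, b: Event) -> float:
    """Intersection over union of two temporal intervals (labels are ignored)."""
    intersection = max(0.0, min(a.end, b.end) - max(a.start, b.start))
    union = (a.end - a.start) + (b.end - b.start) - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical ground truth and prediction for one video (illustrative values only).
truth = AccidentAnnotation(
    cause=Event(2.0, 6.0, "lane change"),
    effect=Event(6.0, 10.0, "rear-end collision"),
)
predicted_cause = Event(3.0, 7.0, "lane change")

# 3 s of overlap over a 5 s union gives an IoU of 0.6.
print(temporal_iou(truth.cause, predicted_cause))
```

Keeping cause and effect as separate events means each can be localized and classified independently, which mirrors the decomposition described in the abstract.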
Notes
- 1. The 2004 General Estimates System (GES) crash database [1] contains a nationally representative sample of police reports covering all types of vehicle crashes.
- 2.
References
National automotive sampling system (NASS) general estimates system (GES) analytical user’s manual, pp. 1988–2004 (2005). https://one.nhtsa.gov/Data/National-Automotive-Sampling-System-(NASS)
Aliakbarian, M.S., Saleh, F.S., Salzmann, M., Fernando, B., Petersson, L., Andersson, L.: VIENA²: a driving anticipation dataset. In: Jawahar, C.V., Li, H., Mori, G., Schindler, K. (eds.) ACCV 2018. LNCS, vol. 11361, pp. 449–466. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-20887-5_28
Buch, S., Escorcia, V., Shen, C., Ghanem, B., Niebles, J.C.: SST: single-stream temporal action proposals. In: CVPR (2017)
Carreira, J., Zisserman, A.: Quo vadis, action recognition? A new model and the Kinetics dataset. In: CVPR (2017)
Chan, F.-H., Chen, Y.-T., Xiang, Y., Sun, M.: Anticipating accidents in dashcam videos. In: Lai, S.-H., Lepetit, V., Nishino, K., Sato, Y. (eds.) ACCV 2016. LNCS, vol. 10114, pp. 136–153. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-54190-7_9
Chao, Y.W., Vijayanarasimhan, S., Seybold, B., Ross, D.A., Deng, J., Sukthankar, R.: Rethinking the Faster R-CNN architecture for temporal action localization. In: CVPR (2018)
Cho, K., van Merrienboer, B., Gulcehre, C., Bougares, F., Schwenk, H., Bengio, Y.: Learning phrase representations using RNN encoder-decoder for statistical machine translation. In: EMNLP (2014)
Farha, Y.A., Gall, J.: MS-TCN: multi-stage temporal convolutional network for action segmentation. In: CVPR (2019)
Feichtenhofer, C., Pinz, A., Zisserman, A.: Convolutional two-stream network fusion for video action recognition. In: CVPR (2016)
Hara, K., Kataoka, H., Satoh, Y.: Can spatiotemporal 3D CNNs retrace the history of 2D CNNs and ImageNet? In: CVPR, pp. 6546–6555 (2018)
Herzig, R., et al.: Spatio-temporal action graph networks. In: ICCVW (2019)
Kataoka, H., Suzuki, T., Oikawa, S., Matsui, Y., Satoh, Y.: Drive video analysis for the detection of traffic near-miss incidents. In: ICRA (2018)
Kim, H., Lee, K., Hwang, G., Suh, C.: Crash to not Crash: learn to identify dangerous vehicles using a simulator. In: AAAI (2019)
Lea, C., Flynn, M.D., Vidal, R., Reiter, A., Hager, G.D.: Temporal convolutional networks for action segmentation and detection. In: CVPR (2017)
Lebeda, K., Hadfield, S., Bowden, R.: Exploring causal relationships in visual object tracking. In: ICCV (2015)
Lopez-Paz, D., Nishihara, R., Chintala, S., Schölkopf, B., Bottou, L.: Discovering causal signals in images. In: CVPR (2017)
Najm, W.G., Smith, J.D., Yanagisawa, M.: Pre-crash scenario typology for crash avoidance research (2007). https://rosap.ntl.bts.gov/view/dot/6281
Pickup, L.C., et al.: Seeing the arrow of time. In: CVPR (2014)
Simonyan, K., Zisserman, A.: Two-stream convolutional networks for action recognition in videos. In: NIPS (2014)
Suzuki, T., Kataoka, H., Aoki, Y., Satoh, Y.: Anticipating traffic accidents with adaptive loss and large-scale incident db. In: CVPR (2018)
Tran, D., Bourdev, L., Fergus, R., Torresani, L., Paluri, M.: Learning spatiotemporal features with 3D convolutional networks. In: ICCV (2015)
Wang, L., et al.: Temporal segment networks: towards good practices for deep action recognition. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9912, pp. 20–36. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46484-8_2
Wei, D., Lim, J., Zisserman, A., Freeman, W.T.: Learning and using the arrow of time. In: CVPR (2018)
Xu, H., Das, A., Saenko, K.: R-C3D: region convolutional 3D network for temporal activity detection. In: ICCV (2017)
Yao, Y., Xu, M., Wang, Y., Crandall, D.J., Atkins, E.M.: Unsupervised traffic accident detection in first-person videos. In: IROS (2019)
Zeng, K.H., Chou, S.H., Chan, F.H., Niebles, J.C., Sun, M.: Agent-centric risk assessment: accident anticipation and risky region localization. In: CVPR (2017)
Zeng, R., et al.: Graph convolutional networks for temporal action localization. In: ICCV (2019)
Zhao, Y., Xiong, Y., Wang, L., Wu, Z., Tang, X., Lin, D.: Temporal action detection with structured segment networks. In: ICCV (2017)
Acknowledgement
This work was supported by an Institute for Information & Communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) [2017-0-01780, 2017-0-01779] and by Microsoft Research Asia. We also thank Jonghwan Mun and Ilchae Jung for valuable discussions.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
You, T., Han, B. (2020). Traffic Accident Benchmark for Causality Recognition. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12352. Springer, Cham. https://doi.org/10.1007/978-3-030-58571-6_32
DOI: https://doi.org/10.1007/978-3-030-58571-6_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58570-9
Online ISBN: 978-3-030-58571-6
eBook Packages: Computer Science, Computer Science (R0)