Building 3D event logs for video investigation

Dang, Trung Kien; Worring, Marcel; Bui, The Duy

doi:10.1007/s11042-013-1826-9

Published: 11 January 2014

Multimedia Tools and Applications, Volume 74, pages 4617–4639 (2015)

Trung Kien Dang¹,², Marcel Worring¹ & The Duy Bui²

Abstract

In scene investigation, recording a video log with a handheld camera is more convenient and more complete than taking photos and notes. By applying video analysis and computer vision techniques, it is possible to build a spatio-temporal representation of the investigation. Such a representation gives a better overview than a set of photos and makes an investigation more accessible. We develop such methods and present an interface for navigating the result. The processing includes (i) segmenting a log into events using novel structure and motion features, which makes the log easier to access in the time dimension, and (ii) mapping video frames to a 3D model of the scene so that the log can be navigated in space. Our results show that, using the proposed features, we can recognize more than 70 percent of all frames correctly and, more importantly, find all the events. We then provide a method to semi-interactively map those events to a 3D model of the scene, with which we can map more than 80 percent of the events. The result is a 3D event log that captures the investigation and supports applications such as revisiting the scene, examining the investigation itself, or hypothesis testing.
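The abstract reports two different quantities: the fraction of frames recognized correctly and whether all events are found. The following is a minimal sketch of how the two can diverge, not the authors' implementation. It assumes per-frame class labels are already available, uses hypothetical label names (`"search"`, `"detail"`), and assumes an event counts as found when at least one frame inside it receives the correct label; the paper may define this differently.

```python
from itertools import groupby


def frames_to_events(labels):
    """Collapse per-frame labels into (label, start, end) runs; end is exclusive."""
    events = []
    start = 0
    for label, run in groupby(labels):
        length = sum(1 for _ in run)
        events.append((label, start, start + length))
        start += length
    return events


def frame_accuracy(predicted, truth):
    """Fraction of frames whose predicted label equals the ground-truth label."""
    if len(predicted) != len(truth):
        raise ValueError("label sequences must have the same length")
    if not truth:
        return 0.0
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


def event_recall(predicted, truth):
    """Fraction of ground-truth events with at least one correctly labeled frame.

    This criterion for 'finding' an event is an assumption made for illustration.
    """
    true_events = frames_to_events(truth)
    if not true_events:
        return 0.0
    found = sum(
        any(predicted[i] == label for i in range(start, end))
        for label, start, end in true_events
    )
    return found / len(true_events)


# Hypothetical ten-frame log with three ground-truth events.
truth = ["search"] * 4 + ["detail"] * 3 + ["search"] * 3
predicted = ["search", "search", "detail", "search", "detail",
             "detail", "search", "search", "search", "search"]

print(frames_to_events(truth))
print(frame_accuracy(predicted, truth))
print(event_recall(predicted, truth))
```

In this toy log two of the ten frames are mislabeled, yet every ground-truth event still contains a correctly labeled frame, which is the sense in which imperfect frame accuracy can coexist with finding all events.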

Acknowledgments

We thank Jurrien Bijhold and the Netherlands Forensic Institute for providing the data and bringing in domain knowledge, and the police investigators for participating in the experiment. This work is supported by the Research Grant from Vietnam’s National Foundation for Science and Technology Development (NAFOSTED), No. 102.02-2011.13.

Author information

Authors and Affiliations

University of Amsterdam, Amsterdam, The Netherlands
Trung Kien Dang & Marcel Worring
University of Engineering and Technology, Vietnam National University Hanoi, Hanoi, Vietnam
Trung Kien Dang & The Duy Bui

Corresponding author

Correspondence to Trung Kien Dang.

About this article

Cite this article

Dang, T.K., Worring, M. & Bui, T.D. Building 3D event logs for video investigation. Multimed Tools Appl 74, 4617–4639 (2015). https://doi.org/10.1007/s11042-013-1826-9

Published: 11 January 2014
Issue Date: July 2015
DOI: https://doi.org/10.1007/s11042-013-1826-9
