
Building 3D event logs for video investigation

Multimedia Tools and Applications

Abstract

In scene investigation, recording a video log with a handheld camera is more convenient and more complete than taking photos and notes. By introducing video analysis and computer vision techniques, it is possible to build a spatio-temporal representation of the investigation. Such a representation gives a better overview than a set of photos and makes an investigation more accessible. We develop such methods and present an interface for navigating the result. The processing includes (i) segmenting a log into events using novel structure and motion features, which makes the log easier to access in the time dimension, and (ii) mapping video frames to a 3D model of the scene, so that the log can be navigated in space. Our results show that, using our proposed features, we can recognize more than 70 percent of all frames correctly and, more importantly, find all the events. From there, we provide a method to semi-interactively map those events to a 3D model of the scene. With this method we can map more than 80 percent of the events. The result is a 3D event log that captures the investigation and supports applications such as revisiting the scene, examining the investigation itself, or hypothesis testing.
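As a rough illustration of step (i), the sketch below shows one way per-frame class labels could be grouped into events. It is not the method of the paper: the label names (`overview`, `detail`), the majority-vote smoothing, and the helper names `smooth_labels` and `frames_to_events` are all assumptions made for this example, and the structure and motion features that would produce the per-frame labels are not modelled here.

```python
from collections import Counter
from itertools import groupby


def smooth_labels(labels, window=5):
    """Replace each frame label by the majority label in a centred window.

    Hypothetical post-processing step: it suppresses isolated
    misclassified frames before events are formed.
    """
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        neighbourhood = labels[max(0, i - half): i + half + 1]
        smoothed.append(Counter(neighbourhood).most_common(1)[0][0])
    return smoothed


def frames_to_events(labels):
    """Group consecutive frames with the same label into events.

    Returns a list of (label, first_frame, last_frame) tuples.
    """
    events = []
    start = 0
    for label, run in groupby(labels):
        length = len(list(run))
        events.append((label, start, start + length - 1))
        start += length
    return events


# Hypothetical per-frame classifier output with one isolated error at frame 4.
frame_labels = ["overview"] * 4 + ["detail"] + ["overview"] * 3 + ["detail"] * 6
events = frames_to_events(smooth_labels(frame_labels, window=3))
print(events)  # [('overview', 0, 7), ('detail', 8, 13)]
```

In this toy example a frame-level error does not create a spurious event, which is consistent with the observation above that finding all the events matters more than labelling every frame correctly.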


Notes

  1. http://opencv.willowgarage.com

  2. http://www.cs.waikato.ac.nz/ml/weka


Acknowledgments

We thank Jurrien Bijhold and the Netherlands Forensic Institute for providing the data and bringing in domain knowledge, and the police investigators for participating in the experiment. This work is supported by the Research Grant from Vietnam’s National Foundation for Science and Technology Development (NAFOSTED), No. 102.02-2011.13.

Author information

Corresponding author

Correspondence to Trung Kien Dang.


About this article

Cite this article

Dang, T.K., Worring, M. & Bui, T.D. Building 3D event logs for video investigation. Multimed Tools Appl 74, 4617–4639 (2015). https://doi.org/10.1007/s11042-013-1826-9


  • DOI: https://doi.org/10.1007/s11042-013-1826-9
