Episode detection in videos captured using a head-mounted camera

  • Theoretical Advances
Pattern Analysis and Applications

Abstract

With the advent of wearable computing, personal imaging, photojournalism and personal video diaries, the need to archive the resulting videos automatically has become pressing. The principal device used to capture human-environment interaction in these applications is a wearable camera, usually head-mounted. The videos obtained from such a camera are raw, unedited records of the visual interaction of the wearer (the user of the camera) with the surroundings. The focus of our research is to develop post-processing techniques that automatically abstract such videos based on episode detection. An episode is defined as a part of the video captured while the user was interested in an external event and paid attention in order to record it. Our research is based on the assumption that head movements follow distinguishable patterns during an episode, and that these patterns can be exploited to differentiate an episode from a non-episode. Here we present a novel algorithm that exploits head and body behaviour to detect episodes. The algorithm's performance is measured by comparing the detected episodes with the ground truth (user-declared episodes). In experiments on several hours of head-mounted video captured in varying locations, the proposed method achieved a high degree of success.
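The abstract states that detected episodes are compared with user-declared episodes, but this preview does not show how the comparison is scored. The sketch below illustrates one common possibility, a frame-level comparison, and should not be read as the authors' procedure. The interval representation (start frame inclusive, end frame exclusive) and the function names `to_mask`, `frame_confusion` and `precision_recall` are assumptions made for this illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: (start_frame, end_frame), end exclusive.
Interval = Tuple[int, int]


def to_mask(intervals: List[Interval], n_frames: int) -> List[bool]:
    """Mark each frame as inside (True) or outside (False) an episode."""
    mask = [False] * n_frames
    for start, end in intervals:
        # Clip each interval to the valid frame range before marking.
        for i in range(max(start, 0), min(end, n_frames)):
            mask[i] = True
    return mask


def frame_confusion(truth: List[Interval],
                    detected: List[Interval],
                    n_frames: int) -> Dict[str, int]:
    """Count frame-level agreement between ground truth and detections."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for t, d in zip(to_mask(truth, n_frames), to_mask(detected, n_frames)):
        if t and d:
            counts["tp"] += 1   # episode frame, detected as episode
        elif d:
            counts["fp"] += 1   # non-episode frame, detected as episode
        elif t:
            counts["fn"] += 1   # episode frame that was missed
        else:
            counts["tn"] += 1   # non-episode frame, correctly ignored
    return counts


def precision_recall(counts: Dict[str, int]) -> Tuple[float, float]:
    """Precision and recall over episode frames; 0.0 when undefined."""
    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

For example, with user-declared episodes `[(10, 20), (40, 50)]` and detections `[(12, 22), (60, 65)]` in a 100-frame clip, `frame_confusion` counts 8 true-positive frames, 7 false positives and 12 missed frames. A per-episode overlap criterion would be an equally plausible choice and would score the same detections differently.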

Author information

Corresponding author

Correspondence to Sameer Singh.

Appendices

Appendix 1. Confusion matrices for stationary and non-stationary classification

Appendix 2. Confusion matrices for head movement direction classification

Appendix 3. Confusion matrices for head movement direction classification (Note: “not applicable” has been inserted because we cannot have the value for non-episodes that were detected as non-episodes.)

About this article

Cite this article

Chauhan, A., Singh, S. & Grosvenor, D. Episode detection in videos captured using a head-mounted camera. Pattern Anal Applic 7, 176–189 (2004). https://doi.org/10.1007/s10044-004-0215-4

  • DOI: https://doi.org/10.1007/s10044-004-0215-4
