Abstract
With the advent of wearable computing, personal imaging, photojournalism, and personal video diaries, the need for automated archiving of the videos these devices capture has become pressing. The principal device used to record human-environment interaction is a wearable camera, usually head-mounted. The videos obtained from such a camera are raw, unedited records of the wearer's visual interaction with the surroundings. The focus of our research is to develop post-processing techniques that automatically abstract videos based on episode detection. An episode is defined as a part of the video captured when the user was interested in an external event and paid attention in order to record it. Our research is based on the assumption that head movements exhibit distinguishable patterns during an episode, and that these patterns can be exploited to differentiate episodes from non-episodes. We present a novel algorithm that exploits head and body behaviour to detect episodes. The algorithm's performance is measured by comparing the ground truth (user-declared episodes) with the detected episodes. Experiments on several hours of head-mounted video captured in varying locations show the high degree of success achieved by the proposed method.
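The evaluation described above — matching detected episodes against user-declared ground truth — can be illustrated with a small sketch. This is not the paper's exact procedure; the interval representation, the overlap-based matching, and the `min_overlap` threshold are assumptions made for illustration only.

```python
def overlap(a, b):
    """Length of the temporal overlap between two (start, end) intervals, in seconds."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

def evaluate(detected, ground_truth, min_overlap=1.0):
    """Compare detected episode intervals against ground-truth intervals.

    A detection counts as a true positive if it overlaps some ground-truth
    episode by at least `min_overlap` seconds (an illustrative threshold).
    Returns (true_positives, false_positives, false_negatives).
    """
    matched_gt = set()
    tp = fp = 0
    for d in detected:
        hits = [i for i, g in enumerate(ground_truth) if overlap(d, g) >= min_overlap]
        if hits:
            tp += 1
            matched_gt.update(hits)  # ground-truth episodes this detection covers
        else:
            fp += 1
    fn = len(ground_truth) - len(matched_gt)  # ground-truth episodes never matched
    return tp, fp, fn

# Toy example: times in seconds within the video.
gt = [(10, 25), (60, 90)]             # user-declared episodes
det = [(12, 24), (40, 45), (65, 80)]  # detected episodes
print(evaluate(det, gt))              # -> (2, 1, 0)
```

From such counts, precision and recall (or the confusion matrices reported in the appendices) follow directly.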
Appendices
Appendix 1. Confusion matrices for stationary and non-stationary classification
Appendix 2. Confusion matrices for head movement direction classification
Appendix 3. Confusion matrices for head movement direction classification (Note: "not applicable" is inserted because no direction value exists for non-episodes that were detected as non-episodes.)
Cite this article
Chauhan, A., Singh, S. & Grosvenor, D. Episode detection in videos captured using a head-mounted camera. Pattern Anal Applic 7, 176–189 (2004). https://doi.org/10.1007/s10044-004-0215-4