DOI: 10.1145/1452392.1452448
poster

Automated sip detection in naturally-evoked video

Published: 20 October 2008

Abstract

Quantifying consumer experiences is an emerging application area for event detection in video. This paper presents a hierarchical model for robust sip detection that combines bottom-up processing of face videos, namely real-time head action unit analysis and head gesture recognition, with top-down knowledge about sip events and task semantics. Our algorithm achieves an average accuracy of 82% in videos that feature single sips, and an average accuracy of 78% with a false positive rate of 0.3% in more challenging videos that feature multiple sips and chewing actions. We discuss how our methodology generalizes to detecting other events in similar contexts.
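
The abstract does not spell out the model, so the following is only an illustrative sketch of the general idea of pairing a bottom-up stage with a top-down rule, not the authors' algorithm. It assumes a hypothetical bottom-up stage has already labelled each frame with a head gesture, and it assumes a hypothetical top-down rule that a sip is a backward head tilt followed by a forward tilt within a bounded number of frames. The label names, the function names `runs` and `detect_sips`, and the thresholds are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical per-frame labels emitted by an assumed bottom-up gesture stage.
TILT_BACK = "tilt_back"
TILT_FORWARD = "tilt_forward"
NEUTRAL = "neutral"


def runs(labels: List[str]) -> List[Tuple[str, int, int]]:
    """Collapse per-frame labels into (label, start, end) runs; end is exclusive."""
    out = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            out.append((labels[start], start, i))
            start = i
    return out


def detect_sips(labels: List[str],
                min_frames: int = 4,
                max_frames: int = 90) -> List[Tuple[int, int]]:
    """Apply an assumed top-down rule: tilt back, then tilt forward, within a time bound.

    Returns (start_frame, end_frame) spans, end exclusive. The thresholds are
    illustrative defaults, not values from the paper.
    """
    # Ignore neutral frames so a short pause between the two tilts is allowed.
    segments = [r for r in runs(labels) if r[0] != NEUTRAL]
    sips = []
    i = 0
    while i + 1 < len(segments):
        (g1, s1, _), (g2, _, e2) = segments[i], segments[i + 1]
        if (g1 == TILT_BACK and g2 == TILT_FORWARD
                and min_frames <= e2 - s1 <= max_frames):
            sips.append((s1, e2))
            i += 2  # both runs are consumed by this sip
        else:
            i += 1
    return sips


if __name__ == "__main__":
    frames = ([NEUTRAL] * 5 + [TILT_BACK] * 3 + [NEUTRAL] * 2
              + [TILT_FORWARD] * 3 + [NEUTRAL] * 5)
    print(detect_sips(frames))
```

In this sketch the duration bound plays the role of top-down knowledge: a forward tilt with no preceding backward tilt, or a pair of tilts separated by a long pause, is rejected even though the bottom-up labels are individually valid.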



      Published In

      ICMI '08: Proceedings of the 10th international conference on Multimodal interfaces
      October 2008
      322 pages
      ISBN:9781605581989
      DOI:10.1145/1452392
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. affective computing
      2. event detection
      3. head gesture recognition
      4. human activity recognition
      5. spontaneous video

      Qualifiers

      • Poster

      Conference

      ICMI '08
      Sponsor:
      ICMI '08: INTERNATIONAL CONFERENCE ON MULTIMODAL INTERFACES
      October 20 - 22, 2008
      Crete, Chania, Greece

      Acceptance Rates

      Overall Acceptance Rate 453 of 1,080 submissions, 42%
