poster

A Markov logic framework for recognizing complex events from multimodal data

Authors:

Young Chol Song,

Ce Zhang

ICMI '13: Proceedings of the 15th ACM on International conference on multimodal interaction

Pages 141 - 148

https://doi.org/10.1145/2522848.2522883

Published: 09 December 2013

Abstract

We present a general framework for complex event recognition that is well-suited for integrating information that varies widely in detail and granularity. Consider the scenario of an agent in an instrumented space performing a complex task while describing what he is doing in a natural manner. The system takes in a variety of information, including objects and gestures recognized by RGB-D and descriptions of events extracted from recognized and parsed speech. The system outputs a complete reconstruction of the agent's plan, explaining actions in terms of more complex activities and filling in unobserved but necessary events. We show how to use Markov Logic (a probabilistic extension of first-order logic) to create a model in which observations can be partial, noisy, and refer to future or temporally ambiguous events; complex events are composed from simpler events in a manner that exposes their structure for inference and learning; and uncertainty is handled in a sound probabilistic manner. We demonstrate the effectiveness of the approach for tracking kitchen activities in the presence of noisy and incomplete observations.
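As a rough illustration of the Markov Logic idea mentioned in the abstract (and not the authors' actual model), the sketch below enumerates all possible worlds over three ground atoms and scores each world as exp(sum of the weights of the formulas it satisfies), which is the standard Markov logic network semantics. The atom names, formulas, and weights are hypothetical, chosen only to echo the kitchen scenario: a noisy observation of pouring water raises the probability of the complex event and of an unobserved but necessary step.

```python
import itertools
import math

# Hypothetical ground atoms for a toy kitchen scenario (not from the paper).
ATOMS = ["MakeTea", "BoilWater", "PourWater"]

# Weighted formulas: (weight, predicate over a world).
# A world is a dict mapping each atom name to True/False.
# All weights are illustrative assumptions.
FORMULAS = [
    (2.0, lambda w: (not w["MakeTea"]) or w["BoilWater"]),  # MakeTea => BoilWater
    (2.0, lambda w: (not w["MakeTea"]) or w["PourWater"]),  # MakeTea => PourWater
    (2.0, lambda w: (not w["PourWater"]) or w["MakeTea"]),  # PourWater => MakeTea (explanation)
    (1.5, lambda w: w["PourWater"]),                        # soft (noisy) observation of pouring
]


def score(world):
    """Unnormalized weight of a world: exp of the summed weights of satisfied formulas."""
    return math.exp(sum(weight for weight, holds in FORMULAS if holds(world)))


def marginals():
    """Exact marginal probability of each atom by brute-force enumeration of all worlds."""
    worlds = [
        dict(zip(ATOMS, values))
        for values in itertools.product([False, True], repeat=len(ATOMS))
    ]
    z = sum(score(w) for w in worlds)  # partition function
    return {
        atom: sum(score(w) for w in worlds if w[atom]) / z
        for atom in ATOMS
    }


if __name__ == "__main__":
    for atom, prob in marginals().items():
        print(f"P({atom}) = {prob:.3f}")
```

Brute-force enumeration is only feasible for a handful of atoms; it is used here to keep the semantics visible, whereas practical systems rely on approximate or database-backed inference.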

Index Terms

A Markov logic framework for recognizing complex events from multimodal data
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Image and video acquisition
        3D imaging
    2. Knowledge representation and reasoning
      1. Probabilistic reasoning
      2. Vagueness and fuzzy logic

Published In

ICMI '13: Proceedings of the 15th ACM on International conference on multimodal interaction

December 2013

630 pages

ISBN:9781450321297

DOI:10.1145/2522848

General Chairs:
Julien Epps (The University of New South Wales, Australia)
Fang Chen (National ICT Australia, Australia)
Sharon Oviatt (Incaa Designs, USA)
Kenji Mase (Nagoya University, Japan)

Program Chairs:
Andrew Sears (Rochester Institute of Technology, USA)
Kristiina Jokinen (University of Helsinki, Finland)
Björn Schuller (Technische Universität München, Germany)

Copyright © 2013 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGCHI: ACM Special Interest Group on Computer-Human Interaction

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Poster

Conference

ICMI '13

Sponsor:

SIGCHI

ICMI '13: 2013 International Conference on Multimodal Interaction

December 9 - 13, 2013

Sydney, Australia

Acceptance Rates

ICMI '13 Paper Acceptance Rate 49 of 133 submissions, 37%;

Overall Acceptance Rate 453 of 1,080 submissions, 42%

Bibliometrics

Total Citations: 16
Total Downloads: 237
Downloads (Last 12 months): 2
Downloads (Last 6 weeks): 0

Reflects downloads up to 01 Mar 2025

Citations

Cited By

Avola D, Cinque L, Del Bimbo A, Marini M (2020). MIFTel: a multimodal interactive framework based on temporal logic rules. Multimedia Tools and Applications. Online publication date: 31-Jan-2020.
https://doi.org/10.1007/s11042-019-08590-1
Xiao Z, Jiang J, Ming Z (2020). High Level Video Event Modeling, Recognition and Reasoning via Petri Net. Artificial Intelligence and Robotics, pp. 69-90. Online publication date: 11-Nov-2020.
https://doi.org/10.1007/978-3-030-56178-9_6
Xiao Z, Jiang J, Ming Z (2019). High-Level Video Event Modeling, Recognition, and Reasoning via Petri Net. IEEE Access 7, pp. 129376-129386. Online publication date: 2019.
https://doi.org/10.1109/ACCESS.2019.2936493
Hwang S, Kim H, Ravi S, Collins M, Tao Z, Singh V (2018). Tensorize, Factorize and Regularize: Robust Visual Relationship Learning. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1014-1023. Online publication date: Jun-2018.
https://doi.org/10.1109/CVPR.2018.00112
Rincé R, Kervarc R, Leray P (2018). Complex Event Processing Under Uncertainty Using Markov Chains, Constraints, and Sampling. Rules and Reasoning, pp. 147-163. Online publication date: 24-Aug-2018.
https://doi.org/10.1007/978-3-319-99906-7_10
Alevizos E, Skarlatidis A, Artikis A, Paliouras G (2017). Probabilistic Complex Event Recognition. ACM Computing Surveys 50:5, pp. 1-31. Online publication date: 26-Sep-2017.
https://dl.acm.org/doi/10.1145/3117809
Suchan J (2017). Declarative Reasoning about Space and Motion with Video. KI - Künstliche Intelligenz 31:4, pp. 321-330. Online publication date: 20-Aug-2017.
https://doi.org/10.1007/s13218-017-0504-x
Pitsikalis V, Katsamanis A, Theodorakis S, Maragos P (2017). Multimodal Gesture Recognition via Multiple Hypotheses Rescoring. Gesture Recognition, pp. 467-496. Online publication date: 20-Jul-2017.
https://doi.org/10.1007/978-3-319-57021-1_16
Raedt L, Kersting K, Natarajan S, Poole D (2016). Statistical Relational Artificial Intelligence: Logic, Probability, and Computation. Synthesis Lectures on Artificial Intelligence and Machine Learning 10:2, pp. 1-189. Online publication date: 24-Mar-2016.
https://doi.org/10.2200/S00692ED1V01Y201601AIM032
Prendinger H, Alvarez N, Sanchez-Ruiz A, Cavazza M, Catarino J, Oliveira J, Prada R, Fujimoto S, Shigematsu M (2016). Intelligent Biohazard Training Based on Real-Time Task Recognition. ACM Transactions on Interactive Intelligent Systems 6:3, pp. 1-32. Online publication date: 21-Sep-2016.
https://dl.acm.org/doi/10.1145/2883617
