Summary
Detection and tracking algorithms generate useful information in the form of trajectories, from which the behaviors and interactions of moving objects can be inferred through the analysis of spatio-temporal features. Interactions occur either between a dynamic and a static object, or between multiple dynamic objects. This chapter presents an interaction modeling framework formulated as a state sequence estimation problem using time-series analysis. Bayesian network-based methods and their variants are studied for the analysis of interactions in videos. Moreover, techniques such as the Coupled Hidden Markov Model are discussed for more complex interactions, such as those between multiple dynamic objects. Finally, the interaction modeling is demonstrated on real surveillance and sports sequences.
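To make the phrase "state sequence estimation" concrete, the following is a minimal sketch of Viterbi decoding on a single-chain hidden Markov model. It is not the chapter's framework: the two states ("apart", "together"), the quantized inter-object distance observations ("far", "near"), and all probabilities are hypothetical values chosen only for illustration.

```python
import math

def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most likely hidden-state sequence for the observations."""
    if not observations:
        return []
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = []  # back[t - 1][s]: predecessor of s on that best path
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the most likely final state
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

def to_log(table):
    """Convert a (possibly nested) dict of probabilities to log-probabilities."""
    return {k: to_log(v) if isinstance(v, dict) else math.log(v)
            for k, v in table.items()}

# Hypothetical model: two tracked objects are either apart or together.
states = ["apart", "together"]
log_start = to_log({"apart": 0.8, "together": 0.2})
log_trans = to_log({"apart": {"apart": 0.7, "together": 0.3},
                    "together": {"apart": 0.2, "together": 0.8}})
log_emit = to_log({"apart": {"far": 0.9, "near": 0.1},
                   "together": {"far": 0.2, "near": 0.8}})

# Hypothetical quantized distances taken from two trajectories
observations = ["far", "far", "near", "near"]
path = viterbi(observations, states, log_start, log_trans, log_emit)
print(path)
```

Working in log-probabilities avoids numerical underflow on long trajectories. A coupled model of the kind mentioned above would condition each chain's transition on the other chain's previous state, which this single-chain sketch does not attempt.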
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Taj, M., Cavallaro, A. (2010). Recognizing Interactions in Video. In: Sencar, H.T., Velastin, S., Nikolaidis, N., Lian, S. (eds) Intelligent Multimedia Analysis for Security Applications. Studies in Computational Intelligence, vol 282. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11756-5_2
DOI: https://doi.org/10.1007/978-3-642-11756-5_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11754-1
Online ISBN: 978-3-642-11756-5
eBook Packages: Engineering, Engineering (R0)