Abstract
In this paper, we investigate the problem of forecasting future activities in continuous videos. The ability to forecast activities that have not yet been observed is an important video understanding problem that is starting to receive attention in the computer vision literature. We propose an activity forecasting strategy that models the simultaneous and/or sequential nature of human activities on a graph and combines it with the interrelationship between static scene cues and dynamic target trajectories, together termed the ‘activity and scene context’. Forecasting is then posed as an inference problem on a Markov random field (MRF) defined on this graph. We perform experiments on the challenging, publicly available VIRAT Ground dataset and obtain high forecasting accuracy for most of the activities.
Acknowledgement
This work is partially supported by the National Science Foundation grant IIS-1316934.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Chakraborty, A., Roy-Chowdhury, A.K. (2015). Context-Aware Activity Forecasting. In: Cremers, D., Reid, I., Saito, H., Yang, M.-H. (eds) Computer Vision – ACCV 2014. Lecture Notes in Computer Science, vol 9007. Springer, Cham. https://doi.org/10.1007/978-3-319-16814-2_2
DOI: https://doi.org/10.1007/978-3-319-16814-2_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16813-5
Online ISBN: 978-3-319-16814-2
eBook Packages: Computer Science, Computer Science (R0)