Context-Aware Activity Forecasting

  • Conference paper
Computer Vision -- ACCV 2014 (ACCV 2014)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9007)

Abstract

In this paper, we investigate the problem of forecasting future activities in continuous videos. The ability to forecast activities that have not yet been observed is an important video understanding problem that is starting to receive attention in the computer vision literature. We propose an activity forecasting strategy that models the simultaneous and/or sequential nature of human activities on a graph and combines it with the interrelationship between static scene cues and dynamic target trajectories, together termed the ‘activity and scene context’. The forecasting problem is then posed as an inference problem on an MRF model defined on the graph. We perform experiments on the challenging, publicly available VIRAT Ground dataset and obtain high forecasting accuracy for most of the activities.
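
As a rough illustration of what posing forecasting as inference on an MRF can look like, the following toy sketch finds the most probable joint labeling of a small activity graph by exhaustive search. It is not the authors' model: the abstract does not specify the potentials or the inference algorithm, so the label set, the numeric potentials, and the function name `map_forecast` are all hypothetical. Unary potentials stand in for scene and trajectory cues, and pairwise potentials stand in for how compatible two linked activities are.

```python
import itertools
import numpy as np

# Hypothetical label set, chosen only for illustration.
LABELS = ["walking", "opening_trunk", "loading"]


def map_forecast(unary, pairwise, edges, observed):
    """Brute-force MAP labeling of a small pairwise MRF.

    unary[i, k]    : potential of node i taking label k (assumed scene/trajectory cue)
    pairwise[a, b] : potential of an edge whose first node has label a and second has b
    edges          : list of ordered node pairs (i, j)
    observed       : dict node -> label index for activities already seen
    """
    n, num_labels = unary.shape
    free = [i for i in range(n) if i not in observed]
    best_score, best = -np.inf, None
    # Enumerate every labeling of the unobserved (future) nodes.
    for combo in itertools.product(range(num_labels), repeat=len(free)):
        labels = dict(observed)
        labels.update(zip(free, combo))
        # Log of the unnormalized joint probability.
        score = sum(np.log(unary[i, labels[i]]) for i in range(n))
        score += sum(np.log(pairwise[labels[i], labels[j]]) for i, j in edges)
        if score > best_score:
            best_score, best = score, [labels[i] for i in range(n)]
    return best


# Toy chain: node 0 is observed, nodes 1 and 2 are to be forecast.
unary = np.array([[0.8, 0.1, 0.1],
                  [0.3, 0.4, 0.3],
                  [0.2, 0.3, 0.5]])
pairwise = np.array([[0.5, 0.4, 0.1],
                     [0.1, 0.3, 0.6],
                     [0.3, 0.2, 0.5]])
edges = [(0, 1), (1, 2)]

forecast = map_forecast(unary, pairwise, edges, observed={0: 0})
print([LABELS[k] for k in forecast])  # ['walking', 'opening_trunk', 'loading']
```

Exhaustive search is only feasible for a handful of nodes; on a larger graph one would typically substitute a message-passing method such as belief propagation.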

Acknowledgement

This work is partially supported by the National Science Foundation grant IIS-1316934.

Author information

Corresponding author

Correspondence to Amit K. Roy-Chowdhury.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material (pdf 601 KB)

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Chakraborty, A., Roy-Chowdhury, A.K. (2015). Context-Aware Activity Forecasting. In: Cremers, D., Reid, I., Saito, H., Yang, M.-H. (eds) Computer Vision -- ACCV 2014. ACCV 2014. Lecture Notes in Computer Science, vol 9007. Springer, Cham. https://doi.org/10.1007/978-3-319-16814-2_2

  • DOI: https://doi.org/10.1007/978-3-319-16814-2_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-16813-5

  • Online ISBN: 978-3-319-16814-2

  • eBook Packages: Computer Science, Computer Science (R0)
