Bayesian Tracking of Video Graphs Using Joint Kalman Smoothing and Registration

  • Conference paper
Computer Vision – ECCV 2022 (ECCV 2022)

Abstract

Graph-based representations are increasingly popular for analyzing video data, especially in object tracking and scene understanding. An essential tool in this approach is therefore statistical inference for the graphical time series associated with videos. This paper develops a Kalman-smoothing method for estimating graphs from noisy, cluttered, and incomplete data. The main challenge is to find and preserve the registration of nodes (salient detected objects) across time frames when false and missing nodes introduce noise and clutter. First, we introduce a quotient-space representation of graphs that incorporates temporal registration of nodes, and we use that metric structure to impose a dynamical model on graph evolution. Then, we derive a Kalman smoother, adapted to the quotient-space geometry, to estimate dense, smooth trajectories of graphs. We demonstrate this framework using simulated data and actual video graphs extracted from the Multiview Extended Video with Activities (MEVA) dataset. The framework successfully estimates graphs despite noise, clutter, and missed detections.
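
The two ingredients named in the abstract, node registration and Kalman smoothing, can be illustrated with a toy sketch. This is not the authors' implementation; it rests on several assumptions made only for illustration: graphs are small dense adjacency matrices with a fixed node count, registration is a brute-force search over node permutations (a stand-in for whatever graph-matching method the paper uses), each matrix entry follows an independent scalar random-walk model, and the noise variances `q` and `r` are hypothetical values. The function names are likewise hypothetical.

```python
from itertools import permutations
import math


def permute(adj, perm):
    """Relabel nodes so that new node i is old node perm[i]."""
    n = len(adj)
    return [[adj[perm[i]][perm[j]] for j in range(n)] for i in range(n)]


def register(ref, adj):
    """Return the relabeling of `adj` closest to `ref` in Frobenius norm.

    Brute force over all permutations: only feasible for tiny graphs,
    and used here as a stand-in for a real graph-matching solver.
    """
    n = len(ref)
    best, best_d = None, math.inf
    for perm in permutations(range(n)):
        cand = permute(adj, perm)
        d = math.sqrt(sum((ref[i][j] - cand[i][j]) ** 2
                          for i in range(n) for j in range(n)))
        if d < best_d:
            best, best_d = cand, d
    return best, best_d


def rts_smooth(obs, q=0.01, r=0.1, x0=0.0, p0=1.0):
    """Scalar random-walk Kalman filter plus Rauch-Tung-Striebel smoother.

    `obs` is a list of floats; None marks a missing observation.
    q and r are assumed process and measurement noise variances.
    """
    xf, pf, xp, pp = [], [], [], []
    x, p = x0, p0
    for z in obs:
        x_pred, p_pred = x, p + q            # predict (state transition = 1)
        if z is None:                        # no measurement: keep prediction
            x, p = x_pred, p_pred
        else:                                # measurement update
            k = p_pred / (p_pred + r)
            x = x_pred + k * (z - x_pred)
            p = (1.0 - k) * p_pred
        xp.append(x_pred)
        pp.append(p_pred)
        xf.append(x)
        pf.append(p)
    xs = xf[:]
    for t in range(len(obs) - 2, -1, -1):    # backward smoothing pass
        g = pf[t] / pp[t + 1]
        xs[t] = xf[t] + g * (xs[t + 1] - xp[t + 1])
    return xs


def smooth_graph_sequence(graphs, q=0.01, r=0.1):
    """Register each graph to its (registered) predecessor, then smooth
    every adjacency entry independently over time."""
    registered = [graphs[0]]
    for g in graphs[1:]:
        aligned, _ = register(registered[-1], g)
        registered.append(aligned)
    n, steps = len(graphs[0]), len(graphs)
    out = [[[0.0] * n for _ in range(n)] for _ in range(steps)]
    for i in range(n):
        for j in range(n):
            series = [registered[t][i][j] for t in range(steps)]
            smoothed = rts_smooth(series, q=q, r=r, x0=series[0])
            for t in range(steps):
                out[t][i][j] = smoothed[t]
    return out
```

Calling `smooth_graph_sequence` on a short list of 3-node adjacency matrices whose node labels are shuffled from frame to frame returns one smoothed matrix per frame, all expressed in the node ordering of the first frame. Registering before smoothing matters because entry-wise filtering is only meaningful once the same matrix entry refers to the same pair of objects in every frame.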

Acknowledgements

This research was supported in part by the US National Science Foundation grants 1955154, IIS 2143150, IIS 1955230, CNS 1513126, and IIS 1956050.

Author information

Corresponding author

Correspondence to Aditi Basu Bal.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (zip 17466 KB)

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Bal, A.B., Mounir, R., Aakur, S., Sarkar, S., Srivastava, A. (2022). Bayesian Tracking of Video Graphs Using Joint Kalman Smoothing and Registration. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13695. Springer, Cham. https://doi.org/10.1007/978-3-031-19833-5_26

  • DOI: https://doi.org/10.1007/978-3-031-19833-5_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-19832-8

  • Online ISBN: 978-3-031-19833-5

  • eBook Packages: Computer Science (R0)
