Abstract
Collective behaviors characterize the intrinsic dynamics of crowds. Automatically understanding collective crowd behaviors has important applications in video surveillance, traffic management, and crowd control, and it is closely related to scientific fields such as statistical physics and biology. In this paper, a new mixture model of dynamic pedestrian-agents (MDA) is proposed to learn the collective behavior patterns of pedestrians in crowded scenes from video sequences. Following agent-based modeling, each pedestrian in the crowd is driven by a dynamic pedestrian-agent, which is a linear dynamic system with initial and termination states reflecting the pedestrian’s belief about the starting point and the destination. The whole crowd is then modeled as a mixture of dynamic pedestrian-agents. Once the model parameters are learned from trajectories extracted from videos, MDA can simulate crowd behaviors. Given partially observed trajectories, it can also infer the past behaviors and predict the future behaviors of pedestrians, and classify different pedestrian behaviors. The effectiveness of MDA and its applications is demonstrated by qualitative and quantitative experiments on various video surveillance sequences.
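As an illustration of the generative view described in the abstract, the following Python sketch simulates trajectories from a mixture of agents. It is not the MDA implementation: the affine update x_{t+1} = A x_t + b + noise, the isotropic Gaussian start, the circular termination region, and all names and values (`simulate_agent`, `simulate_crowd`, `dest_radius`, the example agents) are assumptions chosen for illustration. Learning the parameters from video trajectories is not shown.

```python
import random


def simulate_agent(A, b, start_mean, start_std, dest,
                   noise_std=0.0, dest_radius=1.0, max_steps=200, rng=random):
    """Roll out one 2-D trajectory from x_{t+1} = A x_t + b + noise.

    The start is drawn from an isotropic Gaussian around start_mean
    (assumed form of the "initial state"); the rollout stops once the
    position is within dest_radius of dest (assumed "termination state").
    """
    x = rng.gauss(start_mean[0], start_std)
    y = rng.gauss(start_mean[1], start_std)
    path = [(x, y)]
    for _ in range(max_steps):
        nx = A[0][0] * x + A[0][1] * y + b[0] + rng.gauss(0.0, noise_std)
        ny = A[1][0] * x + A[1][1] * y + b[1] + rng.gauss(0.0, noise_std)
        x, y = nx, ny
        path.append((x, y))
        if (x - dest[0]) ** 2 + (y - dest[1]) ** 2 <= dest_radius ** 2:
            break
    return path


def simulate_crowd(agents, weights, n, rng=random):
    """Sample n trajectories: pick an agent by mixture weight, then roll it out."""
    crowd = []
    for _ in range(n):
        k = rng.choices(range(len(agents)), weights=weights)[0]
        crowd.append((k, simulate_agent(rng=rng, **agents[k])))
    return crowd


# Two hypothetical agents that each contract toward their own destination:
# x_{t+1} = x_t + 0.1 * (dest - x_t), i.e. A = 0.9 * I and b = 0.1 * dest.
agents = [
    dict(A=[[0.9, 0.0], [0.0, 0.9]], b=(1.0, 0.0),
         start_mean=(0.0, 0.0), start_std=0.5, dest=(10.0, 0.0), noise_std=0.05),
    dict(A=[[0.9, 0.0], [0.0, 0.9]], b=(0.0, 1.0),
         start_mean=(10.0, 0.0), start_std=0.5, dest=(0.0, 10.0), noise_std=0.05),
]
crowd = simulate_crowd(agents, weights=[0.7, 0.3], n=5, rng=random.Random(0))
```

Each element of `crowd` pairs the index of the sampled agent with its trajectory, so the index can serve as a behavior label for the simulated pedestrian.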















Notes
Datasets, demo videos and related materials are available from http://mmlab.ie.cuhk.edu.hk/project/dynamicagent/
Acknowledgments
This work is partially supported by the General Research Fund sponsored by the Research Grants Council of Hong Kong (Project No. CUHK417110, CUHK417011, and CUHK 429412).
Additional information
Communicated by M. Hebert.
About this article
Cite this article
Zhou, B., Tang, X. & Wang, X. Learning Collective Crowd Behaviors with Dynamic Pedestrian-Agents. Int J Comput Vis 111, 50–68 (2015). https://doi.org/10.1007/s11263-014-0735-3
DOI: https://doi.org/10.1007/s11263-014-0735-3