An Improved Hierarchical Dirichlet Process-Hidden Markov Model and Its Application to Trajectory Modeling and Retrieval

International Journal of Computer Vision

Abstract

In this paper, we propose a hierarchical Bayesian model, an improved hierarchical Dirichlet process-hidden Markov model (iHDP-HMM), for visual document analysis. The iHDP-HMM is capable of clustering visual documents and capturing the temporal correlations between the visual words within a visual document while identifying the number of document clusters and the number of visual topics adaptively. A Bayesian inference mechanism for the iHDP-HMM is developed to carry out likelihood evaluation, topic estimation, and cluster membership prediction. We apply the iHDP-HMM to simultaneously cluster motion trajectories and discover latent topics for trajectory words, based on the proposed method for constructing the trajectory word codebook. Then, an iHDP-HMM-based probabilistic trajectory retrieval framework is developed. The experimental results verify the clustering accuracy of the iHDP-HMM and trajectory retrieval accuracy of the proposed framework.
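
The abstract's claim that the numbers of clusters and topics are identified adaptively rests on Dirichlet process priors. The following is a minimal sketch of the standard truncated stick-breaking construction of Dirichlet process mixture weights, not the authors' iHDP-HMM or its inference procedure. The function names, the concentration parameter `alpha`, the `truncation` level, and the 95% mass threshold are illustrative assumptions that do not come from the paper.

```python
import random


def stick_breaking_weights(alpha, truncation, seed=0):
    """Draw truncated stick-breaking weights for a Dirichlet process prior.

    alpha and truncation are illustrative parameters (assumptions),
    not values taken from the paper.
    """
    rng = random.Random(seed)
    weights = []
    remaining = 1.0  # length of the stick not yet assigned
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, alpha)  # fraction broken off the remaining stick
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)  # leftover mass goes to the last component
    return weights


def effective_components(weights, mass=0.95):
    """Count how many of the largest weights are needed to cover `mass`."""
    total = 0.0
    for k, w in enumerate(sorted(weights, reverse=True), start=1):
        total += w
        if total >= mass:
            return k
    return len(weights)


if __name__ == "__main__":
    for alpha in (0.5, 2.0, 10.0):
        w = stick_breaking_weights(alpha, truncation=50, seed=1)
        print(alpha, effective_components(w))
```

Under this prior, a smaller `alpha` tends to concentrate the mass on a few components and a larger `alpha` spreads it over more, so the number of components in use is governed by the prior and the data instead of being fixed in advance.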

Author information

Corresponding author

Correspondence to Xi Li.

About this article

Cite this article

Hu, W., Tian, G., Li, X. et al. An Improved Hierarchical Dirichlet Process-Hidden Markov Model and Its Application to Trajectory Modeling and Retrieval. Int J Comput Vis 105, 246–268 (2013). https://doi.org/10.1007/s11263-013-0638-8

  • DOI: https://doi.org/10.1007/s11263-013-0638-8
