Motion sketch based crowd video retrieval

Multimedia Tools and Applications

Abstract

Crowd video retrieval with desired motion flow segmentation is an important problem in surveillance video management (e.g., video indexing and browsing), especially in the age of big data. In this paper, we address this problem at the motion level by using hand-drawn sketches as queries. Motion sketch based crowd video retrieval faces two inherent challenges: crowd motion representation and similarity measurement. To tackle them, we propose to (1) leverage the motion structure coding algorithm for motion-level video indexing and hand-drawn sketch representation, and (2) exploit a distance metric fusion strategy combined with Ranking SVM to measure the degree of relevance between a sketch query and crowd videos. Specifically, for video indexing, motion decomposition is used to separate sub-motion vector fields with typical patterns from a set of optical flows. The motion-level descriptors of these vector fields are then computed and stored in an index database. To represent motion sketches, we propose a mechanism that vectorizes the sketches and then applies motion structure coding. In the retrieval stage, we first compute pairwise distances under different metrics between a new sketch query and the crowd videos, and then stack them into a feature vector that serves as the input to the Ranking SVM algorithm. Finally, we use the learned retrieval model to predict the ranking score of each crowd video in the database. Experimental results on publicly available crowd datasets show the robustness and effectiveness of the proposed sketch based crowd video retrieval system.
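The retrieval stage described in the abstract (stacking several distances into a feature vector and scoring it with a ranking model) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each video and each sketch is already reduced to one fixed-length descriptor, picks three distances (L2, L1, cosine) purely for illustration, and trains a simple linear pairwise Ranking SVM by subgradient descent. The function names `fused_distance_features`, `train_ranking_svm`, and `rank_videos` are hypothetical.

```python
import numpy as np

def fused_distance_features(query, videos):
    """Stack several distances between one query descriptor and each video
    descriptor. The three metrics here are illustrative assumptions."""
    diff = videos - query
    l2 = np.linalg.norm(diff, axis=1)
    l1 = np.abs(diff).sum(axis=1)
    norms = np.linalg.norm(videos, axis=1) * np.linalg.norm(query) + 1e-12
    cos = 1.0 - (videos @ query) / norms
    return np.stack([l2, l1, cos], axis=1)  # shape: (n_videos, 3)

def train_ranking_svm(features, relevance, epochs=200, lr=0.01, reg=1e-3):
    """Linear pairwise Ranking SVM: learn w so that w.x_i > w.x_j + 1
    whenever video i is labeled more relevant than video j."""
    i_idx, j_idx = np.where(relevance[:, None] > relevance[None, :])
    if len(i_idx) == 0:
        raise ValueError("need at least one pair with different relevance")
    pair_diffs = features[i_idx] - features[j_idx]
    w = np.zeros(features.shape[1])
    for _ in range(epochs):
        violated = (pair_diffs @ w) < 1.0  # pairs inside the margin
        grad = reg * w - pair_diffs[violated].sum(axis=0) / len(pair_diffs)
        w -= lr * grad
    return w

def rank_videos(w, features):
    """Return video indices sorted from highest to lowest ranking score."""
    return np.argsort(-(features @ w))

# Toy data: descriptors are made-up placeholders, not real motion codes.
query = np.array([1.0, 0.0, 0.0, 0.0])
videos = np.array([
    [0.9, 0.1, 0.0, 0.0],   # similar motion, labeled relevant
    [1.0, 0.05, 0.0, 0.0],  # similar motion, labeled relevant
    [0.0, 1.0, 0.0, 0.0],   # different motion
    [0.0, 0.0, 1.0, 0.0],   # different motion
    [-1.0, 0.0, 0.0, 0.0],  # opposite motion
])
relevance = np.array([1, 1, 0, 0, 0])

features = fused_distance_features(query, videos)
w = train_ranking_svm(features, relevance)
print(rank_videos(w, features))
```

Because smaller distances should yield higher scores, the learned weights come out negative in this toy setup. In practice the training pairs would be collected over many sketch queries rather than a single one.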

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (NSFC, Grant Nos. 61671289, 61171172, 61102099, 61571261 and 61521062) and the Science and Technology Commission of Shanghai Municipality (STCSM, Grant Nos. 15DZ1207403 and 12DZ2272600).

Author information

Corresponding author

Correspondence to Hua Yang.

About this article

Cite this article

Wu, S., Yang, H., Zheng, S. et al. Motion sketch based crowd video retrieval. Multimed Tools Appl 76, 20167–20195 (2017). https://doi.org/10.1007/s11042-017-4568-2

  • DOI: https://doi.org/10.1007/s11042-017-4568-2
