SurvSurf: human retrieval on large surveillance video data

Ding, Sihao; Li, Gang; Li, Ying; Li, Xinfeng; Zhai, Qiang; Champion, Adam C.; Zhu, Junda; Xuan, Dong; Zheng, Yuan F.

doi:10.1007/s11042-016-3307-4

Published: 13 February 2016

Volume 76, pages 6521–6549, (2017)

Multimedia Tools and Applications

Sihao Ding¹,
Gang Li²,
Ying Li¹,
Xinfeng Li²,
Qiang Zhai²,
Adam C. Champion²,
Junda Zhu³,
Dong Xuan² &
Yuan F. Zheng¹

Abstract

The volume of surveillance video is increasing rapidly, and humans are the major objects of interest in it. Rapid human retrieval in surveillance videos is therefore desirable and applicable to a broad spectrum of applications. Existing big data processing tools, which mainly target textual data, cannot be applied directly to process large video data in a timely manner because of three main challenges: videos are more data-intensive than textual data; visual operations have higher computational complexity than textual operations; and traditional segmentation may damage the continuous semantics of video data. In this paper, we design SurvSurf, a human retrieval system for large surveillance video data that exploits the characteristics of these data and big data processing tools. We propose using the motion information contained in videos for video data segmentation. The basic data unit after segmentation is called an M-clip. M-clips help remove redundant video content and reduce data volume. We use the MapReduce framework to process M-clips in parallel for human detection and appearance/motion feature extraction. We further accelerate vision algorithms by processing only sub-areas with significant motion vectors rather than entire frames. In addition, we design a distributed data store called V-BigTable to structure the semantic information of M-clips. V-BigTable enables efficient retrieval over a huge number of M-clips. We implement the system on Hadoop and HBase. Experimental results show that our system outperforms basic solutions by one order of magnitude in computational time while achieving satisfactory human retrieval accuracy.
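The abstract does not spell out how M-clips are cut, so the following is only an illustrative sketch of motion-based segmentation, not the authors' procedure. It assumes each frame has already been reduced to a single motion score (for example, the summed magnitude of its motion vectors) and groups consecutive high-motion frames into clips. The names `MClip`, `segment_by_motion`, `threshold`, and `max_gap` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class MClip:
    """Hypothetical stand-in for an M-clip: a run of frames with motion."""
    start: int  # index of the first frame in the clip (inclusive)
    end: int    # index of the last frame in the clip (inclusive)


def segment_by_motion(motion: Sequence[float],
                      threshold: float,
                      max_gap: int = 0) -> List[MClip]:
    """Group frames whose motion score is >= threshold into clips.

    Up to `max_gap` consecutive low-motion frames are tolerated inside a
    clip, so a brief pause does not split one event into two clips.
    Both parameters are assumptions made for this sketch.
    """
    clips: List[MClip] = []
    start = None        # first frame of the clip being built, if any
    last_active = None  # most recent frame at or above the threshold
    for i, score in enumerate(motion):
        if score >= threshold:
            if start is None:
                start = i
            last_active = i
        elif start is not None and i - last_active > max_gap:
            # The pause is longer than max_gap: close the current clip.
            clips.append(MClip(start, last_active))
            start = None
    if start is not None:
        clips.append(MClip(start, last_active))
    return clips


# Toy per-frame motion scores: two bursts of activity and a static tail.
scores = [0, 0, 5, 6, 0, 7, 0, 0, 0, 4]
clips = segment_by_motion(scores, threshold=1, max_gap=1)
kept = sum(c.end - c.start + 1 for c in clips)
print(clips)
print(f"kept {kept} of {len(scores)} frames")
```

Frames outside every clip are simply never passed on, which is the sense in which segmenting by motion can reduce data volume before any parallel processing starts.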

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, The Ohio State University, Columbus, OH, 43210, USA
Sihao Ding, Ying Li & Yuan F. Zheng
Department of Computer Science and Engineering, The Ohio State University, Columbus, OH, 43210, USA
Gang Li, Xinfeng Li, Qiang Zhai, Adam C. Champion & Dong Xuan
Department of Electrical and Computer Engineering, University of Macau, Macau, China
Junda Zhu

Corresponding author

Correspondence to Sihao Ding.

Additional information

Sihao Ding and Gang Li are co-primary authors.

Dr. Junda Zhu’s research was supported in part by The Macau Science and Technology Development Fund under Grant FDCT 023/2013/A1, and University of Macau Research Council under Multi Year Research Grant.

About this article

Cite this article

Ding, S., Li, G., Li, Y. et al. SurvSurf: human retrieval on large surveillance video data. Multimed Tools Appl 76, 6521–6549 (2017). https://doi.org/10.1007/s11042-016-3307-4

Received: 17 July 2015
Revised: 22 December 2015
Accepted: 26 January 2016
Published: 13 February 2016
Issue Date: March 2017
DOI: https://doi.org/10.1007/s11042-016-3307-4
