GPU-based MapReduce for large-scale near-duplicate video retrieval

Wang, Hanli; Zhu, Fengkuangtian; Xiao, Bo; Wang, Lei; Jiang, Yu-Gang

doi:10.1007/s11042-014-2185-x

Published: 01 August 2014

Volume 74, pages 10515–10534, (2015)

Multimedia Tools and Applications

Hanli Wang¹,
Fengkuangtian Zhu¹,
Bo Xiao¹,
Lei Wang¹ &
Yu-Gang Jiang²

Abstract

With the exponential growth of multimedia data, people are overwhelmed by a massive number of online videos, of which Near-Duplicate Videos (NDVs) make up a large portion. In this paper, we present a novel framework for NDV retrieval that exploits the parallel power of two promising techniques: the Graphics Processing Unit (GPU) and MapReduce. Using the proposed framework, various key algorithms in the field of computer vision, such as K-Means clustering, bag of features, an inverted file index with Hamming embedding, and weak geometric consistency, are applied to NDV retrieval. Experimental results on the benchmark CC_WEB_VIDEO NDV dataset demonstrate that the proposed framework can significantly speed up the processing of large video repositories.
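
The abstract names K-Means clustering as one of the algorithms run on the framework, but this page does not show how it is mapped onto MapReduce. As a rough illustration only, one K-Means iteration can be phrased as a map step (assign each descriptor to its nearest centroid) followed by a reduce step (average the descriptors assigned to each centroid). The sketch below is plain CPU Python with hypothetical function names and toy 2-D data; it is an assumption about the general pattern, not the authors' GPU implementation.

```python
def nearest(point, centroids):
    """Return the index of the centroid closest to point (squared Euclidean)."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])),
    )


def map_phase(points, centroids):
    """Map: emit (cluster_id, (point, 1)) for every input descriptor."""
    for point in points:
        yield nearest(point, centroids), (point, 1)


def reduce_phase(pairs, centroids):
    """Reduce: sum the points of each cluster, then divide by the count."""
    sums = {}
    for k, (point, count) in pairs:
        if k in sums:
            acc, n = sums[k]
            sums[k] = ([a + p for a, p in zip(acc, point)], n + count)
        else:
            sums[k] = (list(point), count)
    updated = []
    for k, old in enumerate(centroids):
        if k in sums:
            acc, n = sums[k]
            updated.append([a / n for a in acc])
        else:
            # An empty cluster keeps its previous centroid.
            updated.append(list(old))
    return updated


def kmeans_iteration(points, centroids):
    """Run one map + reduce pass and return the updated centroids."""
    return reduce_phase(map_phase(points, centroids), centroids)


# Toy data (hypothetical): four 2-D points, two initial centroids.
points = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [12.0, 10.0]]
centroids = [[1.0, 1.0], [9.0, 9.0]]
updated = kmeans_iteration(points, centroids)
```

In a distributed setting the map step is the part that parallelizes naturally, since each descriptor is assigned independently; that independence is also what would make it a candidate for GPU execution.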

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grant 61102059, the “Shu Guang” project of Shanghai Municipal Education Commission and Shanghai Education Development Foundation under Grant 12SG23, the Program for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning, the Fundamental Research Funds for the Central Universities under Grants 0800219158, 0800219270, and 1700219104, and the National Basic Research Program (973 Program) of China under Grant 2010CB328101.

Author information

Authors and Affiliations

Department of Computer Science & Technology and Key Laboratory of Embedded System and Service Computing, Ministry of Education, Tongji University, Shanghai, 200092, People’s Republic of China
Hanli Wang, Fengkuangtian Zhu, Bo Xiao & Lei Wang
School of Computer Science, Fudan University, Shanghai, 201203, People’s Republic of China
Yu-Gang Jiang

Corresponding author

Correspondence to Hanli Wang.

About this article

Cite this article

Wang, H., Zhu, F., Xiao, B. et al. GPU-based MapReduce for large-scale near-duplicate video retrieval. Multimed Tools Appl 74, 10515–10534 (2015). https://doi.org/10.1007/s11042-014-2185-x

Received: 07 February 2014
Revised: 06 July 2014
Accepted: 07 July 2014
Published: 01 August 2014
Issue Date: December 2015
DOI: https://doi.org/10.1007/s11042-014-2185-x
