Aggregation-Based Probing for Large-Scale Duplicate Image Detection

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 7808))

Abstract

Identifying visually duplicate images is a prerequisite for a broad range of tasks in image retrieval and mining, and has therefore attracted considerable research interest. Many efficient and precise algorithms have been proposed. However, compared with duplicate text detection, the recall of duplicate image detection is relatively low, which means that many duplicate images are left undetected. In this paper, we focus on improving recall while preserving high precision. We exploit the hash code representation of images and present a probing-based algorithm to increase recall. Unlike state-of-the-art probing methods in image search, the duplicate image detection task involves multiple probing sequences. To merge these sequences, we design an unsupervised score-based aggregation algorithm. Experimental results on a large-scale data set show that precision is preserved and recall is increased. Furthermore, our algorithm for aggregating multiple probing sequences is proved to be stable.
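
The abstract does not spell out the algorithm, so the following is only a minimal Python sketch of the general idea, not the authors' method. It assumes that each image carries several integer hash codes (one per hypothetical hash table), that probing a table means visiting buckets within a small number of bit flips of the image's code, and that a candidate found after `flips` bit flips receives the illustrative score `1 / (1 + flips)`. Each table yields one probing sequence, and the sequences are merged by summing scores per candidate. All function names, the scoring rule, and the summation rule are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_tables(codes_per_image):
    """Index images by hash code: one {code: [image_id, ...]} dict per table."""
    n_tables = len(next(iter(codes_per_image.values())))
    tables = [defaultdict(list) for _ in range(n_tables)]
    for image_id, codes in codes_per_image.items():
        for t, code in enumerate(codes):
            tables[t][code].append(image_id)
    return tables


def nearby_codes(code, n_bits, max_flips):
    """Yield (neighbor_code, flips) for every code within max_flips bit flips."""
    for flips in range(max_flips + 1):
        for positions in combinations(range(n_bits), flips):
            neighbor = code
            for p in positions:
                neighbor ^= 1 << p  # flip bit p
            yield neighbor, flips


def probing_sequence(table, code, n_bits, max_flips, query_id):
    """One probing sequence: candidate images with an assumed closeness score."""
    scores = {}
    for neighbor, flips in nearby_codes(code, n_bits, max_flips):
        for image_id in table.get(neighbor, ()):
            if image_id != query_id:
                scores[image_id] = 1.0 / (1 + flips)  # hypothetical score
    return scores


def rank_candidates(query_id, codes_per_image, n_bits, max_flips):
    """Merge the per-table probing sequences by summing scores per candidate."""
    tables = build_tables(codes_per_image)
    total = defaultdict(float)
    for table, code in zip(tables, codes_per_image[query_id]):
        seq = probing_sequence(table, code, n_bits, max_flips, query_id)
        for image_id, score in seq.items():
            total[image_id] += score
    # Highest aggregated score first; ties broken by image id.
    return sorted(total.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy data: three images, two 4-bit hash codes each (made-up values).
codes = {
    "a": [0b0000, 0b1111],
    "b": [0b0001, 0b1111],
    "c": [0b0111, 0b1110],
}
print(rank_candidates("a", codes, n_bits=4, max_flips=1))
# [('b', 1.5), ('c', 0.5)]
```

In this toy run, image "b" is reached in both tables (one flip away in the first, same bucket in the second), while "c" is reached in only one, so aggregation ranks "b" higher. A threshold on the aggregated score would then decide which candidates are reported as duplicates; where to set it is left open here.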

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Feng, Z., Chen, J., Wu, X., Yu, Y. (2013). Aggregation-Based Probing for Large-Scale Duplicate Image Detection. In: Ishikawa, Y., Li, J., Wang, W., Zhang, R., Zhang, W. (eds) Web Technologies and Applications. APWeb 2013. Lecture Notes in Computer Science, vol 7808. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37401-2_42

  • DOI: https://doi.org/10.1007/978-3-642-37401-2_42

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-37400-5

  • Online ISBN: 978-3-642-37401-2

  • eBook Packages: Computer Science, Computer Science (R0)
