Distributed Image Retrieval Base on LSH Indexing on Spark

  • Conference paper
Big Data and Security (ICBDS 2019)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1210)

Abstract

With the advent of the big data era, processing massive volumes of image, video, and other multimedia data quickly and accurately has become a new challenge in related fields. To address the computational bottleneck and inefficiency of traditional content-based image retrieval systems, this paper proposes a distributed image retrieval framework based on Locality-Sensitive Hashing (LSH) indexing on Spark, which exploits the distributed storage and computing capabilities of the Spark big data platform. A distributed K-means-based Bag of Visual Words (BoVW) algorithm and an LSH algorithm are proposed to build LSH index vectors on Spark in parallel. Experiments show that the framework significantly reduces retrieval time, while retrieval recall and precision remain high compared with the traditional retrieval method.
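
The indexing idea in the abstract can be illustrated on a single machine. The following is a minimal sketch, not the paper's implementation: it assumes a visual vocabulary (K-means centroids) is already available, builds a BoVW histogram by nearest-centroid assignment, and hashes it with a p-stable (Gaussian projection) LSH function of the form floor((a·v + b) / w). The names `nearest_word`, `bovw_histogram`, and `PStableLSH`, and the parameters `num_hashes` and `bucket_width`, are hypothetical and chosen only for illustration.

```python
import math
import random


def nearest_word(descriptor, vocabulary):
    """Return the index of the centroid closest to a local descriptor."""
    return min(
        range(len(vocabulary)),
        key=lambda i: sum((d - c) ** 2 for d, c in zip(descriptor, vocabulary[i])),
    )


def bovw_histogram(descriptors, vocabulary):
    """Build an L1-normalized histogram of visual-word counts for one image."""
    counts = [0.0] * len(vocabulary)
    for d in descriptors:
        counts[nearest_word(d, vocabulary)] += 1.0
    total = sum(counts) or 1.0  # avoid division by zero for empty images
    return [c / total for c in counts]


class PStableLSH:
    """Illustrative p-stable LSH: h(v) = floor((a . v + b) / w)."""

    def __init__(self, dim, num_hashes, bucket_width, seed=0):
        rng = random.Random(seed)
        self.w = bucket_width
        # Gaussian projections (2-stable), one vector per hash function.
        self.a = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_hashes)]
        # Random offsets drawn uniformly from [0, w].
        self.b = [rng.uniform(0.0, bucket_width) for _ in range(num_hashes)]

    def signature(self, vector):
        """Return the tuple of bucket ids used as the LSH index key."""
        return tuple(
            math.floor((sum(x * y for x, y in zip(a, vector)) + b) / self.w)
            for a, b in zip(self.a, self.b)
        )


# Toy example: a 2-word vocabulary and three 2-D descriptors (assumed data).
vocabulary = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 1.0], [9.0, 9.0], [0.0, 2.0]]
histogram = bovw_histogram(descriptors, vocabulary)
lsh = PStableLSH(dim=len(vocabulary), num_hashes=4, bucket_width=0.5, seed=42)
index_key = lsh.signature(histogram)
```

In a distributed setting, the same per-image functions would presumably be applied in parallel over partitions of the image collection, with images that share an `index_key` treated as retrieval candidates. That step is omitted here because the abstract does not describe it in detail.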

Supported by the National Natural Science Foundation of China under Grant Nos. 41571389 and 61872191, and by TIADMP of Chongqing under Grant No. cstc2018jszx-cyztzxX0015.

Author information

Corresponding author

Correspondence to Jiagao Wu.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Hou, Z., Huang, C., Wu, J., Liu, L. (2020). Distributed Image Retrieval Base on LSH Indexing on Spark. In: Tian, Y., Ma, T., Khan, M. (eds) Big Data and Security. ICBDS 2019. Communications in Computer and Information Science, vol 1210. Springer, Singapore. https://doi.org/10.1007/978-981-15-7530-3_33

  • DOI: https://doi.org/10.1007/978-981-15-7530-3_33

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-7529-7

  • Online ISBN: 978-981-15-7530-3

  • eBook Packages: Computer Science (R0)
