Abstract
With the advent of the big data era, processing massive volumes of image, video, and other multimedia data quickly and accurately has become a new challenge in related fields. To address the computational bottleneck and inefficiency of traditional content-based image retrieval systems, this paper proposes a distributed image retrieval framework based on Locality-Sensitive Hashing (LSH) indexing on Spark, which exploits the distributed storage and computing characteristics of the Spark big data platform. A distributed K-means-based Bag of Visual Words (BoVW) algorithm and an LSH algorithm are then proposed to build LSH index vectors on Spark in parallel. Experiments show that the framework significantly reduces retrieval time, while both retrieval recall and precision remain high compared with the traditional retrieval method.
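As a rough illustration of the two building blocks named in the abstract, the following single-machine Python sketch quantizes local descriptors into a BoVW histogram and then hashes that histogram with a p-stable LSH function of the form floor((a·v + b) / w), in the style of Datar et al. It is not the authors' Spark implementation: the names `bovw_histogram` and `PStableLSH`, the parameters `num_hashes` and `bucket_width`, and the assumption that the K-means centroids are already available are all hypothetical choices made for this example.

```python
import math
import random
from collections import defaultdict


def bovw_histogram(descriptors, centroids):
    """Assign each local descriptor to its nearest visual word (centroid)
    and return the L1-normalized word-count histogram."""
    hist = [0.0] * len(centroids)
    for d in descriptors:
        nearest = min(range(len(centroids)),
                      key=lambda k: math.dist(d, centroids[k]))
        hist[nearest] += 1.0
    total = sum(hist) or 1.0  # avoid division by zero for empty input
    return [h / total for h in hist]


class PStableLSH:
    """One hash table keyed by num_hashes concatenated p-stable hashes.
    Hypothetical helper for illustration only."""

    def __init__(self, dim, num_hashes=4, bucket_width=0.5, seed=0):
        rng = random.Random(seed)
        self.w = bucket_width
        # Gaussian projections are 2-stable, matching Euclidean distance.
        self.a = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                  for _ in range(num_hashes)]
        self.b = [rng.uniform(0.0, bucket_width) for _ in range(num_hashes)]
        self.buckets = defaultdict(list)

    def key(self, v):
        """Compute floor((a . v + b) / w) for every hash function."""
        return tuple(
            math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / self.w)
            for a, b in zip(self.a, self.b)
        )

    def add(self, image_id, v):
        self.buckets[self.key(v)].append(image_id)

    def query(self, v):
        """Return the ids stored in the bucket that v hashes to."""
        return list(self.buckets.get(self.key(v), []))


# Toy usage: two visual words in a 2-D descriptor space.
centroids = [[0.0, 0.0], [1.0, 1.0]]
hist = bovw_histogram([[0.1, 0.0], [0.9, 1.0], [1.0, 0.9]], centroids)
index = PStableLSH(dim=len(centroids))
index.add("img1", hist)
candidates = index.query(hist)
```

Querying returns only the candidates that share a bucket with the query vector; a real system would typically use several such tables and re-rank the candidates by exact distance. On Spark, the per-image histogram and key computations would presumably be expressed as parallel map operations over the image collection, which this sketch does not attempt to show.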
Supported by the National Natural Science Foundation of China under Grant Nos. 41571389 and 61872191, and by TIADMP of Chongqing under Grant No. cstc2018jszx-cyztzxX0015.
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Hou, Z., Huang, C., Wu, J., Liu, L. (2020). Distributed Image Retrieval Base on LSH Indexing on Spark. In: Tian, Y., Ma, T., Khan, M. (eds) Big Data and Security. ICBDS 2019. Communications in Computer and Information Science, vol 1210. Springer, Singapore. https://doi.org/10.1007/978-981-15-7530-3_33
DOI: https://doi.org/10.1007/978-981-15-7530-3_33
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-7529-7
Online ISBN: 978-981-15-7530-3
eBook Packages: Computer Science, Computer Science (R0)