Abstract
Content-based image retrieval is becoming a popular way of searching digital libraries as the amount of available multimedia data increases. However, the cost of developing from scratch a robust and reliable system with content-based image retrieval facilities for large databases is prohibitive.
In this paper, we propose to exploit an approach to approximate similarity search in metric spaces developed in [3,6]. The idea underlying these techniques is that when two objects are very close to each other, they 'see' the world around them in the same way. Accordingly, we can use a measure of dissimilarity between the views of the world from different objects in place of the distance function of the underlying metric space. To employ this idea, the low-level image features (such as colors and textures) are converted into a textual form and indexed in an inverted index by means of the Lucene search engine library. Converting the features into textual form allows us to employ Lucene's off-the-shelf indexing and searching abilities with little implementation effort. In this way, we are able to set up a robust information retrieval system that combines full-text search with content-based image retrieval capabilities.
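The idea of comparing objects through their views of the world can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several points are assumptions made only for illustration: the "view" of an object is taken to be the ordered list of its k nearest reference objects, plain Euclidean distance stands in for the real image-feature distance, the Spearman footrule is used as the dissimilarity between two views, and the textual form repeats the token of a closer reference object more often so that a term-frequency-based text index would weight it more heavily. The names `view_of`, `footrule` and `to_text` are hypothetical, and no Lucene call is made; the string returned by `to_text` is simply what one might store in a text field.

```python
import math


def view_of(obj, references, k):
    """Return the ids of the k reference objects closest to obj, nearest first.

    Euclidean distance is an assumption standing in for the feature distance.
    """
    ranked = sorted(range(len(references)),
                    key=lambda i: math.dist(obj, references[i]))
    return ranked[:k]


def footrule(view_a, view_b, k):
    """Spearman footrule between two truncated rankings.

    A reference object missing from one view is given position k there.
    """
    pos_a = {ref: p for p, ref in enumerate(view_a)}
    pos_b = {ref: p for p, ref in enumerate(view_b)}
    return sum(abs(pos_a.get(r, k) - pos_b.get(r, k))
               for r in set(view_a) | set(view_b))


def to_text(view, k):
    """Encode a view as whitespace-separated tokens.

    The reference at position p is repeated k - p times (assumed encoding).
    """
    return " ".join(" ".join([f"R{ref}"] * (k - p))
                    for p, ref in enumerate(view))


# Hypothetical toy data: four reference objects and three 2-D "images".
references = [(0, 0), (10, 0), (0, 10), (10, 10)]
k = 3
a, b, c = (1, 2), (2, 3), (9, 8)

view_a = view_of(a, references, k)
view_b = view_of(b, references, k)
view_c = view_of(c, references, k)

print(footrule(view_a, view_b, k))  # nearby objects share the same view
print(footrule(view_a, view_c, k))  # distant objects have different views
print(to_text(view_a, k))
```

In this toy data the two nearby points produce identical views, so their footrule dissimilarity is zero, while the distant point orders the reference objects differently. Whether this encoding and ranking measure match those used in the paper cannot be confirmed from the abstract alone.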
This work was partially supported by the VISITO project, funded by the Tuscany region of Italy.
References
Spearman’s rank correlation coefficient, http://en.wikipedia.org/wiki/Spearman’s_rank_correlation_coefficient
Amato, G., Rabitti, F., Savino, P., Zezula, P.: Region proximity in metric spaces and its use for approximate similarity search. ACM Trans. Inf. Syst. 21(2), 192–227 (2003)
Amato, G., Savino, P.: Approximate similarity search in metric spaces using inverted files. In: Proceedings of the 3rd International Conference on Scalable Information Systems (InfoScale 2008), pp. 1–10. ICST (2008)
Batko, M., Kohoutkova, P., Novak, D.: CoPhIR image collection under the microscope. In: International Workshop on Similarity Search and Applications, pp. 47–54 (2009)
Bolettieri, P., Esuli, A., Falchi, F., Lucchese, C., Perego, R., Rabitti, F.: Enabling content-based image retrieval in very large digital libraries. In: Second Workshop on Very Large Digital Libraries (VLDL 2009), DELOS, pp. 43–50 (2009)
Chavez, E., Figueroa, K., Navarro, G.: Effective proximity retrieval by ordering permutations. IEEE Transactions on Pattern Analysis and Machine Intelligence 30, 1647–1658 (2007)
Ciaccia, P., Patella, M.: PAC nearest neighbor queries: Approximate and controlled search in high-dimensional and metric spaces. In: ICDE, pp. 244–255 (2000)
Ciaccia, P., Patella, M., Zezula, P.: M-tree: An efficient access method for similarity search in metric spaces. In: Jarke, M., Carey, M.J., Dittrich, K.R., Lochovsky, F.H., Loucopoulos, P., Jeusfeld, M.A. (eds.) VLDB 1997, Proceedings of 23rd International Conference on Very Large Data Bases, Athens, Greece, August 25-29, pp. 426–435. Morgan Kaufmann, San Francisco (1997)
Egecioglu, Ö., Ferhatosmanoglu, H.: Dimensionality reduction and similarity computation by inner product approximations. In: Proceedings of the ACM International Conference on Information and Knowledge Management (CIKM 2000), McLean, Virginia, USA, November 6-11, pp. 219–226. ACM Press, New York (2000)
Esuli, A.: PP-Index: Using permutation prefixes for efficient and scalable approximate similarity search. In: Proceedings of the 7th Workshop on Large-Scale Distributed Systems for Information Retrieval (LSDS-IR 2009), pp. 17–24 (2009)
Fagin, R., Kumar, R., Sivakumar, D.: Comparing top-k lists. SIAM J. of Discrete Math. 17(1), 134–160 (2003)
Faloutsos, C., Lin, K.-I.: FastMap: A fast algorithm for indexing, data-mining and visualization of traditional and multimedia datasets. In: Carey, M.J., Schneider, D.A. (eds.) Proceedings of the 18th ACM International Conference on Management of Data (SIGMOD 1995), San Jose, California, USA, May 22-25, pp. 163–174. ACM Press, New York (1995)
Lux, M., Chatzichristofis, S.A.: Lire: Lucene image retrieval: an extensible Java CBIR library. In: MM 2008: Proceeding of the 16th ACM International Conference on Multimedia, pp. 1085–1088. ACM, New York (2008)
Ogras, Ü.Y., Ferhatosmanoglu, H.: Dimensionality reduction using magnitude and shape approximations. In: Proceedings of the ACM International Conference on Information and Knowledge Management (CIKM 2003), New Orleans, Louisiana, USA, November 3-8, pp. 99–107. ACM Press, New York (2003)
Pramanik, S., Alexander, S., Li, J.: An efficient searching algorithm for approximate nearest neighbor queries in high dimensions. In: Proceedings of the IEEE International Conference on Multimedia Computing and Systems (ICMCS 1999), Florence, Italy, June 7-11, vol. 1. IEEE Computer Society Press, Los Alamitos (1999)
Pramanik, S., Li, J., Ruan, J., Bhattacharjee, S.K.: Efficient search scheme for very large image databases. In: Beretta, G.B., Schettini, R. (eds.) Proceedings of the International Society for Optical Engineering (SPIE) on Internet Imaging, San Jose, California, USA, January 26, vol. 3964, pp. 79–90. The International Society for Optical Engineering (December 1999)
Squire, D.M., Müller, W., Müller, H., Pun, T.: Content-based query of image databases: inspirations from text retrieval. Pattern Recognition Letters 21(13-14), 1193–1198 (2000); Selected Papers from The 11th Scandinavian Conference on Image
Wang, X., Wang, J.T.-L., Lin, K.-I., Shasha, D., Shapiro, B.A., Zhang, K.: An index structure for data mining and clustering. In: Knowledge and Information Systems, vol. 2, pp. 161–184. Springer, Heidelberg (2000)
Weber, R., Böhm, K.: Trading quality for time with nearest neighbor search. In: Zaniolo, C., Grust, T., Scholl, M.H., Lockemann, P.C. (eds.) EDBT 2000. LNCS, vol. 1777, p. 21. Springer, Heidelberg (2000)
Zezula, P., Amato, G., Dohnal, V., Batko, M.: Similarity Search - The Metric Space Approach. In: Advances in Database Systems, vol. 32. Springer, Heidelberg (2006)
Zezula, P., Savino, P., Amato, G., Rabitti, F.: Approximate similarity retrieval with M-trees. VLDB J. 7(4), 275–293 (1998)
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gennaro, C., Amato, G., Bolettieri, P., Savino, P. (2010). An Approach to Content-Based Image Retrieval Based on the Lucene Search Engine Library. In: Lalmas, M., Jose, J., Rauber, A., Sebastiani, F., Frommholz, I. (eds) Research and Advanced Technology for Digital Libraries. ECDL 2010. Lecture Notes in Computer Science, vol 6273. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15464-5_8
DOI: https://doi.org/10.1007/978-3-642-15464-5_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15463-8
Online ISBN: 978-3-642-15464-5
eBook Packages: Computer Science (R0)