Abstract
This work presents a neural network for the retrieval of images from text queries. The proposed network is composed of two main modules: the first one extracts a global picture representation from local block descriptors while the second one aims at solving the retrieval problem from the extracted representation. Both modules are trained jointly to minimize a loss related to the retrieval performance. This approach is shown to be advantageous when compared to previous models relying on unsupervised feature extraction: average precision over Corel queries reaches 26.2% for our model, which should be compared to 21.6% for PAMIR, the best alternative.
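The figures quoted above are average precision values aggregated over a set of test queries. As a minimal sketch of how such a score can be computed, the Python below assumes the standard non-interpolated definition of average precision and a simple mean over queries; the abstract does not spell out the exact evaluation protocol, and the function names and toy identifiers are hypothetical.

```python
from typing import Sequence, Set


def average_precision(ranking: Sequence[str], relevant: Set[str]) -> float:
    """Non-interpolated average precision for one query (assumed definition).

    `ranking` lists image identifiers from best to worst match;
    `relevant` holds the identifiers judged relevant to the query.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            # Precision measured at the rank of each relevant image.
            precision_sum += hits / rank
    # Relevant images that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(
    rankings: Sequence[Sequence[str]], relevants: Sequence[Set[str]]
) -> float:
    """Mean of the per-query average precision values."""
    scores = [average_precision(r, rel) for r, rel in zip(rankings, relevants)]
    return sum(scores) / len(scores) if scores else 0.0


# Toy data, not taken from the paper.
toy_rankings = [["a", "b", "c", "d"], ["x", "a"]]
toy_relevants = [{"a", "c"}, {"a"}]
toy_map = mean_average_precision(toy_rankings, toy_relevants)
```

In the toy data the first query places its relevant images at ranks 1 and 3, and the second places its single relevant image at rank 2; a model that moves relevant images closer to the top of the ranking raises the score.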
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Grangier, D., Bengio, S. (2006). A Neural Network to Retrieve Images from Text Queries. In: Kollias, S., Stafylopatis, A., Duch, W., Oja, E. (eds) Artificial Neural Networks – ICANN 2006. ICANN 2006. Lecture Notes in Computer Science, vol 4132. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11840930_3
DOI: https://doi.org/10.1007/11840930_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-38871-5
Online ISBN: 978-3-540-38873-9
eBook Packages: Computer Science, Computer Science (R0)