Integrating Textual and Visual Information for Cross-Language Image Retrieval

Lin, Wen-Cheng; Chang, Yih-Chen; Chen, Hsin-Hsi

doi:10.1007/11562382_35

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3689)

Included in the following conference series:

Asia Information Retrieval Symposium

Abstract

This paper explores the integration of textual and visual information for cross-language image retrieval. We propose an approach that automatically transforms textual queries into visual representations: the relationships between text and images are mined, and the mined relationships are used to construct visual queries from textual ones. The retrieval results of the textual and visual queries are then combined. We evaluate the approach in English monolingual and Chinese-English cross-language retrieval experiments. The main concern is selecting suitable textual query terms from which to construct visual queries. Experimental results show that the proposed approach improves retrieval performance and that nouns are appropriate terms for generating visual queries.
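
The abstract does not say how the textual and visual result lists are combined. As a minimal sketch only, assuming a weighted sum of min-max normalized scores (a common merging strategy, not necessarily the authors' method), the combination step might look like the following. The function names, the `text_weight` parameter, and the image identifiers are all hypothetical.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; a constant-score run maps every item to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def merge_runs(
    text_scores: Dict[str, float],
    visual_scores: Dict[str, float],
    text_weight: float = 0.7,  # assumed value, not taken from the paper
) -> List[Tuple[str, float]]:
    """Combine a textual run and a visual run by a weighted sum of
    normalized scores. An image missing from one run contributes 0.0
    for that run. Ties are broken by image id for a stable ranking."""
    t = min_max_normalize(text_scores)
    v = min_max_normalize(visual_scores)
    merged = {
        doc: text_weight * t.get(doc, 0.0) + (1.0 - text_weight) * v.get(doc, 0.0)
        for doc in set(t) | set(v)
    }
    return sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    # Hypothetical scores for illustration only.
    text_run = {"img1": 2.0, "img2": 1.0, "img3": 0.0}
    visual_run = {"img2": 10.0, "img4": 5.0}
    for image_id, score in merge_runs(text_run, visual_run, text_weight=0.5):
        print(image_id, round(score, 3))
```

In this sketch, `text_weight` controls how much the textual run dominates the merged ranking; an image retrieved by both runs is rewarded because both terms contribute to its score.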

Author information

Authors and Affiliations

Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
Wen-Cheng Lin, Yih-Chen Chang & Hsin-Hsi Chen

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, Pohang University of Science and Technology, San 31, Hyoja-dong, Nam-gu, 790-784, Pohang, Korea
Gary Geunbae Lee
Computer and Communication Media Research, NEC Corp., Miyazaki 4-1-1, Miyamae-ku, 216-8555, Kawasaki, Japan
Akio Yamada
Human-Computer Communications Laboratory, Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Hong Kong
Helen Meng
School of Engineering, Information and Communications University, 119, Munjiro, Yuseong-gu, 305-732, Daejeon, Korea
Sung Hyon Myaeng

Cite this paper

Lin, WC., Chang, YC., Chen, HH. (2005). Integrating Textual and Visual Information for Cross-Language Image Retrieval. In: Lee, G.G., Yamada, A., Meng, H., Myaeng, S.H. (eds) Information Retrieval Technology. AIRS 2005. Lecture Notes in Computer Science, vol 3689. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11562382_35

DOI: https://doi.org/10.1007/11562382_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29186-2
Online ISBN: 978-3-540-32001-2
eBook Packages: Computer Science (R0)
