DOI: 10.1145/3372278.3390731

Imageability Estimation using Visual and Language Features

Published: 08 June 2020

Abstract

Imageability is a concept from psycholinguistics that quantifies how strongly a word evokes a mental image. However, existing imageability datasets are created through subjective experiments and are therefore very small, so methods that estimate imageability automatically are helpful. For an accurate automatic estimation, we extend the idea of the Dual-Coding Theory, a psychological hypothesis that relates human perception to both visual and verbal information, and additionally focus on the relationship between the pronunciation of a word and its imageability. In this research, we propose a method to estimate the imageability of words using both visual and language features extracted from corresponding data: visual features derived from low- and high-level image features, and language features derived from the textual and phonetic features of words. Evaluations show that the proposed method estimates imageability more accurately than comparative methods, indicating that each feature contributes to the estimation of imageability.
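
The abstract describes fusing per-word visual features (derived from low- and high-level image characteristics) with language features (textual and phonetic) to predict imageability ratings. The sketch below illustrates that kind of feature-fusion regression in general terms; it is not the authors' implementation, and the feature dimensionalities, the synthetic placeholder data, and the choice of Ridge regression are assumptions made for illustration only.

```python
# Minimal sketch (not the paper's implementation): predict word imageability
# from concatenated visual and language feature vectors with a linear regressor.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

n_words = 500
visual_dim, text_dim, phonetic_dim = 128, 300, 40  # assumed sizes

# Placeholder features; in practice these would come from images collected for
# each word (visual) and from word embeddings / phoneme statistics (language).
visual_feats = rng.normal(size=(n_words, visual_dim))
text_feats = rng.normal(size=(n_words, text_dim))
phonetic_feats = rng.normal(size=(n_words, phonetic_dim))

# Placeholder human imageability ratings (Paivio-style norms use a 100-700 scale).
ratings = rng.uniform(100, 700, size=n_words)

# Fuse the modalities by simple concatenation.
X = np.hstack([visual_feats, text_feats, phonetic_feats])
X_train, X_test, y_train, y_test = train_test_split(X, ratings, random_state=0)

# Fit the regressor and measure rank correlation against held-out ratings.
model = Ridge(alpha=1.0).fit(X_train, y_train)
pred = model.predict(X_test)
rho, _ = spearmanr(pred, y_test)
print(f"Spearman correlation: {rho:.3f}")
```

With real features and real human ratings in place of the random placeholders, the same pipeline would report how well the fused visual and language features predict imageability judgments.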

Cited By

  • (2025) Using Machine Learning to Explain Paraphasias in Narratives of People With Aphasia. Perspectives of the ASHA Special Interest Groups, 1-12. DOI: 10.1044/2024_PERSP-23-00291. Online publication date: 3-Mar-2025.
  • (2021) A multi-modal dataset for analyzing the imageability of concepts across modalities. 2021 IEEE 4th International Conference on Multimedia Information Processing and Retrieval (MIPR), 213-218. DOI: 10.1109/MIPR51284.2021.00039. Online publication date: Sep-2021.
  • (2021) Imageability- and Length-Controllable Image Captioning. IEEE Access, Vol. 9, 162951-162961. DOI: 10.1109/ACCESS.2021.3131393. Online publication date: 2021.
  • (2021) Tell as You Imagine: Sentence Imageability-Aware Image Captioning. MultiMedia Modeling, 62-73. DOI: 10.1007/978-3-030-67835-7_6. Online publication date: 22-Jun-2021.

Published In

ICMR '20: Proceedings of the 2020 International Conference on Multimedia Retrieval
June 2020
605 pages
ISBN:9781450370875
DOI:10.1145/3372278

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. computational psycholinguistics
  2. language and vision
  3. multimedia modeling

Qualifiers

  • Short-paper

Funding Sources

  • JSPS

Conference

ICMR '20

Acceptance Rates

Overall Acceptance Rate 254 of 830 submissions, 31%
