
On Quantizing the Mental Image of Concepts for Visual Semantic Analyses

Published: 15 October 2019

Abstract

With the rise of multi-modal applications, a better understanding of the relationship between language and vision becomes increasingly important. While modern applications often consider both text and images, human perception is usually only a secondary consideration. In my doctoral studies, I research the quantization of visual differences between concepts with respect to human perception. Initially, I looked at local visual differences between concepts and their subordinate concepts, measuring the gap in visual variety between images of, e.g., car and vehicle. In a following study, I applied data mining to Web-crawled images to estimate psycholinguistic metrics such as the imageability of words. In this way, the tendency towards low or high imageability can be estimated at the dictionary level, capturing the gap between words like peace and car. Going forward, I plan to create visualization demos for analyzing psycholinguistic relationships in image datasets.
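To make the idea of a "variety gap" concrete, here is a minimal illustrative sketch, not the actual method from my studies (which refers to Web popularity and subordinate concepts): it scores a concept's visual variety as the mean pairwise cosine distance among feature embeddings of its images, so a broad concept such as vehicle scores higher than a narrow one such as car. The `visual_variety` helper and the random stand-in embeddings are assumptions for illustration.

```python
import numpy as np

def visual_variety(features: np.ndarray) -> float:
    """Mean pairwise cosine distance over an (n_images, dim) feature matrix."""
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sims = normed @ normed.T                            # pairwise cosine similarities
    upper = sims[np.triu_indices(len(features), k=1)]   # each unordered pair once
    return float(np.mean(1.0 - upper))

# Toy stand-ins for CNN embeddings: images of the narrower concept "car"
# cluster tightly around a shared direction, while images of the broader
# concept "vehicle" are spread much more widely.
rng = np.random.default_rng(0)
base = rng.normal(size=128)
car_feats = base + rng.normal(0.0, 0.1, size=(100, 128))
vehicle_feats = base + rng.normal(0.0, 1.0, size=(100, 128))

print(f"variety(car)     = {visual_variety(car_feats):.3f}")
print(f"variety(vehicle) = {visual_variety(vehicle_feats):.3f}")
```

Running this prints a near-zero variety for the tight "car" cluster and a much larger one for the scattered "vehicle" set, mirroring the concept/subordinate-concept gap described above.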
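The second study's direction can be sketched in a similarly hedged way: given a handful of words with known human imageability ratings (in the style of Paivio-type norms) and simple statistics computed over each word's crawled image set, a regressor can extrapolate ratings to unrated dictionary words. All names, features, and ratings below are toy placeholders, not the actual pipeline from my work.

```python
import numpy as np
from sklearn.linear_model import Ridge

def image_set_stats(feats: np.ndarray) -> np.ndarray:
    """Summarize a word's crawled images by two toy statistics:
    the norm of the mean embedding and the dispersion around it."""
    mean_vec = feats.mean(axis=0)
    dispersion = float(np.mean(np.linalg.norm(feats - mean_vec, axis=1)))
    return np.array([np.linalg.norm(mean_vec), dispersion])

rng = np.random.default_rng(1)
# Toy ratings on a 1-7 scale (placeholders, not real norm data); concrete
# words like "car" tend to be rated as more imageable than abstract ones
# like "peace".
ratings = {"car": 6.5, "apple": 6.6, "peace": 3.1, "honor": 2.9}
# Random stand-ins for per-word image features: in this toy setup the
# abstract words get noisier, more scattered image sets.
feats = {w: rng.normal(0.0, 0.2 + 0.5 * i, size=(50, 64))
         for i, w in enumerate(ratings)}

X = np.stack([image_set_stats(feats[w]) for w in ratings])
y = np.array(list(ratings.values()))
model = Ridge(alpha=1.0).fit(X, y)

unrated = rng.normal(0.0, 0.3, size=(50, 64))   # image set for an unrated word
print(f"predicted imageability: {model.predict([image_set_stats(unrated)])[0]:.2f}")
```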




Published In

MM '19: Proceedings of the 27th ACM International Conference on Multimedia
October 2019
2794 pages
ISBN:9781450368896
DOI:10.1145/3343031


Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. computational psycholinguistics
  2. language and vision
  3. multimedia modeling
  4. visual concept semantics

Qualifiers

  • Research-article

Funding Sources

  • JSPS Kakenhi

Conference

MM '19

Acceptance Rates

MM '19 paper acceptance rate: 252 of 936 submissions (27%)
Overall acceptance rate: 2,145 of 8,556 submissions (25%)


