DOI: 10.1145/2964284.2967193

Scene Image Synthesis from Natural Sentences Using Hierarchical Syntactic Analysis

Published: 01 October 2016

Abstract

Synthesizing a new image from verbal information is a challenging task with numerous applications. Most prior research has approached the problem by relying on additional external clues, such as sketches; no study has successfully handled a variety of sentences without such auxiliary information. We propose a system that synthesizes scene images solely from sentences. Input sentences are expected to be complete and to contain visualizable objects. Our priorities are the analysis of the input sentences and the correlation of information between those sentences and visible image patches. We develop a hierarchical syntactic parser for sentence analysis, and we design a combination of lexical knowledge and corpus statistics for word correlation. We applied the entire system to both a clip-art dataset and a real-image dataset; the results highlight the proposed system's capability to generate novel images as well as its ability to succinctly convey ideas.
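The word-correlation step described in the abstract, blending lexical knowledge with corpus statistics, can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the toy hypernym table (standing in for a WordNet-style hierarchy), the toy distributional vectors, and the `alpha` blending weight are all assumptions made for the example.

```python
import math

# Toy lexical hierarchy standing in for WordNet hypernym chains.
# Entries are illustrative only.
HYPERNYMS = {
    "dog": ["canine", "mammal", "animal"],
    "puppy": ["dog", "canine", "mammal", "animal"],
    "cat": ["feline", "mammal", "animal"],
    "car": ["vehicle", "artifact"],
}

def lexical_similarity(a, b):
    """Path-based similarity: 1 / (1 + hops to the nearest shared ancestor)."""
    path_a = [a] + HYPERNYMS.get(a, [])
    path_b = [b] + HYPERNYMS.get(b, [])
    shared = set(path_a) & set(path_b)
    if not shared:
        return 0.0
    dist = min(path_a.index(s) + path_b.index(s) for s in shared)
    return 1.0 / (1.0 + dist)

# Toy corpus-derived vectors standing in for distributional statistics
# (e.g., word embeddings trained on a large corpus).
VECTORS = {
    "dog": [0.9, 0.1, 0.0],
    "puppy": [0.85, 0.15, 0.0],
    "cat": [0.7, 0.3, 0.0],
    "car": [0.0, 0.1, 0.9],
}

def corpus_similarity(a, b):
    """Cosine similarity between the two words' corpus vectors."""
    va, vb = VECTORS[a], VECTORS[b]
    dot = sum(x * y for x, y in zip(va, vb))
    norm_a = math.sqrt(sum(x * x for x in va))
    norm_b = math.sqrt(sum(x * x for x in vb))
    return dot / (norm_a * norm_b)

def word_correlation(a, b, alpha=0.5):
    """Blend lexical and corpus-based similarity with weight alpha."""
    return alpha * lexical_similarity(a, b) + (1 - alpha) * corpus_similarity(a, b)
```

Under this sketch, a sentence word such as "puppy" would correlate strongly with an image patch labeled "dog" but weakly with one labeled "car", which is the behavior the combined measure is meant to capture.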


Cited By

  • Mobile App for Text-to-Image Synthesis. In Mobile Computing, Applications, and Services, pages 32-43. DOI: 10.1007/978-3-030-28468-8_3. Online publication date: 25 September 2019.


Published In

MM '16: Proceedings of the 24th ACM international conference on Multimedia
October 2016
1542 pages
ISBN:9781450336031
DOI:10.1145/2964284

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. image processing
  2. image synthesis
  3. multimedia content creation
  4. natural language processing
  5. syntactic abstraction
  6. word mapping

Qualifiers

  • Short-paper

Funding Sources

  • ImPACT Program, Cabinet Office, Government of Japan

Conference

MM '16: ACM Multimedia Conference
October 15 - 19, 2016
Amsterdam, The Netherlands

Acceptance Rates

MM '16 paper acceptance rate: 52 of 237 submissions, 22%.
Overall acceptance rate: 2,145 of 8,556 submissions, 25%.

