poster

Automatic Image Synthesis from Keywords Using Scene Context

Authors:

Sho Inaba,

Asako Kanezaki,

Tatsuya Harada

MM '14: Proceedings of the 22nd ACM international conference on Multimedia

Pages 1149 - 1152

https://doi.org/10.1145/2647868.2655009

Published: 03 November 2014

Abstract

Text is one of the simplest ways to express an idea, and an image is one of the most impactful. Therefore, if a system can synthesize an image from text without direct user manipulation, novel image synthesis applications will open up to users without artistic skills. In such a system, the text declares which objects to synthesize. However, text provides little information about the positional relations and scale of objects, so these must be estimated using common sense. In this paper, we develop a system that automatically synthesizes objects into an image, given a background image and the class name of the object to be synthesized. Given the background image and keywords as inputs, the system automatically searches for images of the objects to synthesize. Although some previously developed systems can synthesize an image from sketches and paintings, this is the first system that can estimate the position, scale, and appearance of objects and automatically synthesize them into images without direct user input. We propose a scene context, which indicates the position, scale, and appearance of the objects to be synthesized. The contribution of this paper is twofold: (1) a scene context extraction method for automatic image synthesis and (2) an application of automatic image synthesis using the scene context.
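
To make the final synthesis step concrete, the following is a minimal sketch of pasting an object into a background once a position and scale are known. It is not the authors' implementation: the `SceneContext` class, the function names, the nearest-neighbour resizing, and the hard binary mask are all assumptions made for illustration, and the position and scale are fixed by hand here rather than estimated from the scene.

```python
from dataclasses import dataclass
from typing import List

Image = List[List[int]]  # grayscale image stored as rows of pixel values


@dataclass
class SceneContext:
    """Hypothetical container: where and how large to draw the object."""
    x: int        # left column of the pasted object in the background
    y: int        # top row of the pasted object in the background
    scale: float  # resize factor applied to the object image


def resize_nearest(img: Image, scale: float) -> Image:
    """Resize with nearest-neighbour sampling (output is at least 1x1)."""
    h, w = len(img), len(img[0])
    new_h, new_w = max(1, round(h * scale)), max(1, round(w * scale))
    return [[img[min(h - 1, int(r / scale))][min(w - 1, int(c / scale))]
             for c in range(new_w)] for r in range(new_h)]


def synthesize(background: Image, obj: Image, mask: Image,
               ctx: SceneContext) -> Image:
    """Paste the masked object into a copy of the background at ctx."""
    out = [row[:] for row in background]  # leave the input untouched
    obj_s = resize_nearest(obj, ctx.scale)
    mask_s = resize_nearest(mask, ctx.scale)
    for r, row in enumerate(obj_s):
        for c, value in enumerate(row):
            rr, cc = ctx.y + r, ctx.x + c
            inside = 0 <= rr < len(out) and 0 <= cc < len(out[0])
            if inside and mask_s[r][c]:  # copy only foreground pixels
                out[rr][cc] = value
    return out


# Toy example: a 2x2 object, doubled in size, pasted one column from the left.
background = [[0] * 6 for _ in range(4)]
obj = [[9, 9], [9, 9]]
mask = [[1, 0], [1, 1]]  # the top-right quarter of the object is transparent
result = synthesize(background, obj, mask, SceneContext(x=1, y=0, scale=2.0))
```

A full system along the lines the abstract describes would replace the hand-set `SceneContext` values with estimates derived from the background image, and would choose the object image by searching with the keyword.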

Index Terms

Automatic Image Synthesis from Keywords Using Scene Context
1. Computing methodologies
  1. Computer graphics
    1. Image manipulation
    2. Rendering

Published In

MM '14: Proceedings of the 22nd ACM international conference on Multimedia

November 2014

1310 pages

ISBN:9781450330633

DOI:10.1145/2647868

General Chairs:
Kien A. Hua, University of Central Florida, USA
Yong Rui, Microsoft Research, China
Ralf Steinmetz, Technische Universität Darmstadt, Germany

Program Chairs:
Alan Hanjalic, Delft University of Technology, Netherlands
Apostol (Paul) Natsev, Google, USA
Wenwu Zhu, Tsinghua University, China

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Poster

Conference

MM '14

Sponsor:

SIGMM

MM '14: 2014 ACM Multimedia Conference

November 3 - 7, 2014

Orlando, Florida, USA

Acceptance Rates

MM '14 Paper Acceptance Rate 55 of 286 submissions, 19%;

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
