Iris: a multi-constraint graphic layout generation system

Chen, Liuqing; Jing, Qianzhi; Tsang, Yixin; Zhou, Tingting

doi:10.1631/FITEE.2300312

Research Article
Frontiers of Information Technology & Electronic Engineering, Volume 25, pages 968–987 (2024)
Published: 27 July 2024


Liuqing Chen (陈柳青)¹,² (ORCID: orcid.org/0000-0002-9049-0394),
Qianzhi Jing (景千芝)¹,
Yixin Tsang (曾怡欣)¹ &
Tingting Zhou (周婷婷)³

Abstract

In graphic design, a layout results from the interaction between foreground design elements and background images. However, prevalent research focuses on improving the quality of layout generation algorithms and overlooks the interaction and controllability that designers need when applying these methods in real-world situations. This paper proposes Iris, a user-centered layout design system that gives designers an interactive environment to expedite their workflow. The environment supports user-constraint specification, layout generation, custom editing, and final rendering. To satisfy the multiple constraints specified by designers, we introduce a novel generation model, multi-constraint LayoutVQ-VAE, which advances layout generation under intra- and inter-domain constraints. Qualitative and quantitative experiments indicate that the proposed model outperforms or is comparable to prevalent state-of-the-art models in multiple aspects. User studies on Iris further demonstrate that the system significantly enhances design efficiency while achieving human-like layout designs.

Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Author information

Authors and Affiliations

College of Computer Science and Technology, Zhejiang University, Hangzhou, 310030, China
Liuqing Chen (陈柳青), Qianzhi Jing (景千芝) & Yixin Tsang (曾怡欣)
Zhejiang–Singapore Innovation and AI Joint Research Lab, Hangzhou, 310058, China
Liuqing Chen (陈柳青)
Alibaba Group, Hangzhou, 310034, China
Tingting Zhou (周婷婷)

Contributions

Liuqing CHEN, Qianzhi JING, and Yixin TSANG designed the research. Qianzhi JING and Yixin TSANG processed the data. Liuqing CHEN, Qianzhi JING, and Yixin TSANG drafted the paper. Liuqing CHEN and Tingting ZHOU revised and finalized the paper.

Corresponding author

Correspondence to Liuqing Chen (陈柳青).

Ethics declarations

All the authors declare that they have no conflict of interest.

Additional information

Project supported by the Alibaba–Zhejiang University Joint Research Institute of Frontier Technologies, China, and the Zhejiang–Singapore Innovation and AI Joint Research Lab, China.

List of supplementary materials

1 Examples for poster and magazine layout generation

2 Case study with Iris and Midjourney

3 Leveraging Midjourney for poster layout generation

Fig. S1 Examples for PDCard and magazine graphic layout design by Iris

Fig. S2 Multi-constraint poster layout generation comparison

Table S1 Prompts used and the corresponding generation results during the six iterations of layout generation using Midjourney

About this article

Cite this article

Chen, L., Jing, Q., Tsang, Y. et al. Iris: a multi-constraint graphic layout generation system. Front Inform Technol Electron Eng 25, 968–987 (2024). https://doi.org/10.1631/FITEE.2300312

Received: 30 April 2023
Accepted: 09 November 2023
Published: 27 July 2024
Issue Date: July 2024
DOI: https://doi.org/10.1631/FITEE.2300312

CLC number

TP302
