Abstract
Chinese painting poetry is an extraordinary aesthetic phenomenon in world art history. It is not only part of the paintings but also helps us to better understand the spiritual conception that the artists express. In this paper, we present an interactive visual system to enable ordinary users to compose customized painting poetry for ancient Chinese paintings, which contain three properties: (1) We employ object detection and image captioning to describe the scenery depicted in the painting. (2) We extend the modern color theory to analyze the underlying emotions of each painting. (3) We propose an interactive poetry generation method that takes the content description and the emotional expression to add the diversity of the poetry creation. Several visual components are carefully designed to visualize and contextualize the features in the painting. They effectively guide users to steer the creation of personalized painting poems. We conduct efficient case studies and user interviews to demonstrate the effectiveness of our system.
Graphic abstract
Similar content being viewed by others
References
Anderson P, Fernando B, Johnson M, Gould S (2016) Spice: Semantic propositional image caption evaluation. In: Leibe B, Matas J, Sebe N, Welling M (eds) Computer Vision - ECCV 2016. Springer, pp 382–398
Chen H, Yi X, Sun M, Li W, Yang C, Guo Z (2019) Sentiment-controllable chinese poetry generation. pp 4925–4931
Cheng W-F, Wu C-C, Song R, Fu J, Xie X, Nie J-Y (2018) Image inspired poetry generation in xiaoice. arXiv preprintarXiv:1808.03090
Cho K, Van Merriënboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y (2014) Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprintarXiv:1406.1078
Giovannangeli L, Bourqui R, Giot R, Auber D (2020) Toward automatic comparison of visualization techniques: application to graph visualization. Vis Inform 4(2):86–98
Han D, Pan J, Zhao X, Chen W (2021) Netv. js: a web-based library for high-efficiency visualization of large-scale graphs and networks. Vis Inform 5(1):61–66
Hu H (2018) Visualization design and research of the style and sects change of song ci. Harbin Institute Of Technology (Master’s thesis)
Hu Z, Yang Z, Liang X, Salakhutdinov R, Xing EP (2017) Toward controlled generation of text. In: Proceedings of ICML, pp 1587–1596
Huang J, Rathod V, Sun C, Zhu M, Korattikara A, Fathi A, Fischer I, Wojna Z, Song Y, Guadarrama S, Murphy K (2017) Speed/accuracy trade-offs for modern convolutional object detectors. In: Proceedings of CVPR, pp 3296–3297
Johnson J, Krishna R, Stark M, Li L-J, Shamma DA, Bernstein MS, Fei-Fei L (2015) Image retrieval using scene graphs. In: Proceedings of CVPR, pp 3668–3678
Kaneko A, Komatsu A, Itoh T, Wang FY (2020) Painting image browser applying an associate-rule-aware multidimensional data visualization technique. Vis Comput Ind Biomed Art 3(1):1–13
Kang D, Shim H, Yoon K (2018) A method for extracting emotion using colors comprise the painting image. Multimed Tools Appl 77(4):4985–5002
Karpathy A, Li F (2015) Deep visual-semantic alignments for generating image descriptions. In: Proceedings of CVPR, pp 3128–3137
Kulkarni G, Premraj V, Ordonez V, Dhar S, Li S, Choi Y, Berg AC, Berg TL (2013) Babytalk: understanding and generating simple image descriptions. TPAMI 35(12):2891–2903
Leite RA, Arleo A, Sorger J, Gschwandtner T, Miksch S (2020) Hermes: guidance-enriched visual analytics for economic network exploration. Vis Inform 4(4):11–22
Li Y, Fujiwara T, Choi YK, Kim KK, Ma K-L (2020) A visual analytics system for multi-model comparison on clinical data predictions. Vis Inform 4(2):122–131
Liu L, Wan X, Guo Z (2018) Images2poem: Generating Chinese poetry from image streams. In: Proceedings of ACMMM, pp 1967–1975
Lu C, Krishna R, Bernstein M, Fei-Fei L (2016) Visual relationship detection with language priors, vol 9905, pp 852–869
Lu J, Xiong C, Parikh D, Socher R (2017) Knowing when to look: Adaptive attention via a visual sentinel for image captioning. In: Proceedings of CVPR, pp 3242–3250
McCurdy N, Lein J, Coles K, Meyer M (2015) Poemage: visualizing the sonic topology of a poem. TVCG 22(1):439–448
Meneses L, Furuta R (2015) Visualizing poetry: Tools for critical analysis. paj: J Init Digit Hum Med Cult 3:1
Newell A, Deng J (2017) Pixels to graphs by associative embedding, vol NIPS’17. Curran Associates Inc., Red Hook, NY, USA, pp 2168–2177
Pinaud B, Vallet J, Melançon G (2020) On visualization techniques comparison for large social networks overview: a user experiment. Vis Inform 4(4):23–34
Ren S, He K, Girshick R, Sun J (2016) Faster r-cnn: towards real-time object detection with region proposal networks. TPAMI 39(6):1137–1149
Schuster S, Krishna R, Chang A, Fei-Fei L, Manning C (2015) Generating semantically precise scene graphs from textual descriptions for improved image retrieval. pp 70–80
Shi L, Liao Q, Tong H, Hu Y, Wang C, Lin C, Qian W (2020) Oniongraph: Hierarchical topology+ attribute multivariate network visualization. Vis Inform 4(1):43–57
Shu X, Wu J, Wu X, Liang H, Cui W, Wu Y, Qu H (2021) Dancingwords: exploring animated word clouds to tell stories. J Vis 24(1):85–100
Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint:1409.1556
Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2016) Rethinking the inception architecture for computer vision. In: Proceedings of CVPR, pp 2818–2826
Takahashi F, Kawabata Y (2018) The association between colors and emotions for emotional words and facial expressions. Color Res Appl 43(2):247–257
Vinyals O, Toshev A, Bengio S, Erhan D (2015) Show and tell: a neural image caption generator. In: Proceedings of CVPR, pp 3156–3164
Wang X, Zeng H, Wang Y, Wu A, Sun Z, Ma X, Qu H (2020) Voicecoach: Interactive evidence-based training for voice modulation skills in public speaking. In: Proceedings of CHI, pp 1–12. ACM
Wang Y, Haleem H, Shi C, Wu Y, Zhao X, Fu S, Qu H (2018) Towards easy comparison of local businesses using online reviews. Comput Gr Forum 37(3):63–74
Wang Z, He W, Wu H, Wu H, Li W, Wang H, Chen E (2016) Chinese poetry generation with planning based neural network. arXiv preprint arXiv:1610.09889
Wu L, Xu M, Qian S, Cui J (2020) Image to modern chinese poetry creation via a constrained topic-aware model. TOMM 16(2):1–21
Xu D, Zhu Y, Choy C, Fei-Fei L (2017) Scene graph generation by iterative message passing. pp 3097–3106
Xu K, Ba J, Kiros R, Cho K, Courville A, Salakhudinov R, Zemel R, Bengio Y (2015) Show, attend and tell: Neural image caption generation with visual attention. In: Proceedings of ICML, pp 2048–2057
Xu L, Jiang L, Qin C, Wang Z, Du D (2018) How images inspire poems: Generating classical chinese poetry from images with memory networks. In: Proceedings of AAAI, vol 32
Yan R (2016) i, poet: Automatic poetry composition through recurrent neural networks with iterative polishing schema. pp 2238–2244
Yang J, Fan J, Hubball D, Gao Y, Luo H, Ribarsky W, Ward M (2006) Semantic image browser: bridging information visualization with automated intelligent image analysis, pp 191–198
Yang X, Tang K, Zhang H, Cai J (2019) Auto-encoding scene graphs for image captioning. In: Proceedings of CVPR, pp 10677–10686
Yi X, Li R, Yang C, Li W, Sun M (2020) Mixpoet: diverse poetry generation via learning controllable mixed latent space. Proc AAAI 34:9450–9457
Yi X, Sun M, Li R, Yang Z (2018) Chinese poetry generation with a working memory model. arXiv preprint arXiv:1809.04306
Zhang W, Siwei T, Liu K, Lei S, Chen S, Chen W (2019) A new perspective on the study of literature (songci): text correlation and spatio-temporal visual analytics. J Comput-Aided Des Comput Gr 31(10):1687–1697
Zhang X, Lapata M (2014) Chinese poetry generation with recurrent neural networks. In: Proceedings of EMNLP, pp 670–680
Zhao Y, Jiang H, Qin Y, Xie H, Wu Y, Liu S, Zhou Z, Xia J, Zhou F et al (2020) Preserving minority structures in graph sampling. IEEE Trans Vis Comput Gr 27(2):1698–1708
Zhao Y, Luo X, Lin X, Wang H, Kui X, Zhou F, Wang J, Chen Y, Chen W (2019) Visual analytics for electromagnetic situation awareness in radio monitoring and management. IEEE Trans Vis Comput Gr 26(1):590–600
Zhou F, Lin X, Liu C, Zhao Y, Xu P, Ren L, Xue T, Ren L (2019) A survey of visualization for smart manufacturing. J Vis 22(2):419–435
Zhou H, Huang M, Zhang T, Zhu X, Liu B (2018) Emotional chatting machine: emotional conversation generation with internal and external memory. In: Proceedings of AAAI, vol 32
Acknowledgements
This work is supported by National Natural Science Foundation of China (61972122, 61772456).
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Feng, Y., Chen, J., Huang, K. et al. iPoet: interactive painting poetry creation with visual multimodal analysis. J Vis 25, 671–685 (2022). https://doi.org/10.1007/s12650-021-00780-0
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s12650-021-00780-0