ABSTRACT
Enriching a story's script with visual aids is an effective way to promote language learning and literacy development in young children and other learners. In this paper, we propose a new system that generates short Arabic stories together with images that accurately depict the story, scene, and context of a given input. We combine a text generation technique with a text-to-image synthesis network, minimizing human intervention. We also build a corpus of Arabic stories with associated vocabulary and visualizations. The results obtained with various generative models for creating text-image content show the effectiveness of the proposed approach. The system can be used in education to help instructors build stories in different domains, and in distance learning to deliver online tutorials during the COVID-19 pandemic.
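The pipeline the abstract outlines, an automatic text generator feeding a text-to-image synthesis network, can be sketched as follows. This is a minimal illustration only: the function names (`generate_story`, `illustrate_story`) and the stub "models" are assumptions for the sketch, standing in for the paper's pretrained Arabic text generator and image synthesis network.

```python
# Minimal sketch of the two-stage pipeline: (1) generate a short story from
# a seed prompt, then (2) synthesize one image per sentence. Both models
# below are hypothetical stubs, not the paper's actual implementation.
from typing import Callable, List, Tuple


def generate_story(prompt: str, generate: Callable[[str], str]) -> List[str]:
    """Run the text generator and split the output into sentences."""
    text = generate(prompt)
    # Naive split on '.'; real Arabic text would also need '؟' and '!'.
    return [s.strip() for s in text.split(".") if s.strip()]


def illustrate_story(sentences: List[str],
                     synthesize: Callable[[str], str]) -> List[Tuple[str, str]]:
    """Pair each sentence with an image produced by the synthesis network."""
    return [(s, synthesize(s)) for s in sentences]


# Stub "models" so the sketch runs end to end without any trained network.
story_model = lambda p: p + " The cat found a key. The key opened a garden."
image_model = lambda s: f"<image depicting: {s}>"

pages = illustrate_story(generate_story("Once upon a time.", story_model),
                         image_model)
for sentence, image in pages:
    print(sentence, "->", image)
```

In the actual system, the stubs would be replaced by calls to the trained language model and the text-to-image network, keeping the same sentence-by-sentence pairing of text and images.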
Index Terms
- A Generative Approach to Enrich Arabic Story Text with Visual Aids