DOI: 10.1145/3512716.3512725
research-article

A Generative Approach to Enrich Arabic Story Text with Visual Aids

Published: 07 March 2022

ABSTRACT

Enriching the script of a story with visual aids is an effective way to promote language learning and literacy development in young children and other learners. In this paper, we propose a new system that generates short Arabic stories together with generated images that accurately represent the story, scene, and context of the given input. We combine a text generation technique with a text-to-image synthesis network and minimize human intervention. We also build a corpus of Arabic stories with vocabulary and visualizations. Results obtained with various generative models for creating text-image content show the effectiveness of the proposed approach. The system can be used in education to help instructors build stories in different domains, and in distance learning to deliver online tutorials during COVID-19.

Published in

ICSIE '21: Proceedings of the 10th International Conference on Software and Information Engineering
November 2021
62 pages
ISBN: 9781450384315
DOI: 10.1145/3512716

Copyright © 2021 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
