Abstract
Text generation, especially poetry synthesis, is a promising and challenging AI task. We explore it using LSTM and word2vec methods, training on word sequences of different lengths in both the forward and backward directions. Two datasets of Arabic poems were used and preprocessed for unification and cleaning. Because the full data required very large amounts of memory and processing, we trained on subsets of the datasets; this affected our experiments, since the model was trained on less data. We also implemented support for a user-supplied keyword. We found that models trained on shorter sequences generated more meaningful text, whereas models trained on longer sequences favored the most frequent words, repeated text, and used short words. The best predicted sentences were selected by multiplying the conditional probabilities of their words; this avoids the local maxima that a greedy method, which chooses only the best next word at each step, can get stuck in. Moreover, the AraVec word2vec model was not very helpful, since it returned synonyms much more often than related words. Many enhancements are possible in future work, such as adding Arabic prosody constraints and overcoming the hardware limitations.
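The difference between greedy next-word selection and scoring a whole sentence by the product of its word probabilities can be illustrated with a minimal sketch. The table `COND_PROB`, the placeholder tokens, and the function names below are hypothetical and are not taken from the paper; in a real system the conditional probabilities would presumably come from the trained LSTM's output distribution, and the exhaustive enumeration would be replaced by something cheaper, such as scoring a limited set of candidate sentences.

```python
import math

# Hypothetical toy distribution P(next word | previous word).
# In practice these values would come from a trained language model.
COND_PROB = {
    "<s>": {"A": 0.6, "B": 0.4},
    "A": {"C": 0.55, "D": 0.45},
    "B": {"E": 0.9, "F": 0.1},
}


def sequence_log_prob(words, start="<s>"):
    """Sum of log conditional probabilities, i.e. the log of their product."""
    total, prev = 0.0, start
    for word in words:
        p = COND_PROB.get(prev, {}).get(word, 0.0)
        if p == 0.0:
            return float("-inf")  # impossible continuation
        total += math.log(p)
        prev = word
    return total


def greedy_decode(length, start="<s>"):
    """Pick the single most probable next word at every step."""
    prev, out = start, []
    for _ in range(length):
        options = COND_PROB.get(prev)
        if not options:
            break
        prev = max(options, key=options.get)
        out.append(prev)
    return out


def candidates(length, start="<s>"):
    """Enumerate every sequence of up to `length` words (toy-sized only)."""
    def expand(prefix, prev, remaining):
        if remaining == 0 or prev not in COND_PROB:
            yield prefix
            return
        for word in COND_PROB[prev]:
            yield from expand(prefix + [word], word, remaining - 1)
    return expand([], start, length)


def best_sequence(length, start="<s>"):
    """Choose the candidate whose product of word probabilities is highest."""
    return max(candidates(length, start), key=sequence_log_prob)


if __name__ == "__main__":
    for name, seq in (("greedy", greedy_decode(2)), ("best", best_sequence(2))):
        print(name, seq, round(math.exp(sequence_log_prob(seq)), 4))
```

In this toy table the greedy path starts with the locally best word `A` and ends with a lower overall probability than the path through `B`, which is the kind of local maximum that whole-sentence scoring avoids. Log probabilities are summed instead of multiplying raw probabilities only to avoid numerical underflow on longer sentences; the ranking is the same.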
Acknowledgment
This work is a part of a project undertaken at the British University in Dubai.
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Hejazi, H.D., Khamees, A.A., Alshurideh, M., Salloum, S.A. (2021). Arabic Text Generation: Deep Learning for Poetry Synthesis. In: Hassanien, AE., Chang, KC., Mincong, T. (eds) Advanced Machine Learning Technologies and Applications. AMLTA 2021. Advances in Intelligent Systems and Computing, vol 1339. Springer, Cham. https://doi.org/10.1007/978-3-030-69717-4_11
DOI: https://doi.org/10.1007/978-3-030-69717-4_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-69716-7
Online ISBN: 978-3-030-69717-4
eBook Packages: Intelligent Technologies and Robotics (R0)