Abstract
Texts produced by data-to-text generation models often contain repetitive passages. To raise the quality of the generated texts, we build on a data-to-text generation model with explicit content planning and add coverage mechanisms to both the content-planning and the text-generation stages. In the content-planning stage, a coverage mechanism removes duplicate entries from the content template, so that the generated text no longer contains sentences with the same meaning. In the text-generation stage, a coverage mechanism suppresses repeated words in the output. In addition, to embed the positional associations present in the input data into the word vectors, we add positional encoding to the word embeddings. The resulting word vectors are fed into a pointer network to produce the content template, which is then passed to the text generator to produce the descriptive text. In our experiments, both the accuracy of the content planning and the BLEU score of the generated texts improve, verifying the effectiveness of the proposed data-to-text generation model.
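To make the two additions concrete, the sketch below shows, in PyTorch-style Python, (i) sinusoidal positional encoding added to the input embeddings, and (ii) an additive attention layer augmented with a coverage vector and coverage loss in the spirit of Tu et al. (2016) and See et al. (2017). This is a minimal illustration under assumed shapes; the module and function names (add_positional_encoding, CoverageAttention) are ours for exposition, not the authors' actual implementation.

```python
# Minimal sketch of the two mechanisms described in the abstract.
# Shapes and names are illustrative assumptions, not the paper's code.
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


def add_positional_encoding(emb: torch.Tensor) -> torch.Tensor:
    """Add sinusoidal positional encodings (as in the Transformer) to the
    embeddings; assumes emb is (batch, seq_len, d_model) with even d_model."""
    _, seq_len, d_model = emb.shape
    pos = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)
    div = torch.exp(torch.arange(0, d_model, 2, dtype=torch.float32)
                    * (-math.log(10000.0) / d_model))
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(pos * div)
    pe[:, 1::2] = torch.cos(pos * div)
    return emb + pe.unsqueeze(0)


class CoverageAttention(nn.Module):
    """Additive attention with a coverage vector: the running sum of past
    attention weights is fed back into the score, and a coverage loss
    penalises attending again to already-covered positions."""

    def __init__(self, d_model: int) -> None:
        super().__init__()
        self.w_q = nn.Linear(d_model, d_model, bias=False)
        self.w_k = nn.Linear(d_model, d_model, bias=False)
        self.w_c = nn.Linear(1, d_model, bias=False)
        self.v = nn.Linear(d_model, 1, bias=False)

    def forward(self, query, keys, coverage):
        # query: (batch, d_model); keys: (batch, src_len, d_model);
        # coverage: (batch, src_len), running sum of past attention weights.
        score = self.v(torch.tanh(
            self.w_q(query).unsqueeze(1)
            + self.w_k(keys)
            + self.w_c(coverage.unsqueeze(-1)))).squeeze(-1)
        attn = F.softmax(score, dim=-1)                # (batch, src_len)
        # Coverage loss: overlap between current attention and what has
        # already been attended, to be added to the training objective.
        cov_loss = torch.min(attn, coverage).sum(dim=-1).mean()
        return attn, coverage + attn, cov_loss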
This work was supported by the National Natural Science Foundation of China (61371196) and the National Science and Technology Major Project (2015ZX01040-201).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Wang, M., Cao, J., Yu, X., Nie, Z.: A data-to-text generation model with deduplicated content planning. In: Li, T., et al. (eds.) Big Data. BigData 2022. Communications in Computer and Information Science, vol. 1709. Springer, Singapore (2022). https://doi.org/10.1007/978-981-19-8331-3_6
Print ISBN: 978-981-19-8330-6
Online ISBN: 978-981-19-8331-3