HanaNLG: A Flexible Hybrid Approach for Natural Language Generation

Barros, Cristina; Lloret, Elena

doi:10.1007/978-3-031-24340-0_38

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13452)

Included in the following conference series:

International Conference on Computational Linguistics and Intelligent Text Processing

Abstract

With advances in digital technologies, interaction between computers and humans has become essential. The area of Natural Language Generation (NLG) can provide techniques that facilitate and improve this type of interaction. However, existing approaches in this field are usually developed ad hoc for specific tasks, purposes and domains, which hinders the advancement of flexible and adaptable multi-domain NLG systems. Under these premises, the objective of this paper is to present HanaNLG, a hybrid, generic NLG approach focused on the surface realisation stage. HanaNLG combines statistical and knowledge-based techniques and is able to generate text independently of the domain. In particular, it exploits language models in conjunction with semantic knowledge, which makes the whole generation process more flexible and minimises the high cost of developing common NLG components such as grammars. From this joint perspective, our approach contributes to advancing the NLG field by providing greater flexibility in (i) producing text for different domains, and (ii) increasing the variety of vocabulary that appears in the generated text. To assess the effectiveness of HanaNLG, it was tested in two domains: (i) NLG for assistive technologies, and (ii) NLG for creating opinionated sentences. The positive results obtained (almost 99% of the generated sentences in both domains are original and well constructed) show that our approach is capable of generating text for different domains. More importantly, combining language models with semantic knowledge enhances the quality of the generated text, improving on the results of other methods that rely only on statistical techniques.
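The general idea of pairing a language model with semantic knowledge can be illustrated with a toy example. The following is a minimal sketch and not the HanaNLG algorithm itself, whose details are not given in the abstract: it assumes an add-one-smoothed bigram model as the statistical component and a hand-written synonym table as a stand-in for a semantic resource. The corpus, the synonym table and all function names are hypothetical.

```python
import math
from collections import Counter
from itertools import product

# Toy training corpus (illustrative only, not taken from the paper).
CORPUS = [
    "the hotel was great",
    "the room was good",
    "the food was great",
    "the staff was nice",
]

# Hypothetical synonym sets standing in for a semantic resource.
SYNONYMS = {
    "great": ["great", "good", "nice"],
    "hotel": ["hotel", "room"],
}


def train_bigram_model(sentences):
    """Count context unigrams and bigrams, with sentence boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens[:-1])  # every token that acts as a context
        bigrams.update(zip(tokens, tokens[1:]))
    vocab_size = len(set(unigrams) | {"</s>"})
    return unigrams, bigrams, vocab_size


def log_prob(sentence, model):
    """Log-probability of a sentence under add-one-smoothed bigrams."""
    unigrams, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab_size))
        for a, b in zip(tokens, tokens[1:])
    )


def generate_variants(sentence, synonyms):
    """Expand each token into its synonym set and enumerate combinations."""
    options = [synonyms.get(token, [token]) for token in sentence.split()]
    return [" ".join(choice) for choice in product(*options)]


def rank_variants(sentence, synonyms, model):
    """Order the lexical variants from most to least probable."""
    variants = generate_variants(sentence, synonyms)
    return sorted(variants, key=lambda s: log_prob(s, model), reverse=True)


MODEL = train_bigram_model(CORPUS)
RANKED = rank_variants("the hotel was great", SYNONYMS, MODEL)
```

Here the synonym table widens the vocabulary available to the generator, while the language model filters the candidates by fluency: in this toy corpus the variants ending in "great" rank first because "was great" is the most frequent bigram.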

Notes

1. http://openccg.sourceforge.net/.
2. A set of synonyms used in WordNet that are related to a term.
3. http://projects.csail.mit.edu/jverbnet/.
4. https://freestoriesforkids.com/.
5. http://hca.gilead.org.il/.

Acknowledgment

This research has been partially funded by the Generalitat Valenciana through the project “SIIA: Tecnologías del lenguaje humano para una sociedad inclusiva, igualitaria, y accesible” (PROMETEU/2018/089).

Author information

Authors and Affiliations

Department of Software and Computing Systems, University of Alicante, Apdo. de Correos 99, E-03080, Alicante, Spain
Cristina Barros & Elena Lloret

Corresponding author

Correspondence to Cristina Barros.

Editor information

Editors and Affiliations

Instituto Politécnico Nacional, Mexico City, Mexico
Alexander Gelbukh

Cite this paper

Barros, C., Lloret, E. (2023). HanaNLG: A Flexible Hybrid Approach for Natural Language Generation. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2019. Lecture Notes in Computer Science, vol 13452. Springer, Cham. https://doi.org/10.1007/978-3-031-24340-0_38

DOI: https://doi.org/10.1007/978-3-031-24340-0_38
Published: 26 February 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-24339-4
Online ISBN: 978-3-031-24340-0
eBook Packages: Computer Science, Computer Science (R0)
