HanaNLG: A Flexible Hybrid Approach for Natural Language Generation

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2019)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13452)

Abstract

With advances in digital technologies, interaction between computers and humans has become essential. Natural Language Generation (NLG) can provide techniques that facilitate and improve this type of interaction. However, existing approaches in this field are usually developed ad hoc for specific tasks, purposes and domains, which hinders the advancement of flexible and adaptable multi-domain NLG systems. Under these premises, the objective of this paper is to present HanaNLG, a hybrid generic NLG approach focused on the surface realisation stage. HanaNLG combines statistical and knowledge-based techniques and is able to generate text independently of the domain. In particular, it exploits language models in conjunction with semantic knowledge, which gives flexibility to the whole generation process and thus minimises the high cost associated with developing common elements involved in NLG, such as grammars. Taking this joint perspective into account, our approach contributes to advancing the NLG field by providing greater flexibility when it comes to (i) producing text for different domains, and (ii) increasing the variety of vocabulary that appears in the generated text. To assess the effectiveness of HanaNLG, it was tested in two domains: (i) NLG for assistive technologies, and (ii) NLG for creating opinionated sentences. The positive results obtained (almost 99% of the generated sentences for both domains are original and well constructed) show that our approach is capable of generating text for different domains. More importantly, the combination of language models with semantic knowledge enhances the quality of the generated text, thereby improving on the results obtained by other methods that rely only on statistical techniques.
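As a rough illustration of the general idea of pairing a language model with semantic knowledge, the sketch below expands each word of an input sentence with synonyms and keeps the variant that an add-one-smoothed bigram model scores highest. It is not HanaNLG's implementation: the toy corpus, the hand-written `SYNSETS` list (a stand-in for a lexical resource such as WordNet) and the function names `train_bigrams`, `log_prob`, `expand` and `realise` are all assumptions made for this example.

```python
import math
from collections import Counter

# Hypothetical toy corpus; a real system would train on a domain corpus.
CORPUS = [
    "the little girl sings a happy song",
    "the small boy sings a song",
    "the little boy reads a happy story",
]

# Hypothetical hand-written synonym sets, standing in for a lexical resource.
SYNSETS = [
    {"little", "small", "tiny"},
    {"happy", "glad", "cheerful"},
]


def train_bigrams(sentences):
    """Count context unigrams and bigrams, with sentence boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    vocab = {"<s>", "</s>"}
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens)
        unigrams.update(tokens[:-1])          # every token that starts a bigram
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams, len(vocab)


def log_prob(sentence, model):
    """Log-probability of a sentence under an add-one-smoothed bigram model."""
    unigrams, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab_size))
        for a, b in zip(tokens, tokens[1:])
    )


def synonyms(word):
    """Return the synonym set containing the word, or the word on its own."""
    for synset in SYNSETS:
        if word in synset:
            return sorted(synset)
    return [word]


def expand(sentence):
    """Generate every variant obtained by swapping words for their synonyms."""
    variants = [[]]
    for word in sentence.split():
        variants = [v + [option] for v in variants for option in synonyms(word)]
    return [" ".join(v) for v in variants]


def realise(sentence, model):
    """Pick the variant that the language model considers most likely."""
    return max(expand(sentence), key=lambda s: log_prob(s, model))


MODEL = train_bigrams(CORPUS)
print(realise("the tiny boy reads a glad song", MODEL))
```

With this toy data, the input "the tiny boy reads a glad song" yields nine variants, and the one selected is "the little boy reads a happy song", because "little" and "happy" are the synonyms best supported by the corpus bigrams. The semantic resource widens the vocabulary and the language model filters the candidates; the division of labour in the actual system may differ.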

Notes

  1. http://openccg.sourceforge.net/.

  2. A set of synonyms used in WordNet that are related to a term.

  3. http://projects.csail.mit.edu/jverbnet/.

  4. https://freestoriesforkids.com/.

  5. http://hca.gilead.org.il/.

Acknowledgment

This research has been partially funded by the Generalitat Valenciana through the project “SIIA: Tecnologías del lenguaje humano para una sociedad inclusiva, igualitaria, y accesible” (PROMETEU/2018/089).

Author information

Corresponding author

Correspondence to Cristina Barros.

Copyright information

© 2023 Springer Nature Switzerland AG

About this paper

Cite this paper

Barros, C., Lloret, E. (2023). HanaNLG: A Flexible Hybrid Approach for Natural Language Generation. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2019. Lecture Notes in Computer Science, vol 13452. Springer, Cham. https://doi.org/10.1007/978-3-031-24340-0_38

  • DOI: https://doi.org/10.1007/978-3-031-24340-0_38

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-24339-4

  • Online ISBN: 978-3-031-24340-0

  • eBook Packages: Computer Science, Computer Science (R0)
