Abstract
In this paper, drawing on the recent outcomes of two shared tasks on structured data verbalisation, and examining one system in particular, we present some evidence as to why grammar-based systems are particularly relevant for the verbalisation of structured data as found in the Semantic Web. We then define possible future lines of research, centered on the FORGe system and the linguistic grounding of Semantic Web datasets.
Notes
- 1.
- 2.
- 3.
- 4.
- 5. This information appears in the infobox of the corresponding Wikipedia page: https://en.wikipedia.org/wiki/Arr%C3%B2s_negre.
- 6.
- 7. One of the first papers mentioning multiple datasets was published in 2012 [10].
- 8. It took about two hours to adapt FORGe to a hundred new DBpedia properties.
- 9. See also [6] for an overview of models to represent linked data and their issues.
References
Androutsopoulos, I., Lampouras, G., Galanis, D.: Generating natural language descriptions from OWL ontologies: the NaturalOWL system. J. Artif. Intell. Res. 48, 671–715 (2013)
Banerjee, S., Lavie, A.: METEOR: an automatic metric for MT evaluation with improved correlation with human judgments. In: Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization, pp. 65–72 (2005)
Belz, A.: Automatic generation of weather forecast texts using comprehensive probabilistic generation-space models. J. Nat. Lang. Eng. 14(4), 431–455 (2008)
Belz, A., White, M., Espinosa, D., Kow, E., Hogan, D., Stent, A.: The first surface realisation shared task: overview and evaluation results. In: Proceedings of the Generation Challenges Session at the 13th European Workshop on Natural Language Generation (ENLG), Nancy, France, pp. 217–226 (2011)
Bontcheva, K., Wilks, Y.: Automatic report generation from ontologies: The MIAKT approach. In: Meziane, F., Métais, E. (eds.) NLDB 2004. LNCS, vol. 3136, pp. 324–335. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-27779-8_28
Bosque-Gil, J., Gracia, J., Montiel-Ponsoda, E., Gómez-Pérez, A.: Models to represent linguistic linked data. Nat. Lang. Eng. 24(6), 811–859 (2018)
Bouayad-Agha, N., Casamayor, G., Mille, S., Wanner, L.: Perspective-oriented generation of football match summaries: old tasks, new challenges. ACM Trans. Speech Lang. Process. 9(2), 3:1–3:31 (2012)
Bouayad-Agha, N., Casamayor, G., Wanner, L.: Natural language generation in the context of the semantic web. Semant. Web 5(6), 493–513 (2014)
Corcoglioniti, F., Rospocher, M., Aprosio, A.P., Tonelli, S.: PreMON: a lemon extension for exposing predicate models as linked data. In: Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC), pp. 877–884 (2016)
Dannélls, D., Damova, M., Enache, R., Chechev, M.: Multilingual online generation from semantic web ontologies. In: Proceedings of the 21st International Conference on World Wide Web, pp. 239–242. ACM (2012)
Elder, H., Gehrmann, S., O’Connor, A., Liu, Q.: E2E NLG challenge submission: towards controllable generation of diverse natural language. In: Proceedings of the 11th International Conference on Natural Language Generation, pp. 457–462 (2018)
Fillmore, C.J., Baker, C.F., Sato, H.: The FrameNet database and software tools. In: Proceedings of the 3rd International Conference on Language Resources and Evaluation (LREC), Las Palmas, Canary Islands, Spain, pp. 1157–1160 (2002)
Galanis, D., Androutsopoulos, I.: Generating multilingual descriptions from linguistically annotated OWL ontologies: the NaturalOWL system. In: Proceedings of the Eleventh European Workshop on Natural Language Generation, pp. 143–146. Association for Computational Linguistics (2007)
Gardent, C., Shimorina, A., Narayan, S., Perez-Beltrachini, L.: Creating training corpora for micro-planners. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, Vancouver, Canada, August 2017
Gardent, C., Shimorina, A., Narayan, S., Perez-Beltrachini, L.: The WebNLG challenge: generating text from RDF data. In: Proceedings of the 10th International Conference on Natural Language Generation, pp. 124–133 (2017)
Gatt, A., Krahmer, E.: Survey of the state of the art in natural language generation: core tasks, applications and evaluation. J. Artif. Intell. Res. 61, 65–170 (2018)
Höffner, K., Walter, S., Marx, E., Usbeck, R., Lehmann, J., Ngonga Ngomo, A.C.: Survey on challenges of question answering in the semantic web. Semant. Web 8(6), 895–920 (2017)
Kingsbury, P., Palmer, M.: From TreeBank to PropBank. In: Proceedings of the 3rd International Conference on Language Resources and Evaluation (LREC), Las Palmas, Canary Islands, Spain, pp. 1989–1993 (2002)
Kwok, C., Etzioni, O., Weld, D.S.: Scaling question answering to the web. ACM Trans. Inf. Syst. (TOIS) 19(3), 242–262 (2001)
Lareau, F., Lambrey, F., Dubinskaite, I., Galarreta-Piquette, D., Nejat, M.: GenDR: a generic deep realizer with complex lexicalization. In: Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC), Miyazaki, Japan, pp. 3018–3025 (2018)
Mel’čuk, I.: Dependency Syntax: Theory and Practice. State University of New York Press, Albany (1988)
Meyers, A., et al.: The NomBank project: an interim report. In: Proceedings of the Workshop on Frontiers in Corpus Annotation, Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT/NAACL), Boston, MA, USA, pp. 24–31 (2004)
Mille, S., Belz, A., Bohnet, B., Graham, Y., Pitler, E., Wanner, L.: The first multilingual surface realisation shared task (SR 2018): overview and evaluation results. In: Proceedings of the 1st Workshop on Multilingual Surface Realisation (MSR), 56th Annual Meeting of the Association for Computational Linguistics (ACL), Melbourne, Australia, pp. 1–12 (2018)
Mille, S., Carlini, R., Burga, A., Wanner, L.: FORGe at SemEval-2017 task 9: deep sentence generation based on a sequence of graph transducers. In: Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), Vancouver, Canada, pp. 917–920. Association for Computational Linguistics, August 2017. http://www.aclweb.org/anthology/S17-2158
Mille, S., Wanner, L.: Towards large-coverage detailed lexical resources for data-to-text generation. In: Proceedings of the First International Workshop on Data-to-text Generation, Edinburgh, Scotland (2015)
Nayak, N., Hakkani-Tür, D., Walker, M.A., Heck, L.P.: To plan or not to plan? Discourse planning in slot-value informed sequence to sequence models for language generation. In: Proceedings of INTERSPEECH, Stockholm, Sweden, pp. 3339–3343 (2017)
Novikova, J., Dušek, O., Rieser, V.: The E2E dataset: new challenges for end-to-end generation. In: Proceedings of the 18th Annual Meeting of the Special Interest Group on Discourse and Dialogue, Saarbrücken, Germany (2017). https://arxiv.org/abs/1706.09254, arXiv:1706.09254
Papineni, K., Roukos, S., Ward, T., Zhu, W.J.: BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, pp. 311–318. Association for Computational Linguistics (2002)
Perez-Beltrachini, L., Gardent, C.: Learning embeddings to lexicalise RDF properties. In: *SEM 2016, The Fifth Joint Conference on Lexical and Computational Semantics, pp. 219–228 (2016)
Rambow, O., Korelsky, T.: Applied text generation. In: Proceedings of the 3rd Conference on Applied Natural Language Processing (ANLP), Trento, Italy, pp. 40–47 (1992)
Schuler, K.K.: VerbNet: a broad-coverage, comprehensive verb lexicon. Ph.D. thesis, University of Pennsylvania (2005)
Shimorina, A., Gardent, C., Narayan, S., Perez-Beltrachini, L.: The WebNLG challenge: report on human evaluation. Technical report, Université de Lorraine, Nancy, France (2017)
Stevens, R., Malone, J., Williams, S., Power, R., Third, A.: Automating generation of textual class definitions from OWL to English. J. Biomed. Semant. 2, S5 (2011). BioMed Central
Walter, S., Unger, C., Cimiano, P.: M-ATOLL: a framework for the lexicalization of ontologies in multiple languages. In: Mika, P., et al. (eds.) ISWC 2014. LNCS, vol. 8796, pp. 472–486. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-11964-9_30
Wanner, L., Bohnet, B., Bouayad-Agha, N., Lareau, F., Nicklaß, D.: MARQUIS: generation of user-tailored multilingual air quality bulletins. Appl. Artif. Intell. 24(10), 914–952 (2010)
Acknowledgements
The work reported in this paper has been partly supported by the European Commission in the framework of the H2020 Programme under the contract numbers 700475-IA, 700024-RIA, 779962-RIA, 786731-RIA and 825079-ICT-STARTS.
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Mille, S. (2019). Selected Challenges in Grammar-Based Text Generation from the Semantic Web. In: Osipov, G., Panov, A., Yakovlev, K. (eds) Artificial Intelligence. Lecture Notes in Computer Science, vol 11866. Springer, Cham. https://doi.org/10.1007/978-3-030-33274-7_5
DOI: https://doi.org/10.1007/978-3-030-33274-7_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33273-0
Online ISBN: 978-3-030-33274-7
eBook Packages: Computer Science, Computer Science (R0)