The Role of Linked Data in Content Selection

Perera, Rivindu; Nand, Parma

doi:10.1007/978-3-319-13560-1_46

Rivindu Perera &
Parma Nand

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8862)

Included in the following conference series:

Pacific Rim International Conference on Artificial Intelligence

Abstract

This paper explores the suitability of Linked Data as a knowledge source for content selection. Content selection is a crucial subtask of Natural Language Generation that determines which contents of a knowledge source are relevant to a communicative goal. The online era has made it possible to accumulate extensive generic knowledge, some of which is available as structured knowledge sources for computational natural language processing. This paper proposes a content selection model built on a generic structured knowledge source, DBpedia, which is a structured replica of its unstructured counterpart, Wikipedia. The proposed model uses log likelihood to rank contents from DBpedia Linked Data by their relevance to a communicative goal. We performed experiments with DBpedia as the Linked Data resource and two keyword datasets as communicative goals: keywords extracted from the QALD-2 training dataset were used to optimize parameters, and the QALD-2 testing dataset was used for testing. The results were evaluated against a verbatim-based selection strategy and showed that our model can perform 18.03% better than verbatim selection.
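The abstract does not state the exact log likelihood formulation, so the following is only a minimal sketch of how such a ranking could work, assuming the standard corpus-comparison log-likelihood statistic: a term's frequency in a goal-related text is compared with its frequency in a general reference text. The function names `log_likelihood` and `rank_terms`, the token lists, and the filter that keeps only over-represented terms are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Corpus-comparison log-likelihood for a single term.

    a, b: frequency of the term in the domain and reference corpora.
    c, d: total token counts of the domain and reference corpora.
    (Assumed formulation; the abstract does not give the formula.)
    """
    if c <= 0 or d <= 0:
        raise ValueError("both corpora must be non-empty")
    total = c + d
    expected_domain = c * (a + b) / total
    expected_reference = d * (a + b) / total
    ll = 0.0
    # By convention a zero count contributes nothing to the sum.
    if a > 0:
        ll += a * math.log(a / expected_domain)
    if b > 0:
        ll += b * math.log(b / expected_reference)
    return 2.0 * ll


def rank_terms(domain_tokens, reference_tokens):
    """Rank terms that are over-represented in the domain corpus."""
    domain = Counter(domain_tokens)
    reference = Counter(reference_tokens)
    c = sum(domain.values())
    d = sum(reference.values())
    scores = {}
    for term, a in domain.items():
        b = reference.get(term, 0)
        # Keep only terms relatively more frequent in the domain corpus.
        if a / c > b / d:
            scores[term] = log_likelihood(a, b, c, d)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy data standing in for goal-related and general text.
    domain = ["city", "city", "population", "the"]
    reference = ["the", "the", "the", "of"]
    for term, score in rank_terms(domain, reference):
        print(f"{term}: {score:.3f}")
```

In a content selection setting, the scored terms could then be matched against the predicates or values of DBpedia triples so that higher-scoring triples are selected first; that matching step is outside this sketch.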

Author information

Authors and Affiliations

School of Computer and Mathematical Sciences, Auckland University of Technology, Auckland, 1010, New Zealand
Rivindu Perera & Parma Nand

Editor information

Editors and Affiliations

MIMOS Berhad Technology Park Malaysia, 57000, Bukit Jalil, KL, Malaysia
Duc-Nghia Pham
Kyungpook National University, Sankyuk-Dong, Buk-Gu, 702-701, Daegu, Korea
Seong-Bae Park

About this paper

Cite this paper

Perera, R., Nand, P. (2014). The Role of Linked Data in Content Selection. In: Pham, DN., Park, SB. (eds) PRICAI 2014: Trends in Artificial Intelligence. PRICAI 2014. Lecture Notes in Computer Science, vol 8862. Springer, Cham. https://doi.org/10.1007/978-3-319-13560-1_46

DOI: https://doi.org/10.1007/978-3-319-13560-1_46
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13559-5
Online ISBN: 978-3-319-13560-1
eBook Packages: Computer Science, Computer Science (R0)
