DOI: 10.1145/2063518.2063519
research-article

DBpedia spotlight: shedding light on the web of documents

Published: 07 September 2011

Abstract

Interlinking text documents with Linked Open Data enables the Web of Data to be used as background knowledge within document-oriented applications such as search and faceted browsing. As a step towards interconnecting the Web of Documents with the Web of Data, we developed DBpedia Spotlight, a system for automatically annotating text documents with DBpedia URIs. DBpedia Spotlight allows users to configure the annotations to their specific needs through the DBpedia Ontology and quality measures such as prominence, topical pertinence, contextual ambiguity and disambiguation confidence. We compare our approach with the state of the art in disambiguation, and evaluate our results in light of three baselines and six publicly available annotation systems, demonstrating the competitiveness of our system. DBpedia Spotlight is shared as open source and deployed as a Web Service freely available for public use.
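
The configuration idea in the abstract can be illustrated with a minimal sketch. It is not the DBpedia Spotlight implementation or its Web Service API: the `Annotation` record, the `filter_annotations` helper, the field names `support` and `confidence`, and all scores below are assumptions made up for illustration. The sketch only shows how annotations could be restricted by ontology class, by a prominence-like score, and by a disambiguation confidence threshold; topical pertinence and contextual ambiguity are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    surface_form: str   # text span that was annotated
    uri: str            # DBpedia resource the span was linked to
    types: frozenset    # ontology classes of the resource (illustrative)
    support: int        # hypothetical prominence score
    confidence: float   # hypothetical disambiguation confidence in [0, 1]


def filter_annotations(annotations, allowed_types=None,
                       min_support=0, min_confidence=0.0):
    """Keep only the annotations that satisfy every configured constraint."""
    allowed = None if allowed_types is None else frozenset(allowed_types)
    kept = []
    for ann in annotations:
        if allowed is not None and not (ann.types & allowed):
            continue  # resource has none of the requested ontology classes
        if ann.support < min_support:
            continue  # resource is not prominent enough
        if ann.confidence < min_confidence:
            continue  # disambiguation is too uncertain
        kept.append(ann)
    return kept


# Made-up candidate annotations, for illustration only.
candidates = [
    Annotation("Berlin", "http://dbpedia.org/resource/Berlin",
               frozenset({"Place", "City"}), support=5000, confidence=0.9),
    Annotation("Washington", "http://dbpedia.org/resource/George_Washington",
               frozenset({"Person"}), support=3000, confidence=0.4),
    Annotation("apple", "http://dbpedia.org/resource/Apple",
               frozenset({"Plant"}), support=50, confidence=0.7),
]

# Restrict annotations to places, or to prominent and confident links.
places = filter_annotations(candidates, allowed_types={"Place"})
confident = filter_annotations(candidates, min_support=100, min_confidence=0.5)
```

Raising the thresholds in such a scheme trades recall for precision: fewer annotations survive, but the ones that do are less likely to be wrong.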

Published In

I-Semantics '11: Proceedings of the 7th International Conference on Semantic Systems
September 2011
129 pages
ISBN:9781450306218
DOI:10.1145/2063518
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. DBpedia
  2. linked data
  3. named entity disambiguation
  4. text annotation

Qualifiers

  • Research-article

Conference

I-Semantics '11

Acceptance Rates

Overall Acceptance Rate 40 of 182 submissions, 22%

Article Metrics

  • Downloads (Last 12 months): 104
  • Downloads (Last 6 weeks): 10
Reflects downloads up to 14 Feb 2025

Cited By

  • (2025) Knowledge Graphs for Representing Knowledge Progression of Students across Heterogeneous Learning Systems. International Journal of Artificial Intelligence in Education. DOI: 10.1007/s40593-024-00434-w. Online publication date: 9-Jan-2025
  • (2025) PARALLAX: Leveraging Polarization Knowledge for Misinformation Detection. Social Networks Analysis and Mining, pages 86-105. DOI: 10.1007/978-3-031-78541-2_6. Online publication date: 24-Jan-2025
  • (2024) The RDF2vec family of knowledge graph embedding methods. Semantic Web, pages 1-32. DOI: 10.3233/SW-233514. Online publication date: 25-Jan-2024
  • (2024) MuHeQA: Zero-shot question answering over multiple and heterogeneous knowledge bases. Semantic Web, 15:5, pages 1547-1561. DOI: 10.3233/SW-233379. Online publication date: 9-Oct-2024
  • (2024) ADOxx: Eine Low-Code-Plattform für die Entwicklung von Modellierungswerkzeugen (ADOxx: A Low-Code Platform for the Development of Modeling Tools). HMD Praxis der Wirtschaftsinformatik, 61:5, pages 1295-1316. DOI: 10.1365/s40702-024-01096-x. Online publication date: 2-Aug-2024
  • (2024) TelarKG: a Knowledge Graph of Chile's Constitutional Process. Proceedings of the 7th Joint Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA), pages 1-5. DOI: 10.1145/3661304.3661899. Online publication date: 14-Jun-2024
  • (2024) DiscipLink: Unfolding Interdisciplinary Information Seeking Process via Human-AI Co-Exploration. Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology, pages 1-20. DOI: 10.1145/3654777.3676366. Online publication date: 13-Oct-2024
  • (2024) On the Opportunities and Challenges of Foundation Models for GeoAI (Vision Paper). ACM Transactions on Spatial Algorithms and Systems, 10:2, pages 1-46. DOI: 10.1145/3653070. Online publication date: 1-Jul-2024
  • (2024) Context-based Entity Recommendation for Knowledge Workers: Establishing a Benchmark on Real-life Data. Proceedings of the 18th ACM Conference on Recommender Systems, pages 654-659. DOI: 10.1145/3640457.3688068. Online publication date: 8-Oct-2024
  • (2024) Learner Modeling and Recommendation of Learning Resources using Personal Knowledge Graphs. Proceedings of the 14th Learning Analytics and Knowledge Conference, pages 273-283. DOI: 10.1145/3636555.3636881. Online publication date: 18-Mar-2024
