abstract

Enabling entity retrieval by exploiting wikipedia as a semantic knowledge source

Author:
Sofia J. Athenikos

Drexel University, Philadelphia, PA


ACM SIGIR Forum, Volume 46, Issue 1, June 2012, pp 80, https://doi.org/10.1145/2215676.2215687

Published: 20 May 2012

Abstract

This dissertation research, PanAnthropon FilmWorld, aims to demonstrate direct retrieval of entities and related facts by exploiting Wikipedia as a semantic knowledge source, with film as the proof-of-concept application domain. To this end, a semantic knowledge base for the film domain has been constructed from data extracted or derived from 10,640 Wikipedia pages on films and additional pages on film awards. The knowledge base currently contains 209,266 entities and 2,345,931 entity-centric facts. Both the knowledge base and the corresponding semantic search interface are based on a coherent classification of entities, and entity-centric facts are consistently represented as <entity, attribute, value, note> tuples. The semantic search interface (http://dlib.ischool.drexel.edu:8080/sofia/PA/) supports multiple types of semantic search that go beyond traditional keyword-based search. The main one, the General Entity Retrieval Query (GERQ) function, retrieves all entities that match a specified entity type, subtype, and set of semantic conditions, and thus corresponds to the main research problem. Two evaluations have been performed, assessing (1) the quality of information extraction and (2) the effectiveness of information retrieval using the semantic interface. The first inspected 11,495 film-centric facts concerning 100 films; the results confirmed high data quality, with 99.96% average precision and 99.84% average recall. The second was an experiment with human subjects, who performed a retrieval task using both the PanAnthropon interface and the Internet Movie Database (IMDb) interface so that their task performance could be compared between the two. The results confirmed higher effectiveness of the PanAnthropon interface over the IMDb interface (83.11% vs. 40.78% average precision; 83.55% vs. 40.26% average recall). Moreover, the subjects' responses to the post-task questionnaire indicate that they found the PanAnthropon interface highly usable and easily understandable as well as highly effective. The main contribution of this research therefore consists in achieving its stated goal: demonstrating the utility and feasibility of semantics-based direct entity retrieval.
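
To make the tuple representation and the GERQ idea concrete, the following is a minimal sketch in Python, not the PanAnthropon implementation. The abstract only states that facts are <entity, attribute, value, note> tuples and that a GERQ matches on entity type, subtype, and semantic conditions; everything else here is an assumption for illustration: the class names `Entity` and `Fact`, the function `gerq`, the reading of a "semantic condition" as an exact (attribute, value) pair, the use of `note` as a free-text qualifier, and all sample data ("Film A", "Person X", "Award Y" are placeholders, not real records).

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Entity:
    """A classified entity (hypothetical schema: name, type, subtype)."""
    name: str
    etype: str
    subtype: str


@dataclass(frozen=True)
class Fact:
    """One <entity, attribute, value, note> tuple."""
    entity: str
    attribute: str
    value: str
    note: str = ""  # assumed here to hold an optional qualifier


def gerq(
    entities: Iterable[Entity],
    facts: Iterable[Fact],
    etype: str,
    subtype: str,
    conditions: Iterable[Tuple[str, str]],
) -> List[str]:
    """Return the sorted names of entities of the given type and subtype
    whose facts satisfy every (attribute, value) condition."""
    # Index facts by entity so each condition check is a set lookup.
    index: Dict[str, Set[Tuple[str, str]]] = {}
    for fact in facts:
        index.setdefault(fact.entity, set()).add((fact.attribute, fact.value))

    required = list(conditions)
    return sorted(
        e.name
        for e in entities
        if e.etype == etype
        and e.subtype == subtype
        and all(cond in index.get(e.name, set()) for cond in required)
    )


# Placeholder data, invented purely for illustration.
ENTITIES = [
    Entity("Film A", "Film", "Feature"),
    Entity("Film B", "Film", "Feature"),
    Entity("Film C", "Film", "Short"),
    Entity("Person X", "Person", "Director"),
]
FACTS = [
    Fact("Film A", "director", "Person X"),
    Fact("Film A", "award", "Award Y", note="won"),
    Fact("Film B", "director", "Person X"),
    Fact("Film C", "director", "Person X"),
    Fact("Film C", "award", "Award Y", note="nominated"),
]

if __name__ == "__main__":
    # Feature films directed by Person X that are associated with Award Y.
    print(gerq(ENTITIES, FACTS, "Film", "Feature",
               [("director", "Person X"), ("award", "Award Y")]))
```

The query returns entities directly rather than pages to read, which is the contrast with keyword-based search that the abstract draws. A fuller version would presumably also let conditions inspect the `note` field or use comparisons other than equality; the abstract does not say how the actual interface handles this.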

Index Terms

1. Information systems
  1. Information retrieval

Index terms have been assigned to the content through auto-classification.

Recommendations

Enabling type/condition-specified entity/fact retrieval using semantic knowledge extracted from wikipedia
SMER '11: Proceedings of the 1st international workshop on Search and mining entity-relationship data

Wikipedia has recently become an important semantic knowledge resource, thanks to its semi-structured semantic features and the huge amount of user-generated content covering a wide range of topics. The mode of information retrieval on Wikipedia, as on ...
Named entity disambiguation by leveraging wikipedia semantic knowledge
CIKM '09: Proceedings of the 18th ACM conference on Information and knowledge management

Name ambiguity problem has raised an urgent demand for efficient, high-quality named entity disambiguation methods. The key problem of named entity disambiguation is to measure the similarity between occurrences of names. The traditional methods measure ...


Published in

ACM SIGIR Forum Volume 46, Issue 1
June 2012
90 pages
ISSN:0163-5840
DOI:10.1145/2215676

Copyright © 2012 Author
Publisher
Association for Computing Machinery
New York, NY, United States
Qualifiers
- abstract