DOI: 10.1145/1871985.1871991
research-article

Entity-relationship queries over wikipedia

Published: 30 October 2010 Publication History

Abstract

Wikipedia is the largest user-generated knowledge base. We propose a structured query mechanism, the entity-relationship query, for searching entities in the Wikipedia corpus by their properties and inter-relationships. An entity-relationship query consists of an arbitrary number of predicates on desired entities, and the semantics of each predicate is specified with keywords. Entity-relationship queries search for entities directly over text rather than over pre-extracted structured data stores. This characteristic brings two benefits: (1) query semantics can be expressed intuitively with keywords; (2) it avoids the information loss that happens during extraction. We present a ranking framework for general entity-relationship queries and a position-based Bounded Cumulative Model for accurate ranking of query answers. Experiments on INEX benchmark queries and our own crafted queries show the effectiveness and accuracy of our ranking method.
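
The query model described in the abstract can be pictured with the following minimal Python sketch. It is an illustration under stated assumptions, not the authors' system: the names Predicate, Sentence, supports, and answer, the toy corpus, and the entity names are all hypothetical. The score is a plain count of supporting sentences, used only as a stand-in; it is not the Bounded Cumulative Model, whose details the abstract does not give.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Predicate:
    """A keyword-specified predicate over one or more entity variables."""
    variables: tuple      # ("x",) for a property, ("x", "y") for a relationship
    keywords: frozenset   # keywords that express the predicate's semantics


@dataclass(frozen=True)
class Sentence:
    """A sentence of text with the entities annotated in it (assumed given)."""
    tokens: frozenset     # lowercased words of the sentence
    entities: frozenset   # names of entities mentioned in the sentence


def supports(sentence, predicate, binding):
    """True if the sentence mentions the bound entities and all keywords."""
    bound = {binding[v] for v in predicate.variables}
    return bound <= sentence.entities and predicate.keywords <= sentence.tokens


def answer(query, corpus, candidates):
    """Enumerate variable bindings that satisfy every predicate in the query.

    Each answer is scored by its total number of supporting sentences.
    This count is a placeholder score chosen for the sketch only.
    """
    variables = sorted({v for p in query for v in p.variables})
    results = []
    for combo in product(candidates, repeat=len(variables)):
        if len(set(combo)) < len(combo):
            continue  # assumption: distinct variables bind distinct entities
        binding = dict(zip(variables, combo))
        counts = [sum(supports(s, p, binding) for s in corpus) for p in query]
        if all(counts):  # every predicate needs at least one supporting sentence
            results.append((binding, sum(counts)))
    return sorted(results, key=lambda r: -r[1])


# Toy data with made-up entities, evaluated directly over text.
corpus = [
    Sentence(frozenset({"acmecorp", "is", "a", "software", "company"}),
             frozenset({"AcmeCorp"})),
    Sentence(frozenset({"jane", "roe", "founded", "acmecorp"}),
             frozenset({"AcmeCorp", "Jane Roe"})),
    Sentence(frozenset({"john", "doe", "joined", "acmecorp"}),
             frozenset({"AcmeCorp", "John Doe"})),
]
query = [
    Predicate(("x",), frozenset({"software", "company"})),  # x is a software company
    Predicate(("x", "y"), frozenset({"founded"})),          # y founded x
]
candidates = ["AcmeCorp", "Jane Roe", "John Doe"]

results = answer(query, corpus, candidates)
print(results)
```

In this toy run the only binding that satisfies both predicates is x = AcmeCorp, y = Jane Roe. The brute-force enumeration over candidate entities is chosen for readability; a real system would need indexes to avoid it.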


Cited By

  • (2012) Entity-Relationship Queries over Wikipedia. ACM Transactions on Intelligent Systems and Technology (TIST) 3:4 (1-20). DOI: 10.1145/2337542.2337555. Online publication date: 1-Sep-2012


    Published In

    SMUC '10: Proceedings of the 2nd international workshop on Search and mining user-generated contents
    October 2010
    136 pages
    ISBN:9781450303866
    DOI:10.1145/1871985

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. entity ranking
    2. entity search
    3. structured entity query
    4. wikipedia

    Qualifiers

    • Research-article

    Conference

    CIKM '10

    Acceptance Rates

    SMUC '10 Paper Acceptance Rate 15 of 25 submissions, 60%;
    Overall Acceptance Rate 15 of 25 submissions, 60%
