DOI: 10.1145/3077136.3080756
short-paper

RELink: A Research Framework and Test Collection for Entity-Relationship Retrieval

Published: 07 August 2017

Abstract

Progress in entity-relationship (E-R) search techniques has been hampered by a lack of test collections, particularly for complex queries involving multiple entities and relationships. In this paper we describe a method for generating E-R test queries to support comprehensive E-R search experiments. Queries and relevance judgments are created from content that exists in tabular form, where columns represent entity types and the table structure implies one or more relationships among the entities. Editorial work involves creating natural language queries based on the relationships represented by the entries in the table. We have publicly released the RELink test collection, comprising 600 queries and relevance judgments obtained from a sample of Wikipedia List-of-lists-of-lists tables. The relevance judgments comprise tuples of entities that are extracted from table columns and labelled with the corresponding entity types and the relationships they represent. To facilitate research in complex E-R retrieval, we have created and released as open source the RELink Framework, which includes Apache Lucene indexing and search specifically tailored to E-R retrieval. RELink includes entity and relationship indexing based on the ClueWeb-09-B Web collection with FACC1 text span annotations linked to Wikipedia entities. With ready-to-use search resources and a comprehensive test collection, we support the community in pursuing E-R research at scale.
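
The table-to-query step described in the abstract can be illustrated with a minimal Python sketch. It is not the RELink implementation: the `ERQuery` type, the `build_query` function, the toy table, and the query text are all hypothetical and chosen only to show how selected columns could supply entity types while rows supply the entity tuples used as relevance judgments.

```python
from typing import List, NamedTuple, Sequence, Tuple


class ERQuery(NamedTuple):
    """Hypothetical container for one E-R test query (not a RELink type)."""
    text: str                        # natural language query written by an editor
    entity_types: Tuple[str, ...]    # column headers used as entity types
    relevant: List[Tuple[str, ...]]  # entity tuples taken from the table rows


def build_query(header: Sequence[str],
                rows: Sequence[Sequence[str]],
                columns: Sequence[str],
                text: str) -> ERQuery:
    """Project the chosen columns of a table into relevance-judgment tuples."""
    idx = [header.index(c) for c in columns]
    seen = set()
    tuples: List[Tuple[str, ...]] = []
    for row in rows:
        t = tuple(row[i].strip() for i in idx)
        # Skip rows with an empty cell in a selected column, and duplicates.
        if all(t) and t not in seen:
            seen.add(t)
            tuples.append(t)
    return ERQuery(text, tuple(columns), tuples)


# Toy table with invented placeholder values (assumption, not collection data).
HEADER = ["Company", "Founder", "Country"]
ROWS = [
    ["Company A", "Person X", "Country 1"],
    ["Company B", "Person Y", "Country 2"],
    ["Company C", "", "Country 1"],          # missing founder: skipped
    ["Company A", "Person X", "Country 1"],  # duplicate row: skipped
]

query = build_query(HEADER, ROWS, ["Company", "Founder"],
                    "companies and the people who founded them")
print(query.entity_types)
print(query.relevant)
```

In this sketch the editor-written query text is passed in as a plain string, mirroring the abstract's statement that queries are authored editorially while the judgments come from the table entries.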


Cited By

  • (2019) Data Lineage Approach of Multi-version Documents Traceability in Complex Software Engineering. Intelligent Computing Methodologies, 491-502. DOI: 10.1007/978-3-030-26766-7_45. Online publication date: 24-Jul-2019
  • (2018) On-the-fly Table Generation. The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, 595-604. DOI: 10.1145/3209978.3209988. Online publication date: 27-Jun-2018

    Published In

    SIGIR '17: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval
    August 2017
    1476 pages
    ISBN:9781450350228
    DOI:10.1145/3077136

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tag

    1. entity-relationship retrieval

    Qualifiers

    • Short-paper

    Conference

    SIGIR '17

    Acceptance Rates

    SIGIR '17 Paper Acceptance Rate 78 of 362 submissions, 22%;
    Overall Acceptance Rate 792 of 3,983 submissions, 20%
