DOI:10.1145/3340531.3417406
keynote

Ceres: Harvesting Knowledge from the Semi-structured Web

Published: 19 October 2020

Abstract

Knowledge graphs have been used to support a wide range of applications and to enhance search and QA for Google, Bing, Amazon Alexa, etc. However, we often miss long-tail knowledge, including unpopular entities, unpopular relations, and unpopular verticals. In this talk we describe our efforts in harvesting knowledge from semi-structured websites, which are often populated according to templates from vast volumes of data stored in underlying databases.
We describe our Ceres system, which extracts knowledge from the semi-structured web. AutoCeres is a ClosedIE system that extracts knowledge according to an existing ontology. It improves the accuracy of fully automatic knowledge extraction on semi-structured data from the 60%+ of the state of the art to 90%+. OpenCeres is the first OpenIE system on semi-structured data; it can identify new relations not readily included in existing ontologies. ZeroShotCeres goes further and enables extracting knowledge for completely new domains, where there is no seed knowledge for bootstrapping the extraction. Finally, we describe our other efforts in ontology alignment, entity linkage, graph mining, and QA, which allow us to best leverage the knowledge we extract for search and QA.
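The abstract does not spell out how extraction works, so the following is only a toy sketch of the general idea it alludes to: on templated pages, seed knowledge can be used to bootstrap ClosedIE extraction by finding where known values sit in the page and reusing those positions on other pages of the same template. Everything here is an assumption made for illustration, not the Ceres implementation: the page representation (a dict from a DOM path to its text), the function names `learn_paths` and `extract`, and the film data are all hypothetical.

```python
from collections import Counter, defaultdict

def learn_paths(seed_pages, seed_kb):
    """Vote for the DOM path where each predicate's known value appears."""
    votes = defaultdict(Counter)
    for subject, page in seed_pages.items():
        for predicate, obj in seed_kb.get(subject, {}).items():
            for path, text in page.items():
                if text == obj:  # exact match against seed knowledge
                    votes[predicate][path] += 1
    # Keep the most frequently matched path per predicate.
    return {pred: counter.most_common(1)[0][0] for pred, counter in votes.items()}

def extract(subject, page, paths):
    """Apply the learned paths to a page that has no seed knowledge."""
    return [(subject, pred, page[path])
            for pred, path in sorted(paths.items()) if path in page]

# Hypothetical pages rendered from one template (path -> text).
seed_pages = {
    "Film A": {"/h1": "Film A", "/div[1]/span": "Jane Doe", "/div[2]/span": "1999"},
    "Film B": {"/h1": "Film B", "/div[1]/span": "John Roe", "/div[2]/span": "2005"},
}
# Hypothetical seed knowledge; note that it is incomplete for Film B.
seed_kb = {
    "Film A": {"director": "Jane Doe", "year": "1999"},
    "Film B": {"director": "John Roe"},
}

paths = learn_paths(seed_pages, seed_kb)
new_page = {"/h1": "Film C", "/div[1]/span": "Ann Poe", "/div[2]/span": "2012"}
triples = extract("Film C", new_page, paths)
print(triples)
```

In this sketch the predicates are fixed by the seed ontology, which is what makes it a ClosedIE-style example; discovering new relations (OpenIE) or handling domains with no seed knowledge (zero-shot) would need a different approach that this toy does not attempt.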

    Published In

    CIKM '20: Proceedings of the 29th ACM International Conference on Information & Knowledge Management
    October 2020
    3619 pages
    ISBN:9781450368599
    DOI:10.1145/3340531
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. closedie
    2. knowledge extraction
    3. openie
    4. semi-structured data
    5. zero shot

    Qualifiers

    • Keynote

    Conference

    CIKM '20

    Acceptance Rates

    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
