DOI: 10.1145/1999676.1999696
research-article

RDR-based open IE for the web document

Published: 26 June 2011

Abstract

The Web contains a massive amount of information embedded in text, and obtaining information from Web text is a major research challenge. One research focus is Open Information Extraction (OIE), which aims at relation-independent information extraction: OIE systems seek to extract all potential relations from the text rather than a few pre-defined relations. Existing OIE systems such as TEXTRUNNER usually take a machine-learning-based approach, which requires large volumes of training data. This paper presents a Ripple-Down Rules (RDR) Open Information Extraction system based on processing example cases and manually adding rules when needed. The key advantages of this approach are that it can handle the freer writing style that occurs in Web documents and can correct errors introduced by natural language pre-processing tools, whereas systems like TEXTRUNNER depend on the quality of the entity-tagging preprocessing in the training data. We evaluated the Ripple-Down Rules approach against two OIE systems, TEXTRUNNER and StatSnowball. In these studies the Ripple-Down Rules approach, with minimal low-cost rule addition, achieves much higher precision and somewhat improved recall compared to these other Open Information Extraction systems.
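The idea of "processing example cases and manually adding rules when needed" can be illustrated with a minimal sketch of a generic single-classification Ripple-Down Rules tree. This is not the system described in the paper: the case representation (a dict with a `between` field holding the text between two entities), the rule conditions, and the names `Rule`, `evaluate`, and `add_rule` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Case = dict  # hypothetical case format, e.g. {"between": "was born in"}


@dataclass
class Rule:
    condition: Callable[[Case], bool]
    conclusion: str
    cornerstone: Optional[Case] = None  # the case that prompted this rule
    if_true: Optional["Rule"] = None    # exception branch (rule fired)
    if_false: Optional["Rule"] = None   # alternative branch (rule did not fire)


def evaluate(root: Rule, case: Case):
    """Walk the tree; the last rule that fired supplies the conclusion.

    Also returns the last rule evaluated and whether it fired, which
    together identify where a correcting rule would be attached.
    """
    node, conclusion = root, None
    last, fired = root, False
    while node is not None:
        last, fired = node, node.condition(case)
        if fired:
            conclusion = node.conclusion
        node = node.if_true if fired else node.if_false
    return conclusion, last, fired


def add_rule(root: Rule, case: Case, condition, conclusion: str) -> Rule:
    """Attach a new rule at the end of the path taken by a misclassified case."""
    _, last, fired = evaluate(root, case)
    new_rule = Rule(condition, conclusion, cornerstone=case)
    if fired:
        last.if_true = new_rule   # refine the rule that gave the wrong answer
    else:
        last.if_false = new_rule  # cover a case no existing rule handled
    return new_rule


# Default rule: always fires and concludes that no relation is present.
root = Rule(lambda c: True, "no_relation")

# An expert sees a missed relation and adds a rule for it (hypothetical).
add_rule(root, {"between": "was born in"},
         lambda c: "born in" in c["between"], "relation")

# Later a negated sentence is wrongly accepted, so an exception is added.
add_rule(root, {"between": "was not born in"},
         lambda c: " not " in c["between"], "no_relation")
```

Because each new rule is attached only at the end of the path taken by the case that prompted it, cases that were handled correctly before the addition keep their earlier conclusions unless they also satisfy the new rule's condition.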

Published In

K-CAP '11: Proceedings of the sixth international conference on Knowledge capture
June 2011
212 pages
ISBN:9781450303965
DOI:10.1145/1999676
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tag

  1. ripple-down rules

Qualifiers

  • Research-article

Conference

K-CAP 2011: Knowledge Capture Conference
June 26 - 29, 2011
Banff, Alberta, Canada

Acceptance Rates

Overall Acceptance Rate 55 of 198 submissions, 28%

Cited By

  • (2021) Dependency Parsing-based Entity Relation Extraction over Chinese Complex Text. ACM Transactions on Asian and Low-Resource Language Information Processing 20:4 (1-34). DOI: 10.1145/3450273. Online publication date: 9-Jun-2021
  • (2016) Building a Process Description Repository with Knowledge Acquisition. Knowledge Management and Acquisition for Intelligent Systems (86-101). DOI: 10.1007/978-3-319-42706-5_7. Online publication date: 7-Aug-2016
  • (2015) LEXA. Expert Systems with Applications: An International Journal 42:17 (6391-6407). DOI: 10.1016/j.eswa.2015.04.022. Online publication date: 1-Oct-2015
  • (2014) HAUSS: Incrementally building a summarizer combining multiple techniques. International Journal of Human-Computer Studies 72:7 (584-605). DOI: 10.1016/j.ijhcs.2014.03.002. Online publication date: Jul-2014
  • (2012) Combining different summarization techniques for legal text. Proceedings of the Workshop on Innovative Hybrid Approaches to the Processing of Textual Data (115-123). DOI: 10.5555/2388632.2388647. Online publication date: 23-Apr-2012
  • (2012) Improving the performance of a named entity recognition system with knowledge acquisition. Proceedings of the 18th international conference on Knowledge Engineering and Knowledge Management (97-113). DOI: 10.1007/978-3-642-33876-2_11. Online publication date: 8-Oct-2012
  • (2012) Improving open information extraction for informal web documents with ripple-down rules. Proceedings of the 12th Pacific Rim conference on Knowledge Management and Acquisition for Intelligent Systems (160-174). DOI: 10.1007/978-3-642-32541-0_14. Online publication date: 6-Sep-2012
  • (2012) Knowledge acquisition for categorization of legal case reports. Proceedings of the 12th Pacific Rim conference on Knowledge Management and Acquisition for Intelligent Systems (118-132). DOI: 10.1007/978-3-642-32541-0_10. Online publication date: 6-Sep-2012
