DOI: 10.1145/2187980.2188141
poster

Instrumenting a logic programming language to gather provenance from an information extraction application

Published: 16 April 2012

ABSTRACT

Information extraction (IE) programs for the web consume and produce a lot of data. In order to better understand the program output, the developer and user often desire to know the details of how the output was created. Provenance can be used to learn about the creation of the output. We collect fine-grained provenance by leveraging ongoing work in the IE community to write IE programs in a logic programming language. The logic programming language exposes the semantics of the program, allowing us to gather fine-grained provenance during program execution. We discuss a case study using a web-based community information management system, then present results regarding the performance of queries over the provenance data gathered by our logic program interpreter. Our findings show that it is possible to gather useful fine-grained provenance during the execution of a logic-based web information extraction program. Additionally, queries over this provenance information can be performed in a reasonable amount of time.
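
The general idea of gathering provenance inside a logic program interpreter can be illustrated with a toy example. The sketch below is not the authors' interpreter or language; it is a minimal bottom-up evaluator for Datalog-style rules, written in Python under the assumption that provenance means "which rule and which input facts produced each derived fact". All names in it (`evaluate`, `lineage`, the rule names `r1` and `r2`, and the example facts) are hypothetical.

```python
from itertools import product

# A fact is a tuple: (predicate, arg1, arg2, ...).
# A rule is (name, head, body). In rules, a term whose first letter is
# uppercase is treated as a variable (an assumption of this sketch).

def is_var(term):
    return isinstance(term, str) and term[:1].isupper()

def unify(atom, fact, binding):
    """Match one body atom against one fact; return the extended binding or None."""
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    binding = dict(binding)
    for term, value in zip(atom[1:], fact[1:]):
        if is_var(term):
            if binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return binding

def evaluate(facts, rules):
    """Naive bottom-up evaluation. For every derived fact, record the rule
    name and the input facts of its first derivation."""
    known = set(facts)
    provenance = {}  # derived fact -> (rule name, tuple of input facts)
    changed = True
    while changed:
        changed = False
        for name, head, body in rules:
            # sorted() copies the set, so adding to `known` below is safe.
            for combo in product(sorted(known), repeat=len(body)):
                binding = {}
                for atom, fact in zip(body, combo):
                    binding = unify(atom, fact, binding)
                    if binding is None:
                        break
                if binding is None:
                    continue
                derived = (head[0],) + tuple(
                    binding.get(t, t) if is_var(t) else t for t in head[1:]
                )
                if derived not in known:
                    known.add(derived)
                    provenance[derived] = (name, combo)
                    changed = True
    return known, provenance

def lineage(fact, provenance):
    """Provenance query: the set of base facts a fact ultimately depends on."""
    if fact not in provenance:
        return {fact}
    _, inputs = provenance[fact]
    result = set()
    for parent in inputs:
        result |= lineage(parent, provenance)
    return result

# Hypothetical extraction scenario: a page that is a homepage mentions a name.
facts = {("mentions", "p1", "alice"), ("homepage", "p1")}
rules = [
    ("r1", ("person", "X"), [("mentions", "P", "X"), ("homepage", "P")]),
    ("r2", ("listed", "X"), [("person", "X")]),
]

known, provenance = evaluate(facts, rules)
sources = lineage(("listed", "alice"), provenance)
```

Here `provenance` maps each derived fact to the rule and inputs that first produced it, and `lineage` answers one simple query over that record. A real system would likely store all derivations and use a far more efficient evaluation strategy; this sketch keeps only the first derivation for brevity.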


    Published in

      WWW '12 Companion: Proceedings of the 21st International Conference on World Wide Web
      April 2012
      1250 pages
      ISBN:9781450312301
      DOI:10.1145/2187980

      Copyright © 2012 Authors

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%