DOI: 10.1145/1376616.1376759
demonstration

LearnPADS: automatic tool generation from ad hoc data

Published: 09 June 2008

Abstract

In this demonstration, we will present LEARNPADS, a fully automatic system for generating ad hoc data processing tools. When presented with a collection of ad hoc data, the system (1) analyzes the data, (2) infers a PADS [4, 5] description, (3) generates parser, printer, validation and traversal libraries and (4) links these libraries with format-independent tool suites to form stand-alone applications. These applications provide statistical analysis, XML conversion, CSV conversion, the ability to query with the Galax XQuery engine [3], and the ability to graph selected data elements, all directly from ASCII ad hoc data without human intervention. SIGMOD attendees will see both the user experience with LEARNPADS and the internals of the multi-phase inference algorithm which lies at the heart of the system.
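
The pipeline in the abstract (analyze the data, infer a description, then drive format-independent tools from it) can be illustrated with a deliberately small sketch. This is not the LEARNPADS inference algorithm and it does not produce a PADS description. The token classes and the names `tokenize`, `infer_description`, and `to_csv_row` are hypothetical and chosen only for illustration. The sketch assumes every sample line has the same shape, records the shared sequence of token types, and uses that sequence to drive a CSV conversion.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical token classes for this toy example; these are NOT the PADS base types.
# Order matters: "float" must be tried before "int" so that "3.5" is one token.
TOKEN_PATTERNS = [
    ("float", r"\d+\.\d+"),
    ("int", r"\d+"),
    ("word", r"[A-Za-z_]\w*"),
    ("space", r"\s+"),
    ("punct", r"[^\w\s]"),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_PATTERNS),
    re.ASCII,  # the abstract targets ASCII data; anything else falls into "punct"
)
DATA_KINDS = {"float", "int", "word"}


def tokenize(line: str) -> List[Tuple[str, str]]:
    """Step 1 (analyze): split a line into (token kind, text) pairs."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(line)]


def infer_description(lines: List[str]) -> Optional[List[str]]:
    """Step 2 (infer): return the token-kind sequence shared by all lines.

    Returns None when the lines disagree (or when there are no lines);
    a real inference system would have to handle such variation.
    """
    shapes = {tuple(kind for kind, _ in tokenize(line)) for line in lines}
    return list(shapes.pop()) if len(shapes) == 1 else None


def to_csv_row(description: List[str], line: str) -> str:
    """Steps 3-4 (generate and use a tool): validate a line, then emit CSV."""
    tokens = tokenize(line)
    if [kind for kind, _ in tokens] != description:
        raise ValueError(f"line does not match description: {line!r}")
    return ",".join(text for kind, text in tokens if kind in DATA_KINDS)


sample = ["alice 42 3.5", "bob 7 10.25"]
description = infer_description(sample)
if description is not None:
    for line in sample:
        print(to_csv_row(description, line))
```

The description learned from the sample lines can then be applied to new lines of the same shape; a line that deviates raises `ValueError`, which stands in for the validation role described in the abstract.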

References

[1] A. Arasu and H. Garcia-Molina. Extracting structured data from web pages. In SIGMOD, pages 337–348, 2003.
[2] M. F. Fernández, K. Fisher, J. N. Foster, M. Greenberg, and Y. Mandelbaum. A generic programming toolkit for PADS/ML: First-class upgrades for third-party developers. In PADL, Jan. 2008.
[3] M. F. Fernández, K. Fisher, R. Gruber, and Y. Mandelbaum. PADX: Querying large-scale ad hoc data with XQuery. In PLAN-X, Jan. 2006.
[4] K. Fisher and R. Gruber. PADS: A domain specific language for processing ad hoc data. In PLDI, pages 295–304, June 2005.
[5] K. Fisher, Y. Mandelbaum, and D. Walker. The next 700 data description languages. In POPL, Jan. 2006.
[6] K. Fisher, D. Walker, K. Zhu, and P. White. From dirt to shovels: Fully automatic tool generation from ad hoc data. In POPL, Jan. 2008.
[7] Y. Mandelbaum, K. Fisher, D. Walker, M. Fernández, and A. Gleyzer. PADS/ML: A functional data description language. In POPL, Jan. 2007.
[8] PADS project. http://www.padsproj.org, 2007.



    Information & Contributors

    Information

    Published In

    SIGMOD '08: Proceedings of the 2008 ACM SIGMOD international conference on Management of data
    June 2008
    1396 pages
    ISBN:9781605581026
    DOI:10.1145/1376616
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.


    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. ad hoc data
    2. data description language
    3. grammar induction
    4. tools generation

    Qualifiers

    • Demonstration

    Conference

    SIGMOD/PODS '08

    Acceptance Rates

    Overall Acceptance Rate 785 of 4,003 submissions, 20%


    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (last 12 months): 11
    • Downloads (last 6 weeks): 1
    Reflects downloads up to 17 Feb 2025.


    Citations

    Cited By

    • (2023) AI Assistants: A Framework for Semi-Automated Data Wrangling. IEEE Transactions on Knowledge and Data Engineering 35:9 (9295-9306). DOI: 10.1109/TKDE.2022.3222538. Online publication date: 1-Sep-2023
    • (2019) Learning Data Transformations with Minimal User Effort. 2019 IEEE International Conference on Big Data (Big Data) (657-664). DOI: 10.1109/BigData47090.2019.9006350. Online publication date: Dec-2019
    • (2018) FlashProfile: a framework for synthesizing data profiles. Proceedings of the ACM on Programming Languages 2:OOPSLA (1-28). DOI: 10.1145/3276520. Online publication date: 24-Oct-2018
    • (2015) Ontology-Driven Data Semantics Discovery for Cyber-Security. Proceedings of the 17th International Symposium on Practical Aspects of Declarative Languages - Volume 9131 (1-16). DOI: 10.1007/978-3-319-19686-2_1. Online publication date: 18-Jun-2015
    • (2013) Automatically synthesizing SQL queries from input-output examples. Proceedings of the 28th IEEE/ACM International Conference on Automated Software Engineering (224-234). DOI: 10.1109/ASE.2013.6693082. Online publication date: 11-Nov-2013
    • (2012) LearnPADS++. Proceedings of the 14th international conference on Practical Aspects of Declarative Languages (168-182). DOI: 10.1007/978-3-642-27694-1_13. Online publication date: 23-Jan-2012
    • (2011) Forensic triage for mobile phones with DEC0DE. Proceedings of the 20th USENIX conference on Security (7-7). DOI: 10.5555/2028067.2028074. Online publication date: 8-Aug-2011
    • (2011) Bistro data feed management system. Proceedings of the 2011 ACM SIGMOD International Conference on Management of data (1059-1070). DOI: 10.1145/1989323.1989437. Online publication date: 12-Jun-2011
    • (2010) Optimizing data analysis with a semi-structured time series database. Proceedings of the 2010 workshop on Managing systems via log analysis and machine learning techniques (7-7). DOI: 10.5555/1928991.1929002. Online publication date: 3-Oct-2010
    • (2010) Reverse engineering for mobile systems forensics with Ares. Proceedings of the 2010 ACM workshop on Insider threats (21-28). DOI: 10.1145/1866886.1866892. Online publication date: 8-Oct-2010
