ABSTRACT
Massive amounts of useful data are stored and processed in ad hoc formats for which common tools like parsers, printers, query engines and format converters are not readily available. In this paper, we explain the design and implementation of PADS/ML, a new language and system that facilitates the generation of data processing tools for ad hoc formats. The PADS/ML design includes features such as dependent, polymorphic and recursive datatypes, which allow programmers to describe the syntax and semantics of ad hoc data in a concise, easy-to-read notation. The PADS/ML implementation compiles these descriptions into ML structures and functors that include types for parsed data, functions for parsing and printing, and auxiliary support for user-specified, format-dependent and format-independent tool generation.
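To make the last sentence of the abstract more concrete, the following is a minimal hand-written OCaml sketch of the general shape such output could take: a structure bundling a type for parsed data with a parser and a printer, and a functor standing in for a parameterised description. It is not PADS/ML syntax and not the code the PADS/ML compiler generates; the names `FORMAT`, `Counted`, `Sequence` and `Records`, and the toy length-prefixed format itself, are assumptions introduced only for illustration.

```ocaml
(* Hypothetical sketch: none of these names come from PADS/ML. *)

(* What a compiled description might expose: a type for parsed data
   plus a parser and a printer over that type. *)
module type FORMAT = sig
  type rep
  (* Parse one value starting at [pos]; return it with the next position. *)
  val parse : string -> int -> (rep * int) option
  val print : rep -> string
end

(* A dependent format: a decimal length, ':', then exactly that many bytes.
   How much payload is read depends on the value parsed just before it. *)
module Counted : FORMAT with type rep = string = struct
  type rep = string

  let parse s pos =
    match String.index_from_opt s pos ':' with
    | None -> None
    | Some colon ->
      (match int_of_string_opt (String.sub s pos (colon - pos)) with
       | Some n when n >= 0 && colon + 1 + n <= String.length s ->
         Some (String.sub s (colon + 1) n, colon + 1 + n)
       | _ -> None)

  let print payload = Printf.sprintf "%d:%s" (String.length payload) payload
end

(* A parameterised, recursive format written as a functor: zero or more
   [Elem] values back to back until the end of the input.
   Assumes [Elem.parse] always consumes at least one byte. *)
module Sequence (Elem : FORMAT) : FORMAT with type rep = Elem.rep list = struct
  type rep = Elem.rep list

  let rec parse s pos =
    if pos = String.length s then Some ([], pos)
    else
      match Elem.parse s pos with
      | None -> None
      | Some (x, next) ->
        (match parse s next with
         | None -> None
         | Some (xs, stop) -> Some (x :: xs, stop))

  let print xs = String.concat "" (List.map Elem.print xs)
end

(* Instantiate the functor to get a parser and printer for whole files. *)
module Records = Sequence (Counted)
```

In this sketch, `Records.parse "3:abc0:2:hi" 0` yields the three payloads `"abc"`, `""` and `"hi"`, and `Records.print` turns them back into the same string. A real data description system would also need to report and recover from errors in the data rather than returning `None`, which this toy version leaves out.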
PADS/ML: a functional data description language
Proceedings of the 2007 POPL Conference