Abstract
This paper develops a database query language called Transducer Datalog, motivated by the needs of an emerging class of database applications. In these applications, such as text databases and genome databases, the storage and manipulation of long character sequences is a crucial feature. The issues involved in managing this kind of data are not addressed by traditional database systems, either in theory or in practice. To address these issues, we introduced in recent work a new machine model called a generalized sequence transducer. Generalized transducers extend ordinary transducers by allowing them to invoke other transducers as “subroutines.” This paper establishes the computational properties of Transducer Datalog, a query language based on this machine model. In the process, we develop a hierarchy of time-complexity classes based on the Ackermann function. The lower levels of this hierarchy correspond to well-known complexity classes, such as polynomial time and hyper-exponential time. We establish a tight relationship between levels in this hierarchy and the depth of subroutine calls within Transducer Datalog programs. Finally, we show that Transducer Datalog programs of arbitrary depth express exactly the sequence functions computable in primitive-recursive time.
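To make the growth rates behind an Ackermann-based hierarchy concrete, the following is a minimal sketch of the textbook two-argument Ackermann-Péter function. It is not the paper's definition of its complexity classes, which the abstract does not give; the names `ackermann` and `level` are illustrative only. Fixing the first argument yields one function per level, and each level grows faster than the one below it.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def ackermann(m: int, n: int) -> int:
    """Textbook Ackermann-Peter function (assumed here for illustration;
    the paper's own hierarchy may be defined differently)."""
    if m == 0:
        return n + 1
    if n == 0:
        return ackermann(m - 1, 1)
    return ackermann(m - 1, ackermann(m, n - 1))


def level(m: int):
    """Fix the first argument to obtain the one-argument function at level m."""
    return lambda n: ackermann(m, n)


if __name__ == "__main__":
    # Known closed forms for the lowest levels:
    #   level 0: n + 1
    #   level 1: n + 2
    #   level 2: 2n + 3
    #   level 3: 2^(n+3) - 3
    for m in range(4):
        print(m, [level(m)(n) for n in range(4)])
```

Each level is still primitive recursive for a fixed first argument, while the two-argument function as a whole is not. Only small arguments are practical here, since level 4 already exceeds any feasible computation for n greater than 1.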
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bonner, A.J., Mecca, G. (1998). Querying sequence databases with transducers. In: Cluet, S., Hull, R. (eds) Database Programming Languages. DBPL 1997. Lecture Notes in Computer Science, vol 1369. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-64823-2_8
DOI: https://doi.org/10.1007/3-540-64823-2_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64823-9
Online ISBN: 978-3-540-68534-0
eBook Packages: Springer Book Archive