Querying sequence databases with transducers

  • Query Languages for New Applications
  • Conference paper
Database Programming Languages (DBPL 1997)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1369)

Abstract

This paper develops a database query language called Transducer Datalog motivated by the needs of a new and emerging class of database applications. In these applications, such as text databases and genome databases, the storage and manipulation of long character sequences is a crucial feature. The issues involved in managing this kind of data are not addressed by traditional database systems, either in theory or in practice. To address these issues, in recent work, we introduced a new machine model called a generalized sequence transducer. These generalized transducers extend ordinary transducers by allowing them to invoke other transducers as “subroutines.” This paper establishes the computational properties of Transducer Datalog, a query language based on this new machine model. In the process, we develop a hierarchy of time-complexity classes based on the Ackermann function. The lower levels of this hierarchy correspond to well-known complexity classes, such as polynomial time and hyper-exponential time. We establish a tight relationship between levels in this hierarchy and the depth of subroutine calls within Transducer Datalog programs. Finally, we show that Transducer Datalog programs of arbitrary depth express exactly the sequence functions computable in primitive-recursive time.
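The idea of a transducer invoking another transducer as a subroutine can be illustrated with a toy sketch. This is not the paper's formal definition of a generalized sequence transducer: the names `duplicate` and `make_calling_transducer`, and the rule that the subroutine is applied to the output produced so far, are assumptions chosen only to show how one level of subroutine calls can raise output growth from linear to exponential in the input length.

```python
from typing import Callable

# A transducer is modelled here simply as a function from sequences to sequences.
Transducer = Callable[[str], str]


def duplicate(seq: str) -> str:
    """Ordinary transducer (no subroutines): one pass, emitting each symbol twice."""
    return "".join(symbol * 2 for symbol in seq)


def make_calling_transducer(subroutine: Transducer) -> Transducer:
    """Hypothetical transducer that calls `subroutine` once per input symbol.

    At each step the current symbol is appended to the output so far,
    and the subroutine rewrites the whole output (an assumed calling rule).
    """
    def run(seq: str) -> str:
        output = ""
        for symbol in seq:
            output = subroutine(output + symbol)
        return output
    return run


grow = make_calling_transducer(duplicate)

print(grow("ab"))                              # aaaabb
print([len(duplicate("a" * n)) for n in range(5)])  # [0, 2, 4, 6, 8]
print([len(grow("a" * n)) for n in range(5)])       # [0, 2, 6, 14, 30]
```

In this sketch the subroutine-free transducer produces output of length 2n, while a single level of calls produces length 2^(n+1) - 2. Nesting such calls more deeply makes the growth rate climb further, which gives an informal feel for why the depth of subroutine calls is tied to levels of a complexity hierarchy.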

Editor information

Sophie Cluet Rick Hull

Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bonner, A.J., Mecca, G. (1998). Querying sequence databases with transducers. In: Cluet, S., Hull, R. (eds) Database Programming Languages. DBPL 1997. Lecture Notes in Computer Science, vol 1369. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-64823-2_8

  • DOI: https://doi.org/10.1007/3-540-64823-2_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-64823-9

  • Online ISBN: 978-3-540-68534-0

  • eBook Packages: Springer Book Archive
