Genome Identification and Classification by Short Oligo Arrays

  • Conference paper
Algorithms in Bioinformatics (WABI 2004)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 3240)

Abstract

We explore the problem of designing oligonucleotides that help locate organisms along a known phylogenetic tree. We develop a suffix-tree-based algorithm to find such short sequences efficiently. In the worst case, our algorithm requires O(Nm) time and O(N) space, where m is the number of genomes classified by the phylogeny and N is their total length. We implemented our algorithm and used it to find these discriminating sequences in both small and large phylogenies. We believe our algorithm will have wide applications, including high-throughput classification and identification, oligo array design that optimally differentiates genes in gene families, and markers for closely related strains and populations. It will also have scientific significance as a new way to assess the confidence in a given classification.
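
To make the notion of a discriminating sequence concrete, the following is a minimal brute-force sketch, not the suffix-tree algorithm of the paper. It assumes one plausible definition: a k-mer discriminates a clade if it occurs in every genome inside the clade and in no genome outside it. The nested-tuple tree encoding, the function names, and the fixed length k are all hypothetical choices made for illustration, and the sketch makes no attempt to reach the O(Nm) time and O(N) space bounds stated above.

```python
from typing import Dict, List, Set, Tuple, Union

# Hypothetical encoding: a leaf is a genome name, an internal node is a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def leaves(tree: Tree) -> List[str]:
    """Return the genome names at the leaves of tree."""
    if isinstance(tree, str):
        return [tree]
    names: List[str] = []
    for child in tree:
        names.extend(leaves(child))
    return names


def clade_specific_kmers(
    tree: Tree, genomes: Dict[str, str], k: int
) -> Dict[Tuple[str, ...], Set[str]]:
    """For each internal clade other than the root, collect the k-mers that
    occur in every genome of the clade and in no genome outside it.
    (Assumed definition of 'discriminating'; brute force, not a suffix tree.)"""
    kmer_sets = {name: kmers(seq, k) for name, seq in genomes.items()}
    result: Dict[Tuple[str, ...], Set[str]] = {}

    def visit(node: Tree) -> None:
        if isinstance(node, str):
            return  # single leaves are not reported in this sketch
        inside = leaves(node)
        outside = [name for name in genomes if name not in inside]
        if outside:  # the root has nothing outside it, so it is skipped
            shared = set.intersection(*(kmer_sets[name] for name in inside))
            for name in outside:
                shared -= kmer_sets[name]
            result[tuple(sorted(inside))] = shared
        for child in node:
            visit(child)

    visit(tree)
    return result


if __name__ == "__main__":
    # Toy data, invented for this example only.
    toy_genomes = {"A": "ACGTAC", "B": "ACGTTT", "C": "GGGTTT"}
    toy_tree = (("A", "B"), "C")
    print(clade_specific_kmers(toy_tree, toy_genomes, 3))
```

On the toy data, the 3-mers ACG and CGT occur in both A and B but not in C, so they would mark the clade containing A and B under the assumed definition.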

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Angelov, S., Harb, B., Kannan, S., Khanna, S., Kim, J., Wang, LS. (2004). Genome Identification and Classification by Short Oligo Arrays. In: Jonassen, I., Kim, J. (eds) Algorithms in Bioinformatics. WABI 2004. Lecture Notes in Computer Science, vol 3240. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30219-3_34

  • DOI: https://doi.org/10.1007/978-3-540-30219-3_34

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23018-2

  • Online ISBN: 978-3-540-30219-3

  • eBook Packages: Springer Book Archive
