Efficient algorithms for identifying orthologous simple sequence repeats of disease genes

Journal of Systems Science and Complexity

Abstract

Dynamic mutations of simple sequence repeats (SSRs) have been shown to affect normal gene function and cause various genetic disorders. Several conserved, and in some cases partially functional, SSR patterns have been discovered in inherited orthologous disease genes. To explore a wide range of SSRs in genetic diseases, a comprehensive system for identifying orthologous SSRs of disease genes through comparative genomics was constructed, using the Online Mendelian Inheritance in Man (OMIM) and NCBI HomoloGene databases as the primary sources of human genetic disease and homologous gene information. In addition, an efficient algorithm for searching SSR patterns was developed to provide annotated SSR information across various model species. By integrating these data resources and mining techniques, biologists and doctors can systematically retrieve novel and important conserved SSR information among orthologous disease genes. The proposed system, Orthologous SSR for Disease Genes (OSDG), is the first comprehensive framework for identifying orthologous SSRs as potential causative factors of genetic disorders and is freely available at http://osdg.cs.ntou.edu.tw/.
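
The abstract does not describe how the SSR search algorithm works, so the following is only a minimal sketch of the general idea, not the OSDG method. It assumes perfect (uninterrupted) repeats, repeat units of 1 to 6 bases, and a minimum of three copies. The function names `find_ssrs`, `canonical`, and `conserved_units` are hypothetical, and a plain set intersection stands in for the comparative step. Reverse-complement strands and imperfect repeats are ignored.

```python
import re


def find_ssrs(seq, min_unit=1, max_unit=6, min_repeats=3):
    """Return sorted (start, unit, copies) tuples for perfect tandem repeats.

    Assumption: only uninterrupted repeats over the alphabet ACGT are reported.
    """
    seq = seq.upper()
    hits = []
    for k in range(min_unit, max_unit + 1):
        # A k-mer followed by at least (min_repeats - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are repeats of a shorter unit (e.g. "AA" when k = 2);
            # those runs are already reported at the shorter unit length.
            if any(unit == unit[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // k))
    return sorted(hits)


def canonical(unit):
    """Smallest rotation of a unit, so CAG, AGC and GCA compare as equal."""
    return min(unit[i:] + unit[:i] for i in range(len(unit)))


def conserved_units(seqs_by_species, **kwargs):
    """Repeat units (in canonical rotation) found in every supplied sequence."""
    per_species = [
        {canonical(unit) for _, unit, _ in find_ssrs(seq, **kwargs)}
        for seq in seqs_by_species.values()
    ]
    return set.intersection(*per_species) if per_species else set()
```

For example, `find_ssrs("TTCAGCAGCAGCAGTT")` reports one CAG repeat with four copies starting at offset 2. Comparing units by their canonical rotation matters because the leftmost match in another species may start at a different phase of the same repeat.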

Author information

Correspondence to Tunwen Pai.

Additional information

This research is supported by the Center for Marine Bioenvironment and Biotechnology (CMBB) in National Taiwan Ocean University, Keelung, Taiwan, and the National Science Council in Taiwan (NSC97-2627-B-019-003).

About this article

Cite this article

Chen, C., Chen, C., Shih, T. et al. Efficient algorithms for identifying orthologous simple sequence repeats of disease genes. J Syst Sci Complex 23, 906–916 (2010). https://doi.org/10.1007/s11424-010-0203-2

  • DOI: https://doi.org/10.1007/s11424-010-0203-2
