Efficient algorithms for identifying orthologous simple sequence repeats of disease genes

Chen, Chienming; Chen, Chihchia; Shih, Tsanhuang; Pai, Tunwen; Hu, Chinhua; Tzou, Wenshyong

doi:10.1007/s11424-010-0203-2

Published: 09 November 2010

Volume 23, pages 906–916 (2010)

Journal of Systems Science and Complexity

Abstract

Dynamic mutations of simple sequence repeats (SSRs) have been shown to disrupt normal gene function and to cause a variety of genetic disorders. Several conserved, and in some cases partially functional, SSR patterns have been discovered in inherited orthologous disease genes. To explore a wide range of SSRs in genetic diseases, a comprehensive system was constructed that identifies orthologous SSRs of disease genes through comparative genomics, using the Online Mendelian Inheritance in Man (OMIM) and NCBI HomoloGene databases as the primary sources of human genetic disease and homologous gene information, respectively. In addition, an efficient algorithm for searching SSR patterns was developed to provide annotated SSR information across various model species. By integrating these data resources and mining techniques, biologists and physicians can systematically retrieve novel and important conserved SSR information among orthologous disease genes. The proposed system, Orthologous SSR for Disease Genes (OSDG), is the first comprehensive framework for identifying orthologous SSRs as potential causative factors of genetic disorders and is freely available at http://osdg.cs.ntou.edu.tw/.
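
The abstract does not describe the SSR search algorithm itself, so the following is only a minimal sketch of the general idea, not the OSDG implementation. It assumes perfect (uninterrupted) repeats, a repeat unit of 1 to 6 nucleotides, and a minimum of 4 tandem copies; the names `find_perfect_ssrs`, `canonical`, and `shared_motifs` and all thresholds are hypothetical choices made for illustration.

```python
import re
from typing import Dict, List, NamedTuple, Set


class SSR(NamedTuple):
    start: int   # 0-based offset of the repeat tract in the sequence
    motif: str   # repeat unit as it appears, e.g. "CAG"
    copies: int  # number of tandem copies of the unit


def find_perfect_ssrs(seq: str, min_unit: int = 1, max_unit: int = 6,
                      min_copies: int = 4) -> List[SSR]:
    """Find perfect tandem repeats with a backreference regex.

    Assumption: thresholds are illustrative, not those used by OSDG.
    The lazy quantifier makes the regex prefer the shortest unit, so
    "AAAAAAAA" is reported as (A)8 rather than (AA)4.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        hits.append(SSR(m.start(), motif, len(m.group(0)) // len(motif)))
    return hits


def canonical(motif: str) -> str:
    """Collapse rotations of a motif (CAG, AGC, GCA) to one representative."""
    return min(motif[i:] + motif[:i] for i in range(len(motif)))


def shared_motifs(orthologs: Dict[str, str]) -> Set[str]:
    """Return canonical motifs that occur as SSRs in every ortholog."""
    per_species = [
        {canonical(s.motif) for s in find_perfect_ssrs(seq)}
        for seq in orthologs.values()
    ]
    return set.intersection(*per_species) if per_species else set()


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    toy = {
        "human": "TTGACAGCAGCAGCAGCAGTTAC",
        "mouse": "GGAGCAGCAGCAGCAGCTT",
    }
    print(find_perfect_ssrs(toy["human"]))
    print(shared_motifs(toy))
```

This sketch ignores imperfect repeats and the reverse-complement strand, and it compares motifs only by identity rather than by position within aligned orthologous genes; a real comparative pipeline would need to address all three.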

Author information

Authors and Affiliations

Department of Computer Science and Engineering, National Taiwan Ocean University, Keelung, Taiwan
Chienming Chen, Chihchia Chen, Tsanhuang Shih & Tunwen Pai
Institute of Bioscience and Biotechnology, National Taiwan Ocean University, Keelung, Taiwan
Chinhua Hu & Wenshyong Tzou

Corresponding author

Correspondence to Tunwen Pai.

Additional information

This research was supported by the Center for Marine Bioenvironment and Biotechnology (CMBB) at National Taiwan Ocean University, Keelung, Taiwan, and by the National Science Council in Taiwan (NSC97-2627-B-019-003).

About this article

Cite this article

Chen, C., Chen, C., Shih, T. et al. Efficient algorithms for identifying orthologous simple sequence repeats of disease genes. J Syst Sci Complex 23, 906–916 (2010). https://doi.org/10.1007/s11424-010-0203-2

Received: 02 December 2009
Revised: 16 March 2010
Published: 09 November 2010
Issue Date: October 2010
DOI: https://doi.org/10.1007/s11424-010-0203-2
