Abstract
Identifying disease genes within candidate regions is one of the most important tasks in bioinformatics research. Among the approaches reported recently, methods based on sequence characteristics have the widest range of application. However, their accuracy is usually low, because they consider only the overall differences between disease and non-disease genes rather than disease-specific characteristics. To tackle this problem, the statistical characteristics of the protein sequences of disease genes and non-disease genes were analyzed. The analysis showed that a gene's amino acid usage is similar to that of genes responsible for the same disease but markedly different from that of other genes. An algorithm based on these amino acid usage characteristics was developed, and cross-validation was performed on a set of 208 genes involved in 55 diseases with significant amino acid usage characteristics. In this test, 15.4% of target genes ranked first, and 44.2% of target genes ranked within the top 5%. For diseases with significant amino acid usage characteristics, this approach showed promising performance compared to other methods.
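The general idea of ranking candidates by amino acid usage can be illustrated with a minimal sketch. The abstract does not specify the scoring function, so everything below is an assumption made for illustration: the helper names (`usage`, `mean_profile`, `rank_candidates`) are hypothetical, the disease profile is taken as the mean usage vector of the known disease genes, and similarity is measured by plain Euclidean distance. It is not the authors' algorithm.

```python
from collections import Counter
import math

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def usage(seq):
    """Return the relative frequency of each standard amino acid in seq."""
    counts = Counter(c for c in seq.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]


def mean_profile(seqs):
    """Average the usage vectors of several sequences (assumed disease profile)."""
    vectors = [usage(s) for s in seqs]
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def distance(u, v):
    """Euclidean distance between two usage vectors (an assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def rank_candidates(known_disease_seqs, candidates):
    """Order candidate gene names from most to least similar to the profile."""
    profile = mean_profile(known_disease_seqs)
    scored = [(distance(usage(seq), profile), name)
              for name, seq in candidates.items()]
    return [name for _, name in sorted(scored)]


# Toy sequences, invented purely for illustration.
known = ["AAAAKKKK", "AAAKKKKA"]
candidates = {"geneX": "AAKKAAKK", "geneY": "WWWWFFFF", "geneZ": "AAKKWWFF"}
print(rank_candidates(known, candidates))
```

In a cross-validation of the kind the abstract describes, each known disease gene would in turn be held out, mixed with other candidates, and ranked against the profile built from the remaining genes for the same disease; the fraction of held-out genes ranked first or within the top 5% then summarizes performance.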
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yuan, F., Li, J., Li, L. (2012). Indentifying Disease Genes Using Disease-Specific Amino Acid Usage. In: Huang, DS., Gan, Y., Premaratne, P., Han, K. (eds) Bio-Inspired Computing and Applications. ICIC 2011. Lecture Notes in Computer Science, vol 6840. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24553-4_63
DOI: https://doi.org/10.1007/978-3-642-24553-4_63
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24552-7
Online ISBN: 978-3-642-24553-4
eBook Packages: Computer Science, Computer Science (R0)