Abstract
Orthologous groups are useful in genome annotation, studies of gene evolution, and comparative genomics. However, their construction is difficult to automate and becomes increasingly time-consuming as the number of genome sequences grows. Furthermore, it is not easy to guarantee the accuracy of automatically constructed orthologous groups. We propose an automatic orthologous group construction system for a large number of genomes. A hybrid grid computing system consisting of 40 PCs has been devised for fast construction of orthologous groups from a large number of genome sequences. The grid system constructs orthologous groups for 89 complete prokaryotic genomes in just one week, whereas the same task takes 8 months on a single computer. Furthermore, the system offers good extensibility for adding new genomes to existing orthologous groups. In an experiment, more than 85% of the constructed orthologous groups coincide with those of KO (KEGG Orthology) and COGs (Clusters of Orthologous Groups of proteins). Note that KO and COGs were constructed manually or semi-automatically, at the expense of extensibility to newly completed genomes.
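The timing figures above imply a speedup of roughly 35-fold on 40 PCs. The following minimal sketch works through that arithmetic and illustrates how pairwise genome comparisons could be spread over the nodes. It rests on assumptions not stated in the abstract: that a month averages about 4.35 weeks, that the workload decomposes into one task per unordered genome pair, and that tasks are assigned round-robin. All function names are hypothetical and do not describe the actual scheduler of the system.

```python
from itertools import combinations

# Figures quoted in the abstract.
NUM_GENOMES = 89      # complete prokaryotic genomes
NUM_NODES = 40        # PCs in the hybrid grid
GRID_WEEKS = 1.0      # reported grid run time
SINGLE_MONTHS = 8.0   # reported single-computer run time

# Assumption: an average month is 365.25 / 12 days long.
WEEKS_PER_MONTH = 365.25 / 12 / 7


def speedup_and_efficiency(single_weeks, grid_weeks, nodes):
    """Return (speedup, parallel efficiency) for the given run times."""
    speedup = single_weeks / grid_weeks
    return speedup, speedup / nodes


def genome_pairs(n):
    """Assumption: one comparison task per unordered pair of genomes."""
    return list(combinations(range(n), 2))


def round_robin(tasks, nodes):
    """Hypothetical scheduler: deal tasks to nodes in turn."""
    buckets = [[] for _ in range(nodes)]
    for i, task in enumerate(tasks):
        buckets[i % nodes].append(task)
    return buckets


if __name__ == "__main__":
    single_weeks = SINGLE_MONTHS * WEEKS_PER_MONTH
    speedup, efficiency = speedup_and_efficiency(
        single_weeks, GRID_WEEKS, NUM_NODES
    )
    print(f"speedup: {speedup:.1f}x, efficiency: {efficiency:.0%}")

    tasks = genome_pairs(NUM_GENOMES)
    buckets = round_robin(tasks, NUM_NODES)
    sizes = [len(b) for b in buckets]
    print(f"{len(tasks)} pair tasks, {min(sizes)}-{max(sizes)} per node")
```

Under these assumptions the per-node load differs by at most one task, so the near-linear speedup reported above is plausible only if individual comparisons take similar time; genomes of very different sizes would call for a load-aware assignment instead.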
This research was supported by the Program for the Training of Graduate Students in Regional Innovation, which was conducted by the Ministry of Commerce, Industry and Energy of the Korean Government.
This work was supported by the Regional Research Centers Program of the Ministry of Education & Human Resources Development in Korea.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kim, TK., Kim, KR., Oh, SK., Lee, JH., Cho, WS. (2006). A Hybrid Grid and Its Application to Orthologous Groups Clustering. In: Berthold, M.R., Glen, R.C., Fischer, I. (eds) Computational Life Sciences II. CompLife 2006. Lecture Notes in Computer Science, vol 4216. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11875741_2
DOI: https://doi.org/10.1007/11875741_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45767-1
Online ISBN: 978-3-540-45768-8
eBook Packages: Computer Science, Computer Science (R0)