A Hybrid Grid and Its Application to Orthologous Groups Clustering

  • Conference paper
Computational Life Sciences II (CompLife 2006)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4216)

Abstract

Orthologous groups are useful in genome annotation, studies of gene evolution, and comparative genomics. However, their construction is difficult to automate and takes increasingly long as the number of genome sequences grows. Furthermore, it is not easy to guarantee the accuracy of automatically constructed orthologous groups. We propose an automatic orthologous group construction system for a large number of genomes. A hybrid grid computer system consisting of 40 PCs has been devised for fast construction of orthologous groups from a large number of genome sequences. The grid system constructs orthologous groups for 89 complete prokaryotic genomes in just one week, whereas the same task takes 8 months on a single computer system. The system is also readily extensible, allowing new genomes to be incorporated into the existing orthologous groups. In an experiment on real data, more than 85% of the constructed orthologous groups coincide with those of KO (KEGG Orthology) and COGs (Clusters of Orthologous Groups of proteins). Note that KO and COGs were constructed manually or semi-automatically, at the expense of extensibility to newly completed genomes.
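The runtime figures in the abstract imply a rough speedup and parallel efficiency for the 40-PC grid. The short sketch below works through that arithmetic. It assumes a 30-day month and a 7-day week, and the function name `speedup_and_efficiency` is a hypothetical helper introduced here for illustration; the paper itself does not report these derived values.

```python
def speedup_and_efficiency(serial_days, parallel_days, workers):
    """Return (speedup, parallel efficiency) from two wall-clock times."""
    speedup = serial_days / parallel_days
    efficiency = speedup / workers
    return speedup, efficiency


# Assumption: one month is taken as 30 days; the abstract only says "8 months".
DAYS_PER_MONTH = 30
serial_days = 8 * DAYS_PER_MONTH   # single-computer run time from the abstract
parallel_days = 7                  # "a week" on the grid
workers = 40                       # number of PCs in the hybrid grid

speedup, efficiency = speedup_and_efficiency(serial_days, parallel_days, workers)
print(f"speedup ~ {speedup:.1f}x, efficiency ~ {efficiency:.0%}")
```

Under these assumptions the grid is roughly 34 times faster than a single machine, which corresponds to about 86% parallel efficiency across 40 PCs. The estimate is only as precise as the rounded durations quoted in the abstract.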

This research was supported by the Program for the Training of Graduate Student in Regional Innovation which was conducted by the Ministry of Commerce, Industry and Energy of Korea Government.

This work was supported by the Regional Research Centers Program of the Ministry of Education & Human Resources Development in Korea.

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kim, TK., Kim, KR., Oh, SK., Lee, JH., Cho, WS. (2006). A Hybrid Grid and Its Application to Orthologous Groups Clustering. In: Berthold, M.R., Glen, R.C., Fischer, I. (eds) Computational Life Sciences II. CompLife 2006. Lecture Notes in Computer Science, vol 4216. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11875741_2

  • DOI: https://doi.org/10.1007/11875741_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-45767-1

  • Online ISBN: 978-3-540-45768-8

  • eBook Packages: Computer Science (R0)
