Abstract
We present kGC, a simple method for obtaining groups of homologous genes across multiple (k) organisms. It takes all-against-all BLASTP comparisons as input and produces groups of homologous sequences as output. The algorithm is based on identifying maximal cliques in graphs of sequences and paralogous groups. We applied the method to six complete actinobacterial genomes and compared the Pfam classification of the resulting homologous groups with that of the groups produced by OrthoMCL. Although kGC is simpler, it produced similar results with respect to Pfam classification and ran in reasonable time.
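To make the clique idea concrete, the following minimal sketch enumerates maximal cliques in a small similarity graph using the basic Bron–Kerbosch procedure. It is not the kGC algorithm itself, which also works with paralogous groups; it only illustrates the underlying graph operation. The function names (`build_graph`, `maximal_cliques`), the `(query, subject, evalue)` hit format, the `1e-5` E-value cutoff, and the rule that an edge requires hits in both directions are all assumptions made for this example.

```python
from collections import defaultdict


def build_graph(hits, max_evalue=1e-5):
    """Build an undirected graph from (query, subject, evalue) tuples.

    Assumption for this sketch: two sequences are linked only when each
    one hits the other with an E-value at or below max_evalue.
    """
    passed = {(q, s) for q, s, e in hits if q != s and e <= max_evalue}
    graph = defaultdict(set)
    for q, s in passed:
        if (s, q) in passed:  # keep reciprocal hits only
            graph[q].add(s)
            graph[s].add(q)
    return graph


def maximal_cliques(graph):
    """Return all maximal cliques (basic Bron-Kerbosch, no pivoting)."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidate vertices, x: already explored
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(graph), set())
    return cliques


# Hypothetical hits among genes a1, b1 (two genomes) and c1, c2 (a third).
HITS = [
    ("a1", "b1", 1e-40), ("b1", "a1", 1e-38),
    ("a1", "c1", 1e-30), ("c1", "a1", 1e-31),
    ("b1", "c1", 1e-25), ("c1", "b1", 1e-27),
    ("c1", "c2", 1e-50), ("c2", "c1", 1e-50),
    ("a1", "c2", 1e-6),   # not reciprocal, so no edge
    ("b1", "c2", 0.5),    # fails the E-value cutoff
]

cliques = maximal_cliques(build_graph(HITS))
for clique in sorted(cliques, key=lambda c: sorted(c)):
    print(sorted(clique))
```

In this toy input, a1, b1 and c1 are mutually linked and form one group, while c2 is linked only to c1. Requiring a full clique is a stricter grouping criterion than connected components, which would merge all four sequences into a single group.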
References
Bateman, A., Coin, L., Durbin, R., Finn, R.D.: The Pfam protein families database. Nucleic Acids Res. 32, D138–D141 (2004)
Alexeyenko, A., Tamas, I., et al.: Automatic clustering of orthologs and inparalogs shared by multiple proteomes. Bioinform. 22, e9–e15 (2006)
Almeida, N.F.: Tools for genome comparison. PhD thesis, Instituto de Computação–Unicamp (2002) (in Portuguese)
Anjos, D.A.S., Zerlotini, G., et al.: A method for inferring biological functions using homologous genes among three genomes. In: Sagot, M.-F., Walter, M.E.M.T. (eds.) BSB 2007. LNCS (LNBI), vol. 4643, pp. 69–80. Springer, Heidelberg (2007)
Bron, C., Kerbosch, J.: Algorithm 457: finding all cliques of an undirected graph. Comm. of the ACM 16, 575–577 (1973)
Cannon, S.B., Young, N.B.: Orthoparamap: Distinguishing orthologs from paralogs by integrating comparative genome data and gene phylogenies. BioMed Central Bioinform. 4, 35 (2003)
Chen, F., Mackey, A.J., et al.: Assessing performance of orthology detection strategies applied to eukaryotic genomes. PLoS ONE 2(4), e383 (2007)
Kellis, M., Patterson, N., Birren, B., Berger, B., Lander, E.S.: Methods in comparative genomics: genome correspondence, gene identification and motif discovery. J. Comput. Biol. 11(2-3), 319–355 (2004)
Lee, Y., Sultana, R., et al.: Cross-referencing eukaryotic genomes: TIGR orthologous gene alignments (TOGA). Genome Res. 12(3), 493–502 (2002)
Li, L., Stoeckert Jr., C.J., Roos, D.S.: OrthoMCL: Identification of ortholog groups for eukaryotic genomes. Genome Res. 13(9), 2178–2189 (2003)
Remm, M., Storm, C.E., Sonnhammer, E.: Automatic clustering of orthologs and in-paralogs from pairwise species comparisons. Journal of Molecular Biology 314, 1041–1052 (2001)
Tatusov, R.L., Fedorova, N.D., et al.: The COG database: an updated version includes eukaryotes. BMC Bioinform. 4, 41 (2003)
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Telles, G.P., Almeida, N.F., Brigido, M.M., Alvarez, P.A., Walter, M.E. (2011). kGC: Finding Groups of Homologous Genes across Multiple Genomes. In: Norberto de Souza, O., Telles, G.P., Palakal, M. (eds) Advances in Bioinformatics and Computational Biology. BSB 2011. Lecture Notes in Computer Science, vol 6832. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22825-4_12
DOI: https://doi.org/10.1007/978-3-642-22825-4_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22824-7
Online ISBN: 978-3-642-22825-4
eBook Packages: Computer Science (R0)