ABSTRACT
We describe a novel analytical approach to gene recognition based on cross-species comparison. We first compared orthologous genomic loci from human and mouse, studying the extent of similarity in the number, size and sequence of exons and introns. We then developed an approach for recognizing genes within such orthologous regions: the regions are first aligned using an iterative global alignment system, and genes are then identified based on conservation of exonic features at aligned positions in both species. The alignment and gene recognition are performed by new programs called GLASS and ROSETTA, respectively. ROSETTA performed well at exact identification of coding exons in the 117 orthologous pairs tested.
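The idea of looking for conservation at aligned positions can be illustrated with a minimal sketch. This is not the GLASS or ROSETTA algorithm, whose details are not given in the abstract. It assumes a pairwise alignment is already available as two equal-length gapped strings, and it only reports gap-free blocks whose sequence identity exceeds a threshold. The function name `conserved_blocks` and the parameters `min_len` and `min_identity` are hypothetical choices made for this illustration.

```python
from typing import List, Tuple


def conserved_blocks(
    aln_human: str,
    aln_mouse: str,
    min_len: int = 6,
    min_identity: float = 0.8,
) -> List[Tuple[int, int, float]]:
    """Return (start, end, identity) for gap-free alignment blocks.

    Illustrative only: a block is a maximal run of alignment columns in
    which neither sequence has a gap ('-'). A block is kept if it spans
    at least `min_len` columns and the fraction of identical columns is
    at least `min_identity`. Coordinates are alignment columns, with
    `end` exclusive.
    """
    if len(aln_human) != len(aln_mouse):
        raise ValueError("aligned sequences must have equal length")

    blocks: List[Tuple[int, int, float]] = []
    start = None  # first column of the block currently being scanned
    matches = 0   # identical columns seen in the current block

    # A sentinel gap column at the end closes any block still open.
    for i, (h, m) in enumerate(zip(aln_human + "-", aln_mouse + "-")):
        if h == "-" or m == "-":
            if start is not None:
                length = i - start
                identity = matches / length
                if length >= min_len and identity >= min_identity:
                    blocks.append((start, i, identity))
                start = None
                matches = 0
        else:
            if start is None:
                start = i
            if h == m:
                matches += 1
    return blocks


if __name__ == "__main__":
    # Toy alignment invented for this example, not real genomic data.
    human = "ATGGCCAAG---TTTGA"
    mouse = "ATGGCTAAGCCCTT--A"
    for start, end, identity in conserved_blocks(human, mouse):
        print(start, end, round(identity, 3))
```

A real exon recognizer would also need to score splice signals, reading frame and codon usage in both species. The sketch covers only the conservation filter, which is the simplest part of the approach to show in isolation.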
- Human and mouse gene structure: comparative analysis and application to exon prediction