Abstract
The annotation of transcription factor binding sites in newly sequenced genomes is an important and challenging problem. We have previously shown how a regression model that linearly relates gene expression levels to the matching scores of nucleotide patterns allows us to identify DNA-binding sites from a collection of co-regulated genes and their nearby non-coding DNA sequences. Our methodology uses Bayesian models and stochastic search techniques to select transcription factor binding site candidates. Here we show that this methodology also allows us to identify binding sites in closely related species. We present examples of annotation transfer from Schizosaccharomyces pombe to Schizosaccharomyces japonicus. We found that the eng1 motif also regulates a set of 9 genes in S. japonicus. Our framework may therefore be useful for transferring information during the annotation of a new species. Finally, we discuss a number of statistical and biological issues related to the identification of binding sites using covariates derived from gene expression and sequences.
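The core idea of relating expression levels linearly to pattern matching scores can be illustrated with a small sketch. This is not the Bayesian variable selection and stochastic search procedure described in the abstract: it is a much simpler stand-in that scores each candidate pattern on its own with a one-variable least-squares fit. The matching score is assumed here to be a raw occurrence count, and the function names, gene names, sequences, and expression values are all hypothetical and invented for illustration.

```python
from itertools import product


def count_matches(sequence, pattern):
    """Count (possibly overlapping) occurrences of pattern in sequence."""
    width = len(pattern)
    return sum(
        1
        for i in range(len(sequence) - width + 1)
        if sequence[i:i + width] == pattern
    )


def r_squared(x, y):
    """R^2 of the simple least-squares fit y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        # A constant score (or constant expression) explains nothing.
        return 0.0
    return (sxy * sxy) / (sxx * syy)


def rank_patterns(upstream, expression, k=5):
    """Rank every k-mer by how well its match counts explain expression."""
    genes = sorted(upstream)
    y = [expression[g] for g in genes]
    ranked = []
    for letters in product("ACGT", repeat=k):
        pattern = "".join(letters)
        x = [count_matches(upstream[g], pattern) for g in genes]
        ranked.append((r_squared(x, y), pattern))
    ranked.sort(reverse=True)
    return ranked


# Toy upstream sequences and expression levels (invented, not real data).
upstream = {
    "geneA": "TTACGCGTTACGCGAA",
    "geneB": "TTTTACGCGTTTTTTT",
    "geneC": "TTTTTTTTTTTTTTTT",
    "geneD": "ACGCGACGCGACGCGT",
}
expression = {"geneA": 2.0, "geneB": 1.0, "geneC": 0.0, "geneD": 3.0}

best_score, best_pattern = rank_patterns(upstream, expression, k=5)[0]
print(best_pattern, round(best_score, 3))
```

A univariate screen like this ignores interactions between patterns and gives no measure of uncertainty, which is part of what a Bayesian selection over sets of patterns is meant to address.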
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Angelini, C., Cutillo, L., De Feis, I., van der Wath, R., Liò, P. (2007). Identifying Regulatory Sites Using Neighborhood Species. In: Marchiori, E., Moore, J.H., Rajapakse, J.C. (eds) Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics. EvoBIO 2007. Lecture Notes in Computer Science, vol 4447. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71783-6_1
DOI: https://doi.org/10.1007/978-3-540-71783-6_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71782-9
Online ISBN: 978-3-540-71783-6
eBook Packages: Computer Science, Computer Science (R0)