Abstract
Many studies have found that the sequence of the 5’ untranslated region (UTR) affects the translation rate of an mRNA, but the regulatory grammar that underpins this regulation remains elusive. Deep learning methods applied to massive sequencing datasets offer new approaches to motif discovery. However, existing work has focused on extracting sequence motifs from individual datasets, and these motifs may not generalise to other datasets, even from the same cell type. We hypothesise that motifs genuinely involved in controlling translation rate are those that can be extracted from diverse datasets generated by different experimental techniques. To reveal more generalisable cis-regulatory motifs for RNA translation, we develop a multi-task translation rate predictor, MTtrans, that integrates information from multiple datasets. Compared to single-task models, MTtrans reaches higher prediction accuracy on all the benchmarked datasets, which were generated by various experimental techniques. We show that features learnt from human samples are directly transferable to a yeast dataset, demonstrating the model's robustness in identifying evolutionarily conserved sequence motifs. Furthermore, our newly generated experimental data corroborated the effect of most of the motifs identified by MTtrans trained on multiple public datasets, further demonstrating the utility of MTtrans for discovering generalisable motifs. MTtrans effectively integrates biological insights from diverse experiments and allows robust extraction of translation-associated sequence motifs in 5’ UTRs.
Acknowledgements
We thank Dr. Sung Chul Kwon for his helpful suggestions on transcript filtering and Dr. Chen Qiao for model training. We also thank Xinyi Lin and Yiming Chao for their feedback on data visualisation and pipeline testing.
This work was supported in part by AIR@InnoHK, administered by the Innovation and Technology Commission, and by the National Natural Science Foundation of China Excellent Young Scientists Fund (32022089). The work is also supported by the Centre for Oncology and Immunology Limited under the Health@InnoHK Initiative funded by the Innovation and Technology Commission, The Government of Hong Kong SAR, China.
Availability of Data and Materials
The code to re-implement MTtrans can be accessed from https://github.com/holab-hku/MTtrans, and the FACS library is available from Gene Expression Omnibus (GEO) under accession GSE201766.
A Appendix
There is one additional file containing supplementary methods, supplementary Tables 1–2 and supplementary Figs. 1–8.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zheng, W. et al. (2023). Translation Rate Prediction and Regulatory Motif Discovery with Multi-task Learning. In: Tang, H. (eds) Research in Computational Molecular Biology. RECOMB 2023. Lecture Notes in Computer Science, vol 13976. Springer, Cham. https://doi.org/10.1007/978-3-031-29119-7_9
DOI: https://doi.org/10.1007/978-3-031-29119-7_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-29118-0
Online ISBN: 978-3-031-29119-7
eBook Packages: Computer Science, Computer Science (R0)