ABSTRACT
Accurate annotation of protein function is important for understanding life at the molecular level. Powerful high-throughput proteomics technologies now provide an unprecedented view of human biology and disease, and they are generating a deluge of protein sequences in public databases. A critical challenge in making sense of these sequences is assigning functional roles to newly discovered proteins. The approaches proposed to address this problem use a variety of biological information, such as amino acid sequence, gene expression, and protein-protein interaction. Meanwhile, deep learning has emerged as a major innovation of the last decade: it uses deep architectures to learn representations of high-level entities and builds an improved feature space. In this paper, we propose a deep neural network that classifies oxygen binding proteins from their amino acid composition, as a case of protein function prediction. Two alternatives are investigated: the first casts the problem as a multiclass classification problem, and the second as a binary classification problem. The approach is validated using the Keras platform, and the results obtained are promising and outperform other state-of-the-art results.
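As an illustration of the input representation and the two problem formulations mentioned above, the following minimal Python sketch computes an amino acid composition vector (the fraction of each of the 20 standard residues in a sequence) and encodes a class label either as a one-hot vector (multiclass) or as a 0/1 target (binary). The function names, the example sequence, and the class names are assumptions chosen for illustration; they are not taken from the paper, and the paper's actual preprocessing and class list may differ.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order that defines the feature layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Illustrative class names only (hypothetical, not the paper's class list).
CLASSES = ["hemoglobin", "myoglobin", "hemerythrin"]


def amino_acid_composition(sequence):
    """Return a 20-element list with the fraction of each standard residue."""
    seq = sequence.upper()
    # Ignore characters that are not standard residues (e.g. X, gaps).
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def one_hot(label, classes):
    """Multiclass formulation: one output unit per class."""
    return [1 if label == c else 0 for c in classes]


def binary_label(label, positive_class):
    """Binary formulation: one class against all the others."""
    return 1 if label == positive_class else 0


if __name__ == "__main__":
    features = amino_acid_composition("MVLSPADKTNVKAAW")  # made-up sequence
    print(len(features), round(sum(features), 6))
    print(one_hot("myoglobin", CLASSES))
    print(binary_label("myoglobin", "hemoglobin"))
```

A vector of this form could be fed to a dense network with a softmax output layer in the multiclass case or a single sigmoid unit in the binary case; which architecture the authors used is not stated in the abstract.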