Deep model-based feature extraction for predicting protein subcellular localizations from bio-images

Shao, Wei; Ding, Yi; Shen, Hong-Bin; Zhang, Daoqiang

doi:10.1007/s11704-017-6538-2

Research Article
Published: 02 March 2017

Frontiers of Computer Science, Volume 11, pages 243–252 (2017)

Abstract

Predicting the subcellular localization of proteins is important for studying protein function. With recent progress in microscopic imaging, automatically determining protein subcellular localization from bio-images has become an active research topic. A central question in this field is which features are suitable for describing protein images. Existing feature extraction methods are usually hand-crafted and extract only a single layer of features, which may not be sufficient to represent complex protein images. To address this, we propose a deep model-based descriptor (DMD) that extracts high-level features from protein images. Specifically, to make the extracted features more generic, we first train a convolutional neural network (CNN), namely AlexNet, on a natural image set with millions of labeled images, and then use a partial parameter transfer strategy to fine-tune the parameters from natural images to protein images. We then apply the Lasso model to select the most discriminative features from the last fully connected layer of the CNN and use the selected features for the final classification. Experimental results on a protein image dataset validate the efficacy of our method.
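The Lasso-based selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic feature matrix stands in for activations of the last fully connected layer, a single real-valued target stands in for the class label, and the names `lasso_coordinate_descent`, `select_features`, and the value `alpha=0.2` are hypothetical choices made only for illustration. The solver is plain cyclic coordinate descent written with the standard library.

```python
import random


def soft_threshold(value, threshold):
    """Shrink `value` toward zero by `threshold` (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iters=200):
    """Minimise (1/2n)*||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    residual = list(y)  # equals y - Xw, since w starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(d)]
    for _ in range(n_iters):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue  # an all-zero feature can never be selected
            # Correlation of feature j with the residual that excludes j's own contribution.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_wj = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_wj
    return w


def select_features(w, tol=1e-8):
    """Return the indices of features whose Lasso weight is non-zero."""
    return [j for j, wj in enumerate(w) if abs(wj) > tol]


# Synthetic stand-in for deep features (assumption): 200 samples, 10 features,
# of which only features 0 and 3 actually drive the target.
random.seed(0)
n_samples, n_features = 200, 10
X = [[random.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
y = [3.0 * row[0] - 2.0 * row[3] + random.gauss(0, 0.1) for row in X]

weights = lasso_coordinate_descent(X, y, alpha=0.2)
selected = select_features(weights)
print("selected feature indices:", selected)
```

In a pipeline like the one the abstract outlines, the selected indices would be used to slice the deep feature vectors before they are passed to the final classifier. A larger `alpha` drives more weights to exactly zero and so keeps fewer features.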

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 61422204, 61473149 and 61671288), Jiangsu Natural Science Foundation for Distinguished Young Scholar (BK20130034), and Science and Technology Commission of Shanghai Municipality (16JC1404300).

Author information

Authors and Affiliations

School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, 211106, China
Wei Shao, Yi Ding & Daoqiang Zhang
Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, 200240, China
Hong-Bin Shen

Corresponding author

Correspondence to Daoqiang Zhang.

Additional information

Wei Shao received the BS and MS degrees from Nanjing University of Technology, China in 2009 and 2012, respectively. He is currently working toward the PhD degree in computer science at Nanjing University of Aeronautics and Astronautics, China. His current research interest is bioinformatics.

Yi Ding received the BS degree in information security from Nanjing University of Aeronautics and Astronautics (NUAA), China in 2015. In the same year, he was admitted to the MS program at NUAA with an exemption from the entrance examination. He is currently a member of the PARNEC Group led by Songcan Chen and of the iBrain Group led by Daoqiang Zhang. His research interests mainly include machine learning and data mining.

Hong-Bin Shen received his PhD degree from Shanghai Jiao Tong University (SJTU), China in 2007. He was a postdoctoral research fellow at Harvard Medical School from 2007 to 2008, and a visiting professor at the University of Michigan, USA in 2012. Currently, he is a distinguished professor at the Institute of Image Processing and Pattern Recognition, SJTU. His research interests include pattern recognition and bioinformatics. He has published more than 100 journal papers and constructed 35 bioinformatics servers in these areas. He was named an ESI highly cited researcher in 2014 and 2015. He serves as an associate editor of BMC Bioinformatics and as an editorial board member of several international journals.

Daoqiang Zhang received the BS and PhD degrees in computer science from Nanjing University of Aeronautics and Astronautics (NUAA), China in 1999 and 2004, respectively. He joined the Department of Computer Science and Engineering of NUAA as a lecturer in 2004, and is currently a professor. His research interests include machine learning, pattern recognition, data mining, and medical image analysis. In these areas, he has published over 100 scientific articles in refereed international journals such as Neuroimage, Pattern Recognition, Artificial Intelligence in Medicine, and IEEE Trans. Neural Networks, and in conference proceedings such as IJCAI, AAAI, SDM, and ICDM.

Electronic supplementary material

Supplementary material, approximately 516 KB.

About this article

Cite this article

Shao, W., Ding, Y., Shen, HB. et al. Deep model-based feature extraction for predicting protein subcellular localizations from bio-images. Front. Comput. Sci. 11, 243–252 (2017). https://doi.org/10.1007/s11704-017-6538-2

Received: 14 November 2016
Accepted: 13 January 2017
Published: 02 March 2017
Issue Date: April 2017
DOI: https://doi.org/10.1007/s11704-017-6538-2
