Elsevier

Neurocomputing

Volume 375, 29 January 2020, Pages 71-79
DeepETC: A deep convolutional neural network architecture for investigating and classifying electron transport chain's complexes

https://doi.org/10.1016/j.neucom.2019.09.070

Abstract

An electron transport chain is a series of protein complexes embedded in the transport protein; it is an important process for transferring electrons and other macromolecules throughout the cell. It is the primary process for extracting energy via redox reactions, as in the oxidation of sugars during cellular respiration. According to their molecular functions, the components of the electron transport chain can be grouped into five complexes with several different electron carriers. The loss of a specific molecular function in the electron transport chain has been implicated in a variety of human diseases such as diabetes, neurodegenerative disorders, Parkinson's disease, and Alzheimer's disease. Therefore, creating a precise model to identify these functions is pertinent to understanding human diseases and designing drug targets. Previous bioinformatics studies have focused almost exclusively on electron transport proteins as a whole, without information on the five complexes. Here we present DeepETC, a deep learning model that uses a two-dimensional convolutional neural network and position-specific scoring matrix profiles to classify electron transport proteins into the five complexes. DeepETC classifies electron transporters with independent test accuracies of 99.6%, 99.7%, 99.7%, 99.1%, and 99.8% for complexes I, II, III, IV, and V, respectively. Our model performs significantly better than state-of-the-art traditional neural networks on all typical evaluation metrics. Through this study, we provide an effective tool for investigating electron transport proteins, and our results could promote the use of deep learning in bioinformatics and computational biology. DeepETC is freely accessible at http://www.biologydeep.com/deepetc/.

Introduction

Cellular respiration is a mechanism that creates adenosine triphosphate (ATP) and helps cells obtain energy from food molecules (e.g., sugar). To achieve this, cellular respiration uses a series of protein complexes, called the electron transport chain, to accumulate electrons [1]. Fig. 1 (adapted from [2]) illustrates the electron transport chain, a pathway that stores and transfers electrons in cellular respiration. The electron transport chain can be categorized into five complexes: complex I, II, III, IV, and V (ATP synthase). Each complex consists of different electron carriers and executes various molecular functions [3]. An electron is donated to complex I from NADH and passed sequentially to complexes II, III, IV, and V. During this movement, hydrogen ions (protons) are pumped across the membrane and water molecules (H2O) are released. Complex V uses the energy created by the pumping process to phosphorylate adenosine diphosphate (ADP) to ATP.

Numerous types of electron transport proteins have been identified in humans. A series of studies has indicated that many diseases involve a functional loss of specific complexes in the electron transport protein. For instance, in [4], all the Parkinson's disease patients had a significant reduction in complex I (NADH dehydrogenase) activity. From this result, it can be hypothesized that the complex I abnormality may have an etiological role in the pathogenesis of Parkinson's disease. Parker et al. [5] suggest that in Alzheimer's disease patients, there is a cytochrome c oxidase (complex IV) deficiency in the terminal portion of the electron transport chain and in the platelet mitochondria. Mutations in BCS1, which is an assembly factor for complex III, are associated with a syndrome called GRACILE (growth retardation, aminoaciduria, cholestasis, iron overload, lactic acidosis, and early death) [6]. The ubiquinone of complex III is regarded as a major site of reactive oxygen species generation, which plays a crucial role in the aging process and the pathogenesis of neurodegenerative diseases [7]. Complex IV of the electron transport chain has also been linked to the pathogenesis of diabetes mellitus [8]. Thus, classifying electron transport protein complexes to help biologists better understand their molecular functions in human diseases is an essential problem, and it motivates the development of bioinformatics techniques to resolve it.

Recently, many studies on electron transport proteins using computational techniques have been published. This popularity can be attributed to the fact that electron transport proteins play an essential role in cellular respiration, energy production, and human diseases. For instance, one of the most prominent resources is TCDB [9], a web-accessible, curated, relational database containing sequence, classification, structural, functional, and evolutionary information on transport systems, including electron transport proteins, from a variety of living organisms. Gromiha used machine learning techniques to discriminate electron transport proteins from membrane proteins according to their function [10]. Chen [11] divided the transport targets into four types, including electron transporters, for prediction and analysis, using amino acid composition (AAC), amino acid pair composition (DPC), and position-specific scoring matrix (PSSM) features. These three feature types were evaluated individually and in combination, with ten-fold cross-validation used for performance evaluation. Furthermore, Le et al. [12] used a deep learning framework with PSSM profiles for accurate identification of electron transport proteins.
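
For readers unfamiliar with the sequence-derived features named above, the following is a minimal sketch of how AAC and DPC can be computed from a protein sequence. The function names and the toy sequence are illustrative assumptions and are not taken from the cited studies. PSSM profiles are omitted because they require an external alignment search.

```python
from itertools import product

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(seq):
    """Return the fraction of each standard residue in seq (20 values)."""
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard residues")
    total = len(residues)
    return {aa: residues.count(aa) / total for aa in AMINO_ACIDS}


def dipeptide_composition(seq):
    """Return the fraction of each adjacent residue pair in seq (400 values)."""
    seq = seq.upper()
    # Keep only pairs in which both residues are standard amino acids.
    pairs = [
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in AMINO_ACIDS and seq[i + 1] in AMINO_ACIDS
    ]
    if not pairs:
        raise ValueError("sequence contains no standard residue pairs")
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    for pair in pairs:
        counts[pair] += 1
    return {pair: n / len(pairs) for pair, n in counts.items()}


if __name__ == "__main__":
    toy_sequence = "MKEECCAEK"  # hypothetical sequence, for illustration only
    aac = amino_acid_composition(toy_sequence)
    dpc = dipeptide_composition(toy_sequence)
    print(len(aac), len(dpc))
```

AAC yields a 20-dimensional vector and DPC a 400-dimensional vector regardless of sequence length, which is why both are convenient inputs for conventional classifiers.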

Previous studies can only be considered a first step toward a more profound understanding of electron transport proteins. A new approach is therefore needed to investigate the details of the electron transport protein complexes. Although previous work by Le et al. [13] investigated the molecular functions of the electron transport chain, it used a small dataset with a shallow neural network. Here we present DeepETC, a web server for classifying electron transport protein complexes using deep learning on a larger dataset. The idea of constructing a 2D convolutional neural network (CNN) from PSSM profiles has been presented in earlier works [12,14,15], and here we extend this approach with a different dataset and a more in-depth analysis. Our study makes several contributions to the field of biology: (1) a database that collects all the complexes of electron transport proteins; (2) the first computational model to classify electron transport proteins into their complexes using deep learning, which has been successfully applied in several biological applications with superior results [16,17,18,19]; (3) a benchmark dataset and newly discovered data for further study of the electron transport chain; and (4) a study that provides information to help biologists and researchers better understand electron transport protein structures and conduct future research.

Materials and methods

Most experiments have been carried out with a 2D CNN and PSSM profiles. There are four steps in our methodology: data collection, feature extraction, CNN implementation, and model evaluation. Our flowchart is illustrated in Fig. 2 and described in detail in the following paragraphs.
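
One practical detail when feeding PSSM profiles to a 2D CNN is that a PSSM has one row per residue, so its height varies with sequence length. The sketch below shows one way to obtain a fixed-size input: rows belonging to the same amino acid are summed, giving a 20 × 20 matrix, which is then divided by the sequence length. This excerpt does not state how DeepETC performs the conversion, so the aggregation scheme, the scaling, the assumed column order, and the name `pssm_to_fixed_matrix` are assumptions for illustration only.

```python
# Assumed residue order, matching the column order of PSI-BLAST ASCII PSSM files.
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"


def pssm_to_fixed_matrix(sequence, pssm):
    """Collapse an L x 20 PSSM into a 20 x 20 matrix.

    Row i of the result is the sum of all PSSM rows whose residue is
    AMINO_ACIDS[i], divided by the sequence length L.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    if len(sequence) != len(pssm):
        raise ValueError("PSSM must have one row per residue")

    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    matrix = [[0.0] * 20 for _ in range(20)]

    for residue, row in zip(sequence.upper(), pssm):
        if len(row) != 20:
            raise ValueError("each PSSM row must have 20 scores")
        i = index.get(residue)
        if i is None:
            continue  # skip non-standard residues such as 'X'
        for j, score in enumerate(row):
            matrix[i][j] += score

    length = len(sequence)
    return [[value / length for value in row] for row in matrix]


if __name__ == "__main__":
    # Hypothetical three-residue example with constant scores per row.
    toy_sequence = "AAR"
    toy_pssm = [[1] * 20, [3] * 20, [2] * 20]
    fixed = pssm_to_fixed_matrix(toy_sequence, toy_pssm)
    print(len(fixed), len(fixed[0]))
```

Because the output shape no longer depends on sequence length, proteins of different sizes can be batched together and passed to a 2D convolutional layer like single-channel images.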

Amino acid composition of five complexes in electron transport chain

We investigated the distribution of the amino acid composition and its variance among the five complexes of the electron transport chain. Fig. 3 shows the amino acids with the highest frequencies in the five distinct datasets. There were clearly a number of differences in amino acid composition among the five complexes of the electron transport chain. For instance, the amino acid E could be adopted for classifying complexes III and V; the amino acid C could

Conclusion

Deep learning has been increasingly used in different fields, especially in bioinformatics and computational biology, with significant results. Building on this development, this study presents DeepETC, a web server for storing and identifying the electron transport chain's complexes through the use of deep learning. DeepETC provides two primary functions: a database containing all the electron transport chain's information and a submission page containing deep learning models to classify

Declaration of Competing Interest

The authors have no conflicts of interest to disclose.

Acknowledgments

This research is partially supported by the Nanyang Technological University Start-Up Grant and the Ministry of Science and Technology, Taiwan, R.O.C. under Grant no. MOST 106-2221-E-155-068.

References (38)

  • A. Baldominos et al.

    Evolutionary convolutional neural networks: an application to handwriting recognition

    Neurocomputing

    (2018)
  • C. Lin et al.

    LibD3C: ensemble classifiers with a clustering and dynamic selection strategy

    Neurocomputing

    (2014)
  • B. Chance et al.

    The respiratory chain and oxidative phosphorylation

    Adv. Enzymol. Relat. Areas Mol. Biol.

    (1956)
  • C.H. Foyer et al.

    Relationships between antioxidant metabolism and carotenoids in the regulation of photosynthesis

    The Photochemistry of Carotenoids

    (1999)
  • W.D. Parker et al.

    Abnormalities of the electron transport chain in idiopathic Parkinson's disease

    Ann. Neurol.

    (1989)
  • W.D. Parker et al.

    Cytochrome oxidase deficiency in Alzheimer's disease

    Neurology

    (1990)
  • Y. Liu et al.

    Generation of reactive oxygen species by the mitochondrial electron transport chain

    J. Neurochem.

    (2002)
  • D.P. Friday et al.

    The impact of diabetes mellitus on oxygen utilization by complex IV: preliminary insights

    J. Endocrinol. Metab.

    (2017)
  • M.H. Saier et al.

    TCDB: the transporter classification database for membrane transport protein analyses and information

    Nucleic Acids Res.

    (2006)
Nguyen Quoc Khanh Le is an Assistant Professor with the Professional Master Program in Artificial Intelligence in Medicine, Taipei Medical University, Taiwan. He received his MS and PhD degrees from the Department of Computer Science and Engineering, Graduate Program in Biomedical Informatics, Yuan Ze University, Taiwan. After obtaining his PhD degree, he worked as a Research Fellow at the School of Humanities, Nanyang Technological University, Singapore. His research interests are Artificial Intelligence, Machine Learning, Data Analyst, Bioinformatics, Computational Biology, and Healthcare Informatics.

Quang-Thai Ho received his MS degree from the Department of Computer Science and Engineering, Graduate Program in Biomedical Informatics, Yuan Ze University, Taiwan. He is currently pursuing his PhD and working as a Research Assistant in the same department. His research interests are Artificial Intelligence, Machine Learning, Deep Learning, Computer Vision, Big Data Analytics, Web Development, Bioinformatics, and Computational Biology.

Edward Kien Yee Yapp is a Research Fellow at the Singapore Institute of Manufacturing Technology. His interests are in combustion and artificial intelligence. His current research focuses on applications of machine learning in manufacturing. He is the author and coauthor of a number of papers in scholarly journals, including “A fully coupled simulation of PAH and soot growth with a population balance model” in Proceedings of the Combustion Institute, “Numerical simulation and parametric sensitivity study of particle size distributions in a burner-stabilized stagnation flame” in Combustion and Flame, “The polarization of polycyclic aromatic hydrocarbons curved by pentagon incorporation: the role of the flexoelectric dipole” in Journal of Physical Chemistry C, and “A moment projection method for population balance dynamics with a shrinkage term” in Journal of Computational Physics.

Yu-Yen Ou is an Associate Professor in the Department of Computer Science and Engineering, Graduate Program in Biomedical Informatics, Yuan Ze University, Taiwan. He received the B.S. degree from the Department of Math and Computer Science Education, Taipei Municipal Teachers College, and the Ph.D. degree from the Department of Computer Science and Information Engineering, National Taiwan University, Taiwan. His fields of professional interest are Bioinformatics, Machine Learning, and Data Mining.

Hui-Yuan Yeh is an Assistant Professor in the Medical Humanities Research Cluster at the School of Humanities, Nanyang Technological University, Singapore. She received her PhD degree from the Department of Biological Anthropology, University of Cambridge. Her research interests lie in Human Evolution and its intersection with Bioinformatics and Machine Learning.