Discovering driver nodes in chronic kidney disease-related networks using Trader as a newly developed algorithm

https://doi.org/10.1016/j.compbiomed.2022.105892

Highlights

  • A novel optimization algorithm named Trader was used to spot driver nodes in the chronic kidney disease-related PPI networks.

  • The Trader algorithm outperformed four other well-known algorithms by producing more disjoint sets in the networks.

  • Trader successfully identified a list of potential therapeutic targets in the kidney disease-related networks.

Abstract

Thanks to advances in computational biology, a huge volume of disease-related data has been generated. Among the existing data, disease-related protein-protein interaction (PPI) networks are promising sources of effective treatment plans because they represent diseases in an informative, systematic way. Yet many previous studies have failed because of the complex nature of such networks. To address this limitation, in the present study we combined the Trader and depth-first search (DFS) algorithms to identify a minimal subset of nodes (driver nodes) whose removal produces the maximum number of disjoint sub-networks. We then screened the nodes in the disease-associated PPI networks. To evaluate the efficiency of the suggested method, we applied it to six PPI networks of differentially expressed genes in chronic kidney diseases. Trader outperformed other well-known algorithms in identifying driver nodes. In addition, the proportion of proteins targeted by at least one FDA-approved drug was significantly higher among the identified driver nodes than among the rest of the proteins in the networks. The proposed algorithm could be applied to predict future therapeutic targets in complex disorder networks. In conclusion, unlike the common methods, computationally efficient algorithms can generate more practical outcomes that are compatible with real-world biological facts.

Introduction

Identifying novel and efficient therapeutic targets is a pressing need for clinical pharmaceutics. In this regard, predicting the key molecules in disease conditions is not only crucial for life science research but may also help develop novel medicines [[1], [2], [3]]. Experimental approaches such as gene knockout and RNA interference (RNAi) have conventionally been used to identify such molecules [4]. However, thanks to high-resolution analytical 'omics' platforms, researchers can now access big biological data associated with various disease conditions and identify the key molecules [5,6]. Big biological data can be represented as networks, that is, as a complicated series of binary interactions or relationships between different biological elements [7,8]. When high-throughput network data are combined with systems biology, machine learning, and other bio-computing tools, the initial search space for predicting key molecules can be narrowed considerably. As a result, the identified molecules can be considered therapeutic targets, since they are believed to play a central role in regulating other nodes and governing the pathological pathways [[9], [10], [11], [12], [13]]. According to the so-called “centrality-lethality rule”, protein-protein interaction (PPI) networks contain only a limited number of highly linked nodes (known as hubs), whose deletion is more likely to be lethal for the system [1,[14], [15], [16], [17], [18]]. However, such hub molecules can also be identified via strategies other than centrality measures. Generally, the main biological idea behind identifying the hub molecules in disease-related networks is to knock them out in experimental research or to target them with small chemical/non-chemical molecules in therapeutic strategies.
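As an illustration of the hub notion behind the centrality-lethality rule, the following minimal Python sketch ranks the nodes of an undirected network by degree. The function name `degree_ranking` and the toy edge list are hypothetical and are not taken from the study; degree is only one of several possible centrality measures.

```python
from collections import defaultdict


def degree_ranking(edges):
    """Return (node, degree) pairs sorted by decreasing degree, ties by name."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        neighbors[u].add(v)
        neighbors[v].add(u)
    return sorted(
        ((node, len(adjacent)) for node, adjacent in neighbors.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Hypothetical toy network; the protein names are placeholders, not study data.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P4", "P5")]
ranking = degree_ranking(toy_edges)
print(ranking[:2])  # the two most highly linked nodes
```

In this toy network, `P1` has the most interaction partners and would be treated as the hub candidate.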

So far, various computational studies have been conducted to identify the key molecules in biological networks. These prior investigations can be divided into four main categories:

  • A)

    Network topology-based studies; these approaches focus on the features of a given disease-associated network (e.g., the degree, closeness, and betweenness of nodes) and nominate key molecules that can control all the other nodes in the network [[19], [20], [21]]. However, erroneous networks may degrade the performance of these techniques. To deal with this limitation, one study proposed an algorithm named WDNFinder, which combines the structural properties of a network with the strength of its nodes. To evaluate the method, the researchers applied it to the human cancer signaling and P53-mediated DNA damage response networks. The identified key molecules were consistent with real-world biological facts [22].

  • B)

    Biological information-based studies; since the topology-based methods overlook biological information and focus only on network parameters, they may suffer from high false-positive prediction rates. Therefore, integrating network topology with biological information can yield more practical outcomes [23]. In this regard, Li et al. proposed a two-step algorithm for identifying the key molecules and examined its performance on PPI networks. First, the authors specified the critical nodes of a PPI network using the network topology. Second, they analyzed the discovered key molecules against gene expression data and filtered them based on biological facts. The findings showed that combining topological data with biological information outperformed the other methods in terms of prediction precision/accuracy [24].

  • C)

    Controllability-based studies; such studies aim to discover a minimum number of nodes that can transfer the signal/information of a node of interest to the other nodes in the network [[25], [26], [27], [28]]. The controllability concept addresses a restriction that some previous studies overlooked: a key molecule can control only one of the nodes located at the same distance from it [25]. Building on this concept, Ebrahimi et al. introduced a novel mathematical method for selecting the key molecules from a network of interest by minimizing the number of key molecules and the number of mediators at the same time [7,8,29,30]. The researchers designed their method according to Kalman's controllability rank condition, under which a network is controllable only if its related matrix is full rank.

  • D)

    Machine learning-based studies; in the last category of the literature, researchers employed machine learning approaches, especially deep learning techniques, to further improve the identification of key molecules [31,32]. For instance, from a theoretical perspective, Muzio et al. described how different machine learning-based approaches can be employed to analyze biological networks. To identify the key molecules, a candidate solution was built on a k-layer graph convolutional network (GCN) in which every node aggregates the input signals received from the previous layer and passes the calculated value to an activation function. Then, all the nodes of a graph of interest were classified, and the biological functions of every node were specified. Next, through additional analysis of the classes and their related data, the nodes that could control the remaining nodes of the network were introduced as possible druggable targets. The technique can also be applied to divide a given graph into smaller parts (graph classification) and to identify the key molecules using the described method [33].
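To make the rank condition mentioned under category C concrete, the sketch below checks Kalman's condition for a linear system x(t+1) = A x(t) + B u(t) with n nodes: the system is controllable when the matrix [B, AB, ..., A^(n-1)B] has rank n. This is a generic textbook check written under stated assumptions, not the method of Ebrahimi et al.; the helper names (`mat_mul`, `matrix_rank`, `is_controllable`) and the three-node chain are hypothetical.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


def matrix_rank(M):
    """Rank by Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in M]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        pivot = next((i for i in range(rank, n_rows) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(rank + 1, n_rows):
            factor = rows[i][col] / rows[rank][col]
            if factor != 0:
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rank])]
        rank += 1
        if rank == n_rows:
            break
    return rank


def is_controllable(A, B):
    """Kalman rank condition: rank([B, AB, ..., A^(n-1)B]) == n."""
    n = len(A)
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(mat_mul(A, blocks[-1]))
    # Concatenate the blocks column-wise into the controllability matrix.
    controllability = [sum((block[i] for block in blocks), []) for i in range(n)]
    return matrix_rank(controllability) == n


# Hypothetical directed chain 1 -> 2 -> 3; A[i][j] = 1 means node j+1 acts on node i+1.
chain = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]
drive_first = [[1], [0], [0]]  # input signal injected at node 1
drive_last = [[0], [0], [1]]   # input signal injected at node 3
print(is_controllable(chain, drive_first), is_controllable(chain, drive_last))
```

Driving the head of the chain reaches every node, so the rank condition holds; driving only the tail leaves the upstream nodes unreachable and the condition fails.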

Chronic kidney diseases (CKDs) are defined as impairments of kidney function that persist for more than three months and result in various pathological states [34,35]. According to reports, the global prevalence of CKD is estimated at around 10%, and it is the 12th leading cause of mortality, responsible for 1.1 million deaths across the world [36]. Although many risk factors are modifiable, preventing the progression of CKD remains challenging. Patient compliance, racial inequality in treatment, and the lack of effective drug therapies are just a few of the challenges [37]. Therefore, prompt action is needed to develop novel treatments that can efficiently prevent or delay the progression of CKDs and their undesirable complications. In this regard, identifying the key molecules in the CKD-related biological networks could be valuable.

In the present study, we applied our previously introduced optimization algorithm, Trader, to identify the key molecules as driver nodes (DNs) in the CKD-related networks, and merged it with the depth-first search (DFS) algorithm to score the generated candidate solutions [[38], [39], [40], [41]]. Here, the DNs are defined as a minimal subset of nodes whose removal from the network produces the maximum number of disjoint sub-networks. We also compared the performance of Trader with that of four popular optimization algorithms [[42], [43], [44], [45], [46], [47]]. Then, for the first time, we proposed a novel concept for selecting the DNs whose removal generated more disjoint sets. Furthermore, we assessed the PPI networks of differentially expressed genes associated with different CKDs using the Trader algorithm to identify the potential DNs in each network. Finally, we checked the druggability of the discovered DNs and introduced the currently known potential targets.
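The DFS-based scoring idea can be sketched as follows: remove a candidate set of nodes and count the disjoint sub-networks that remain. This is a minimal illustration, assuming an undirected network stored as an adjacency dictionary. The names `build_adjacency` and `count_components` and the toy network are hypothetical. The Trader search itself, and the way the study weighs subset size against the number of sub-networks, are not described in this excerpt and are not reproduced here.

```python
def build_adjacency(edges):
    """Build an undirected adjacency dictionary from an edge list."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def count_components(adjacency, removed):
    """Count connected components among nodes not in `removed` (iterative DFS)."""
    removed = set(removed)
    visited = set()
    components = 0
    for start in adjacency:
        if start in removed or start in visited:
            continue
        components += 1  # a new, previously unreached sub-network
        visited.add(start)
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbor in adjacency[node]:
                if neighbor not in removed and neighbor not in visited:
                    visited.add(neighbor)
                    stack.append(neighbor)
    return components


# Hypothetical toy network: hub H linked to A, B, and C, plus the edge C-D.
toy_adjacency = build_adjacency([("H", "A"), ("H", "B"), ("H", "C"), ("C", "D")])
for candidate in [set(), {"C"}, {"H"}]:
    print(sorted(candidate), count_components(toy_adjacency, candidate))
```

In this toy case, removing `H` splits the network into three pieces while removing `C` yields only two, so a search that prefers small sets producing many sub-networks would favor `H`.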

Section snippets

Dataset selection, quality control, analysis, and construction of PPI networks

The selected datasets included samples from diabetic nephropathy (DN), immunoglobulin A nephropathy (IgAN), focal and segmental glomerulosclerosis (FSGS), membranous glomerulonephritis (MGN), minimal change disease (MCD), and hypertensive nephropathy (HN). Principal component analysis (PCA) was utilized to check the quality of the selected datasets. This computing process, which is sensitive to the poor-preprocessing analyses and differences between the conditions, is applicable as a valuable

Dataset analysis

All the necessary information about the selected datasets is presented in Table 1. Among the available CKD-related microarray datasets, six datasets passed the quality assessment tests and were selected for the analysis. The PCA plots of all the datasets are shown in Fig. 7. After the analysis step, the differentially expressed genes (DEGs) associated with each dataset were used for PPI network construction. The analysis results are presented in supplementary file 1 (S1). In the next step, the edge list of each network

Discussion

In the present study, to identify the DNs in the PPI networks, we extended our previous optimization algorithm (Trader) as a novel method. In two distinct sections, we discussed the degree to which the proposed method was beneficial in discovering DNs and proposing the drug targets in the disease-related networks. From the computational perspective, it was shown that the developed discrete optimization algorithm outperformed the four other popular algorithms in terms of generating further

Conclusion

In conclusion, compared to the current methods, the computationally efficient algorithms were shown to generate more practical biological results, and Trader proved to be a valuable tool for exploring and detecting the potential therapeutic targets in the PPI networks of differentially expressed genes in complex diseases.

Funding

This work was funded by the Isfahan University of Medical Sciences (grant #140013).

Author contributions

Amir Roointan and Yosef Masoudi-Sobhanzadeh participated in the study design, analysis of the datasets using the algorithms, interpretation of the data, and drafting of the manuscript. Alieh Gholaminejad and Yousof Gheisari participated in the study design, interpretation of the data, and drafting of the manuscript. All authors reviewed the manuscript.

Summary

Identifying novel and efficient therapeutic targets, especially for complicated diseases like kidney diseases, is a pressing need for clinical pharmaceutics. It seems that the identification of therapeutic targets in the disease-related protein-protein interaction networks can exert a strong influence on attrition rates in the drug development pipeline. However, more effective and realistic approaches are needed to spot the true targets in such networks. This study was performed to assess the

Declaration of competing interest

The authors declare that they have no conflict of interest.

Acknowledgment

We thank the members of the Regenerative Medicine Research Center for helping us with some parts of the bioinformatic analysis steps.

References (63)

  • A.C. Webster et al. Chronic kidney disease. Lancet (2017)
  • G. Dhiman et al. Emperor penguin optimizer: a bio-inspired algorithm for engineering problems. Knowl. Base Syst. (2018)
  • Y. Masoudi-Sobhanzadeh et al. A novel multi-objective metaheuristic algorithm for protein-peptide docking and benchmarking on the LEADS-PEP dataset. Comput. Biol. Med. (2021)
  • Y. Masoudi-Sobhanzadeh et al. World Competitive Contests (WCC) algorithm: a novel intelligent optimization algorithm for biological and non-biological problems. Inform. Med. Unlocked (2016)
  • C. Tang et al. P53 in kidney injury and repair: mechanism and therapeutic potentials. Pharmacol. Therapeut. (2019)
  • S. Li et al. An iteration model for identifying essential proteins by combining comprehensive PPI network with biological information. BMC Bioinf. (2021)
  • S. Graw et al. Multi-omics data integration considerations and study design for biological systems and disease. Mol. Omics (2021)
  • J.G.T. Zañudo et al. Structure-based control of complex networks with nonlinear dynamics. Proc. Natl. Acad. Sci. USA (2017)
  • M. Koutrouli et al. A guide to conquer the biological network era using graph theory. Front. Bioeng. Biotechnol. (2020)
  • T. Charitou et al. Using biological networks to integrate, visualize and analyze genomics data. Genet. Sel. Evol. (2016)
  • E. Ferrero et al. In silico prediction of novel therapeutic targets using gene–disease association data. J. Transl. Med. (2017)
  • Z. Shangguan. A review of target identification strategies for drug discovery: from database to machine-based methods
  • S.Z. Sajadi et al. AutoDTI++: deep unsupervised learning for DTI prediction by autoencoders. BMC Bioinf. (2021)
  • H. Ahmed et al. Network biology discovers pathogen contact points in host protein-protein interactomes. Nat. Commun. (2018)
  • S. Ali et al. Exploring novel key regulators in breast cancer network. PLoS One (2018)
  • M. Kouhsar et al. Detection of novel biomarkers for early detection of Non-Muscle-Invasive Bladder Cancer using Competing Endogenous RNA network analysis. Sci. Rep. (2019)
  • X. Liu et al. Computational methods for identifying the critical nodes in biological networks. Briefings Bioinf. (2020)
  • W.-F. Guo et al. Network control principles for identifying personalized driver genes in cancer. Briefings Bioinf. (2020)
  • M. Zheng et al. Visibility graph based temporal community detection with applications in biological time series. Sci. Rep. (2021)
  • Y. Chu et al. WDNfinder: a method for minimum driver node set detection and analysis in directed and weighted biological network. J. Bioinf. Comput. Biol. (2017)
  • S. Jung et al. The nature of ICT in technology convergence: a knowledge-based network analysis. PLoS One (2021)