Editorial
Discovery of pathways in protein–protein interaction networks using a genetic algorithm
Introduction
There has recently been great interest in protein–protein interaction (PPI) databases, which aggregate experimental findings over time and serve as the source of interaction information for case studies in bioinformatics. Given the large amount of PPI data collected, a challenging problem is to extract biological insight from it, in particular to discover biological pathways. Note that the edges representing PPIs have been experimentally defined and tested. The reconstruction of cellular processes (pathways or networks) has attracted considerable attention: the reconstruction of regulatory networks [1], [2], [3], [4], [5], the analysis of metabolic networks [6], [7], [8], [9], and the discovery of signaling networks and pathways [10], [11], [12], [13]. However, the directionality of interactions in these networks has not been thoroughly investigated, even though direction is essential for understanding how information flows from one protein to another. Orienting a signaling network is more difficult than orienting a regulatory or metabolic network because less orientation information is available. For example, the orientation of a gene regulatory network is often determined by which transcription factors regulate which genes; studies of microRNAs look for their targets, and motif studies are carried out upstream of genes [14], [15], [16]. Similarly, metabolic networks are modeled using knowledge about the order of genes and enzymes [17]. PPI data, by contrast, are almost always undirected, so orienting interaction edges to model signal transmission in signaling networks is costly. Typical works in this area [12], [18], [19] underline the need for an efficient edge-orientation algorithm for PPI networks, a problem that has been identified as NP-hard.
In [12], the authors presented a random orientation plus local search algorithm (ROLS) for edge orientation and compared its output with data from biological experiments to determine whether the paths found were consistent with the experimental evidence. The results were also compared with those of several algorithms proposed in [20], [21], [22]. In evaluating the results, the authors found 37 standard pathways that had been tested through biological experiments. However, some paths did not appear in the standard set, and the corresponding interactions could not occur in biological experiments, even though the objective function values of these pathways were high.
In this paper, we extend our preliminary results on PPI edge orientation [23]. In particular, we design a genetic algorithm (GA) for the problem. The GA is one of the most popular and successful computational models in intelligent computing [24], especially for NP-hard problems. Along with other intelligent computing techniques such as fuzzy computing, neural networks and multi-agent systems, GAs continue to develop and are widely applied in different fields [25], [26]. Our GA design takes conflicting elements in PPI networks into account in order to remove unnecessary edges, which greatly improves computing speed. We examined several aspects, including running time and objective values. The results show that our algorithm finds a good solution to the problem, a finding supported by comparison with the results of other algorithms. In addition, we address the question of what the obtained pathways mean by extending the biological validation.
The paper is organized in five sections. Section 1 introduces the problem. Section 2 gives background on the problem and on GAs. Section 3 describes in detail the GA designed to solve the problem. Section 4 presents experimental data on yeast PPIs and assesses the results achieved by the algorithm. The final section concludes the paper.
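The general idea of random orientation followed by local search can be illustrated with a short sketch. This is not the implementation from [12]: the objective used here (counting candidate paths whose every step agrees with the chosen edge directions), the function names, the restart count and the single-edge-flip neighbourhood are all assumptions chosen only to make the idea concrete.

```python
import random

def count_satisfied_paths(orientation, paths):
    """Toy objective (an assumption, not the objective of [12]): a candidate
    path counts when every consecutive pair (u, v) on it matches the
    direction chosen for the corresponding undirected edge."""
    return sum(
        all(orientation[frozenset((u, v))] == (u, v) for u, v in zip(p, p[1:]))
        for p in paths
    )

def random_orientation_local_search(edges, paths, restarts=10, seed=0):
    """Hypothetical helper: random orientation plus single-edge-flip search.
    `edges` is a list of 2-tuples; every step of every path must be an edge."""
    rng = random.Random(seed)
    best, best_score = None, -1
    for _ in range(restarts):
        # Random orientation: pick a direction for every undirected edge.
        orient = {
            frozenset(e): (e if rng.random() < 0.5 else e[::-1]) for e in edges
        }
        score = count_satisfied_paths(orient, paths)
        improved = True
        while improved:  # local search: keep flipping while something improves
            improved = False
            for key in orient:
                u, v = orient[key]
                orient[key] = (v, u)  # tentatively flip this edge
                new_score = count_satisfied_paths(orient, paths)
                if new_score > score:
                    score, improved = new_score, True
                else:
                    orient[key] = (u, v)  # revert a non-improving flip
        if score > best_score:
            best, best_score = dict(orient), score
    return best, best_score

edges = [("A", "B"), ("B", "C")]
paths = [("A", "B", "C")]
best, best_score = random_orientation_local_search(edges, paths)
```

Restarts matter even in this tiny case: if both edges start out reversed, no single flip improves the score, so the search stalls in a local optimum until a fresh random orientation is drawn.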
Section snippets
Problem of orienting edges in protein interaction networks
Proteins are important components of the cell's structure and are involved in most biological processes. While the cell functions, they interact with each other or with macromolecules such as DNA and RNA, together forming a complex network of interactions that carries out biological functions. An example is given in Fig. 1, where the graph shows part of the yeast protein interaction network, drawn with the Cytoscape software. From the graph, we can see that the protein interaction
Methodology
The main idea is to design a GA tailored to the characteristics of the orientation problem so that the search process is effective. It starts with a randomly initialized population P whose size is a constant natural number n; each individual is represented by a sequence of chromosomes. The population evolves over many generations. The best individual of each generation is kept for the next population and we apply the local
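The overall loop described in this snippet (a random initial population of fixed size n, evolution over generations, and carrying the best individual forward) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' design: the bit-per-edge encoding, the toy fitness (number of source-target pairs joined by a short directed path), one-point crossover, bit-flip mutation and all parameter values are assumptions, and the handling of conflicting elements and the step that is cut off in this excerpt are not modelled.

```python
import random

def fitness(bits, edges, sources, targets, max_len=3):
    """Toy objective (an assumption): count source-target pairs joined by a
    directed path of at most max_len edges under the orientation in bits.
    bits[i] == 0 orients edges[i] as (u -> v); 1 orients it as (v -> u)."""
    adj = {}
    for (u, v), b in zip(edges, bits):
        a, c = (u, v) if b == 0 else (v, u)
        adj.setdefault(a, []).append(c)
    score = 0
    for s in sources:
        frontier, seen = {s}, {s}
        for _ in range(max_len):  # breadth-first expansion, bounded depth
            frontier = {w for x in frontier for w in adj.get(x, []) if w not in seen}
            seen |= frontier
        score += sum(1 for t in targets if t in seen and t != s)
    return score

def evolve(edges, sources, targets, n=20, generations=50, p_mut=0.1, seed=0):
    """Hypothetical GA loop with elitism; operators and defaults are assumptions."""
    rng = random.Random(seed)

    def score(ind):
        return fitness(ind, edges, sources, targets)

    # Randomly initialised population of constant size n.
    pop = [[rng.randint(0, 1) for _ in edges] for _ in range(n)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        nxt = [pop[0][:]]  # elitism: keep the best individual unchanged
        parents = pop[: max(2, n // 2)]
        while len(nxt) < n:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(edges)) if len(edges) > 1 else 0
            child = a[:cut] + b[cut:]  # one-point crossover
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            nxt.append(child)
        pop = nxt
    best = max(pop, key=score)
    return best, score(best)

edges = [("A", "B"), ("B", "C"), ("C", "D")]
best, best_score = evolve(edges, sources=["A"], targets=["D"])
```

Because the best individual is copied unchanged into the next generation, the best score in the population never decreases from one generation to the next.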
Yeast's interaction network
For the experiments, we used yeast PPI data taken from the BioGRID database (http://thebiogrid.org), a large-scale online database of genetic interactions across organisms. As mentioned above, this database is updated extensively over time on the basis of new research and experimental findings from biologists. Therefore, for ease of comparison between our results and those of existing algorithms, we use the same BioGRID release, version 2.0.51, as the authors of [29]. This database is
Conclusion
In this paper, we proposed a GA design for the problem of orienting protein interaction networks, a challenging problem in computational biology. We presented a representation of population individuals that fits the problem; in particular, our design takes conflicting elements into account in the solution representation, which greatly improves computing speed. The results show that our algorithm handles this problem well. As evidence of the correctness of our algorithm, we find that our
References (37)
- et al. Metabolic reconstruction, constraint-based analysis and game theory to probe genome-scale metabolic networks. Curr. Opin. Biotechnol. (2010)
- et al. Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell (2005)
- et al. Genetically constrained metabolic flux analysis. Metab. Eng. (2005)
- et al. Structure of morphologically expanded queries: a genetic algorithm approach. Data Knowl. Eng. (2010)
- et al. Sentence identification of biological interactions using Patricia tree generated patterns and genetic algorithm optimized parameters. Data Knowl. Eng. (2010)
- et al. Genetic algorithms for approximate similarity queries. Data Knowl. Eng. (2007)
- A walk-through of the yeast mating pheromone response pathway. Peptides (2004)
- et al. Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data. Nat. Genet. (2003)
- et al. Aracne: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinf. (2006)
- et al. Improvements in the reconstruction of time-varying gene regulatory networks: dynamic programming and regularization by information sharing among genes. Bioinformatics (2011)
- Genomic reconstruction of transcriptional regulatory networks in lactic acid bacteria. BMC Genomics
- Reconstruction of gene regulatory networks based on two-stage Bayesian network structure learning algorithm. J. Bionic Eng.
- Identifying metabolic pathways and gene regulation networks with evolutionary algorithms. Evol. Comput. Bioinforma.
- Large-scale in vivo flux analysis shows rigidity and suboptimal performance of Bacillus subtilis metabolism. Nat. Genet.
- Efficient algorithms for detecting signaling pathways in protein interaction networks. J. Comput. Biol.
- Pathfinder: mining signal transduction pathway segments from protein–protein interaction networks. BMC Bioinf.
- Discovering pathways by orienting edges in protein interaction networks. Nucleic Acids Res.