A multi-objective based PSO approach for inferring pathway activity utilizing protein interactions

Multimedia Tools and Applications

Abstract

Pathway information for a given microarray gene expression dataset can be collected from publicly available databases. Inferring the activity of a pathway is a crucial task in functional genomics. In general, all genes associated with a given pathway are treated as equally important when measuring its goodness, but the contribution of each gene should be quantified individually. In the current study, we quantify the degrees of relevance of the different genes participating in a pathway by optimizing different goodness measures of pathway activity. Two popular goodness measures, namely the t-score and the z-score, are modified to measure the goodness of weighted gene vectors. In addition, a goodness measure based on the protein-protein interaction scores of pairs of genes participating in a pathway is used as a further objective function. All of these measures are designed to handle the weighted importance of individual genes. The search capability of multi-objective particle swarm optimization (PSO) is used to find appropriate relevance vectors for the genes. The proposed approach is applied to five real-life gene expression datasets, and its performance is compared with that of eight existing feature selection methods. The comparative results demonstrate the superiority of the proposed PSO-based technique. The performance of the proposed method is validated using a statistical significance test, and a biological significance test is carried out to establish the biological relevance of the extracted pathway-based gene markers.
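
The abstract does not give the exact formulas, so the following is only a minimal sketch of how weighted objectives of this kind could be evaluated for one candidate relevance vector. It assumes a weighted pathway activity normalized by the L2 norm of the weights, Welch's t statistic as the t-score, a weight-averaged pairwise interaction score as the protein-protein interaction objective, and a standard Pareto dominance test for comparing candidates. All function names, the toy expression values, and the toy interaction scores are hypothetical and are not taken from the paper or its repository.

```python
import math
from statistics import mean, variance


def pathway_activity(expr, weights):
    """Weighted activity of one pathway in every sample.

    expr[i][j] is the normalized expression of gene i in sample j and
    weights[i] is the relevance of gene i. Dividing by the L2 norm of the
    weights is an assumption; with equal weights it reduces to sum / sqrt(k).
    """
    norm = math.sqrt(sum(w * w for w in weights))
    if norm == 0.0:
        raise ValueError("at least one gene weight must be non-zero")
    n_samples = len(expr[0])
    return [
        sum(w * gene[j] for w, gene in zip(weights, expr)) / norm
        for j in range(n_samples)
    ]


def welch_t(a, b):
    """Welch's two-sample t statistic (unequal variances)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def t_score_objective(expr, labels, weights):
    """Absolute t-score of the weighted activity between the two classes."""
    activity = pathway_activity(expr, weights)
    pos = [x for x, y in zip(activity, labels) if y == 1]
    neg = [x for x, y in zip(activity, labels) if y == 0]
    return abs(welch_t(pos, neg))


def ppi_objective(ppi, weights):
    """Interaction score averaged over gene pairs, weighted by w_i * w_j.

    This particular form is an illustrative assumption.
    """
    num = den = 0.0
    k = len(weights)
    for i in range(k):
        for j in range(i + 1, k):
            pair_weight = weights[i] * weights[j]
            num += pair_weight * ppi[i][j]
            den += pair_weight
    return num / den if den > 0 else 0.0


def dominates(f, g):
    """True if f Pareto-dominates g when every objective is maximized."""
    return all(x >= y for x, y in zip(f, g)) and any(x > y for x, y in zip(f, g))


# Synthetic toy data: 3 genes x 6 samples, first three samples are class 1.
expr = [
    [2.0, 2.2, 1.8, 0.1, -0.1, 0.0],   # gene 0 separates the classes
    [0.3, -0.2, 0.1, 0.2, -0.3, 0.0],  # gene 1 is mostly noise
    [1.5, 1.7, 1.6, 0.2, 0.0, 0.1],    # gene 2 separates the classes
]
labels = [1, 1, 1, 0, 0, 0]
ppi = [
    [0.0, 0.1, 0.9],
    [0.1, 0.0, 0.2],
    [0.9, 0.2, 0.0],
]


def objectives(weights):
    """Objective vector (t-score, PPI score) for one relevance vector."""
    return (t_score_objective(expr, labels, weights), ppi_objective(ppi, weights))


uniform = [1.0, 1.0, 1.0]   # every gene equally important
focused = [1.0, 0.1, 1.0]   # noisy gene down-weighted
print(objectives(uniform), objectives(focused))
print(dominates(objectives(focused), objectives(uniform)))
```

In a multi-objective PSO, each particle's position would play the role of `weights`, and a dominance test such as `dominates` would decide which candidate relevance vectors are retained as non-dominated solutions. The sketch omits the z-score objective and the swarm update itself.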

Acknowledgments

Pratik Dutta acknowledges the Visvesvaraya PhD Scheme for Electronics and IT, an initiative of the Ministry of Electronics and Information Technology (MeitY), Government of India, for fellowship support. Dr. Sriparna Saha gratefully acknowledges the Young Faculty Research Fellowship (YFRF) Award, supported by the Visvesvaraya PhD Scheme for Electronics and IT, Ministry of Electronics and Information Technology (MeitY), Government of India, and implemented by Digital India Corporation (formerly Media Lab Asia), for carrying out this research.

Author information

Corresponding author

Correspondence to Sriparna Saha.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Availability of data and materials

https://github.com/duttaprat/PPI-based-MOPSO.

About this article

Cite this article

Dutta, P., Saha, S. & Naskar, S. A multi-objective based PSO approach for inferring pathway activity utilizing protein interactions. Multimed Tools Appl 80, 30283–30303 (2021). https://doi.org/10.1007/s11042-020-09269-8

  • DOI: https://doi.org/10.1007/s11042-020-09269-8
