
Parallel multi-swarm optimizer for gene selection in DNA microarrays

Applied Intelligence

Abstract

Parallel computers execute many computational steps per time unit, which can substantially reduce computing time in real-world applications. In this work, a parallel Particle Swarm Optimization (PSO) algorithm is used for gene selection in high-dimensional microarray datasets. The proposed algorithm, called PMSO, runs a set of independent PSOs following an island model, where a migration policy exchanges solutions at a certain frequency. A feature selection mechanism is embedded in each subalgorithm to find small subsets of informative genes among thousands of candidates. PMSO has been experimentally assessed with different population structures on four well-known cancer datasets. The contributions are twofold: the parallel approach improves on sequential algorithms in terms of computational time and effort (efficiency of 85%) as well as accuracy rate, and it identifies specific genes that our work suggests are significant for accurate classification.

Additional comparisons with several recent state-of-the-art methods also show competitive results, with improvements of over 100% in the classification rate and very few genes per subset.
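To make the island model described in the abstract more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy fitness function in place of a classifier, a simplified bit-mixing update in place of the PSO variant used in the paper, a ring migration topology, and islands that are stepped one after another rather than on separate processors. All names and parameter values (`N_GENES`, `INFORMATIVE`, `migration_gap`, `parallel_efficiency`, and so on) are hypothetical. The last helper uses the usual definition of parallel efficiency, speedup divided by the number of processors.

```python
import random

N_GENES = 40                    # hypothetical number of genes
INFORMATIVE = {3, 11, 17, 29}   # hypothetical "informative" genes

def fitness(mask):
    """Toy stand-in for classifier accuracy: reward informative genes,
    penalise the size of the selected subset."""
    hits = sum(mask[i] for i in INFORMATIVE)
    return hits / len(INFORMATIVE) - 0.01 * sum(mask)

class Swarm:
    """One island: a small swarm of binary gene-selection masks."""

    def __init__(self, size, rng):
        self.rng = rng
        self.pos = [[rng.randint(0, 1) for _ in range(N_GENES)]
                    for _ in range(size)]
        self.pbest = [p[:] for p in self.pos]
        self.gbest = max(self.pbest, key=fitness)[:]

    def step(self, p_mut=0.02):
        for k, x in enumerate(self.pos):
            for j in range(N_GENES):
                # Simplified update (an assumption): each bit is drawn from
                # the current position, the personal best, or the swarm best.
                x[j] = self.rng.choice((x[j], self.pbest[k][j], self.gbest[j]))
                if self.rng.random() < p_mut:
                    x[j] ^= 1   # occasional bit flip keeps diversity
            if fitness(x) > fitness(self.pbest[k]):
                self.pbest[k] = x[:]
            if fitness(x) > fitness(self.gbest):
                self.gbest = x[:]

    def receive(self, migrant):
        """Accept a migrant if it beats the worst personal best."""
        worst = min(range(len(self.pbest)),
                    key=lambda k: fitness(self.pbest[k]))
        if fitness(migrant) > fitness(self.pbest[worst]):
            self.pbest[worst] = migrant[:]
            self.pos[worst] = migrant[:]
        if fitness(migrant) > fitness(self.gbest):
            self.gbest = migrant[:]

def run_islands(n_islands=4, swarm_size=10, iterations=60,
                migration_gap=10, seed=0):
    rng = random.Random(seed)
    islands = [Swarm(swarm_size, rng) for _ in range(n_islands)]
    for it in range(1, iterations + 1):
        for isl in islands:      # a parallel version would run these concurrently
            isl.step()
        if it % migration_gap == 0:
            migrants = [isl.gbest[:] for isl in islands]
            for i, isl in enumerate(islands):
                isl.receive(migrants[i - 1])   # ring: take from the previous island
    return max((isl.gbest for isl in islands), key=fitness)

def parallel_efficiency(t_seq, t_par, n_proc):
    """Efficiency = speedup / processors = t_seq / (n_proc * t_par)."""
    return t_seq / (n_proc * t_par)

best = run_islands()
selected = [i for i, bit in enumerate(best) if bit]
```

Here `selected` holds the indices of the genes chosen by the best mask found across all islands. The migration gap controls how often islands exchange solutions: a small gap makes the islands behave like one large swarm, while a large gap keeps them nearly independent.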



Author information


Corresponding author

Correspondence to José García-Nieto.


About this article

Cite this article

García-Nieto, J., Alba, E. Parallel multi-swarm optimizer for gene selection in DNA microarrays. Appl Intell 37, 255–266 (2012). https://doi.org/10.1007/s10489-011-0325-9


  • DOI: https://doi.org/10.1007/s10489-011-0325-9
