Comparing multi-objective metaheuristics for solving a three-objective formulation of multiple sequence alignment

  • Regular Paper
  • Published in: Progress in Artificial Intelligence

Abstract

Multiple sequence alignment (MSA) is an optimization problem that consists of finding the best alignment of more than two biological sequences according to a number of scores or objectives. In this paper, we consider a three-objective formulation of MSA comprising the STRIKE score, the percentage of aligned columns, and the percentage of non-gap symbols. The last two objectives introduce many plateaus in the search space, which increases the complexity of the problem. Taking the BAliBASE data set as a benchmark, we carry out a rigorous comparative study of four multi-objective metaheuristics: the classical NSGA-II evolutionary algorithm and the more recent MOCell, GWASF-GA, and NSGA-III. Our study concludes that NSGA-II provides the best overall performance.
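
To make the two percentage-based objectives concrete, the following minimal Python sketch computes them for a gapped alignment given as equal-length strings. It is an illustration only and is not taken from jMetalMSA: the function names are hypothetical, the gap symbol is assumed to be "-", and an "aligned column" is assumed to be a gap-free column in which all sequences carry the same symbol. The STRIKE score is omitted because it requires 3D structure information.

```python
def percentage_aligned_columns(alignment):
    """Percentage of columns that are gap-free and hold one identical symbol.

    Assumption: 'aligned' means totally conserved; the paper may define it differently.
    """
    n_cols = len(alignment[0])
    aligned = 0
    for column in zip(*alignment):  # iterate over the alignment column by column
        if "-" not in column and len(set(column)) == 1:
            aligned += 1
    return 100.0 * aligned / n_cols


def percentage_non_gaps(alignment):
    """Percentage of alignment cells that are not the gap symbol '-'."""
    total = sum(len(row) for row in alignment)
    gaps = sum(row.count("-") for row in alignment)
    return 100.0 * (total - gaps) / total


# Toy alignment of three sequences (hypothetical data).
example = ["AC-GT", "ACAGT", "AC-GA"]
print(percentage_aligned_columns(example))  # three of five columns are conserved
print(percentage_non_gaps(example))         # 13 of 15 cells are not gaps
```

Both functions take only a few distinct values on a given alignment, since each counts columns or cells, so many different gap placements can map to the same pair of scores. This is consistent with the plateaus mentioned in the abstract.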

Notes

  1. jMetalMSA project: https://github.com/jMetal/jMetalMSA.

  2. Maven project: https://maven.apache.org/.

Acknowledgements

The first author acknowledges Universidad Técnica Estatal de Quevedo (Ecuador) for supporting his doctoral stays at Departamento de Lenguajes y Ciencias de la Computación of Universidad de Málaga (Spain).

Author information

Corresponding author

Correspondence to Antonio J. Nebro.

Additional information

This work has been partially funded by the Secretaría Nacional de Educación Superior Ciencia y Tecnología (SENESCYT) from Ecuador, and by Spanish Grants TIN2014-58304-R (Spanish Ministry of Education and Science), P11-TIC-7529 (Innovation, Science and Enterprise Ministry of the regional government of the Junta de Andalucía), and P12-TIC-1519 (Plan Andaluz de Investigación, Desarrollo e Innovación). José García-Nieto is the recipient of a Post-Doctoral fellowship of “Captación de Talento para la Investigación” at Universidad de Málaga.

Appendix 1: Supplementary results

The following tables contain the median and interquartile range of the distributions of \(I_{e+}\) and \(I_{\mathrm{IGD+}}\) obtained over 25 independent runs, for each compared algorithm and family of MSA instances. A total of 140 MSA instances from five problem families (RV11, RV12, RV30, RV40, and RV50) have been tackled, which leads us to suggest that the reported conclusions are unbiased and general enough.
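
As an illustration of how each table cell can be summarized, the sketch below computes the median and interquartile range of a list of indicator values, one value per independent run. It is a minimal example under stated assumptions: the function name `summarize_runs` is hypothetical, the sample values are invented for the demonstration, and the quartile convention (Python's "inclusive" method) is an assumption, since the convention used for the reported tables is not stated here.

```python
import statistics


def summarize_runs(values):
    """Return (median, interquartile range) of the indicator values of several runs."""
    median = statistics.median(values)
    # quantiles(n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return median, q3 - q1


# Hypothetical indicator values from five runs (the study uses 25 per instance).
runs = [0.12, 0.10, 0.15, 0.11, 0.13]
median, iqr = summarize_runs(runs)
print(f"median = {median:.3f}, IQR = {iqr:.3f}")
```

The median and interquartile range are used instead of the mean and standard deviation because they are less sensitive to occasional outlier runs, which are common with stochastic metaheuristics.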

Table 3 Median and interquartile range of \(I_{\mathrm{IGD+}}\) for MSA problem family RV11 and for algorithms: GWASF-GA, NSGA-II, MOCell, and NSGA-III
Table 4 Median and interquartile range of \(I_{e+}\) for MSA problem family RV12 and for algorithms: GWASF-GA, NSGA-II, MOCell, and NSGA-III
Table 5 Median and interquartile range of \(I_{\mathrm{IGD+}}\) for MSA problem family RV12 and for algorithms: GWASF-GA, NSGA-II, MOCell, and NSGA-III
Table 6 Median and interquartile range of \(I_{e+}\) for MSA problem family RV30 and for algorithms: GWASF-GA, NSGA-II, MOCell, and NSGA-III
Table 7 Median and interquartile range of \(I_{\mathrm{IGD+}}\) for MSA problem family RV30 and for algorithms: GWASF-GA, NSGA-II, MOCell, and NSGA-III
Table 8 Median and interquartile range of \(I_{e+}\) for MSA problem family RV40 and for algorithms: GWASF-GA, NSGA-II, MOCell, and NSGA-III
Table 9 Median and interquartile range of \(I_{\mathrm{IGD+}}\) for MSA problem family RV40 and for algorithms: GWASF-GA, NSGA-II, MOCell, and NSGA-III
Table 10 Median and interquartile range of \(I_{e+}\) for MSA problem family RV50 and for algorithms: GWASF-GA, NSGA-II, MOCell, and NSGA-III
Table 11 Median and interquartile range of \(I_{\mathrm{IGD+}}\) for MSA problem family RV50 and for algorithms: GWASF-GA, NSGA-II, MOCell, and NSGA-III
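
For readers unfamiliar with the two indicators named in the captions above, the following Python sketch implements their commonly used definitions: the additive epsilon indicator \(I_{e+}\) and the modified inverted generational distance \(I_{\mathrm{IGD+}}\). It is a sketch under assumptions rather than the code used in the study. All objectives are assumed to be minimized, fronts are plain lists of objective tuples, and the function names are hypothetical. Lower values are better for both indicators.

```python
import math


def additive_epsilon(front, reference):
    """Smallest shift that makes every reference point weakly dominated by the front."""
    return max(
        min(max(a_i - r_i for a_i, r_i in zip(a, r)) for a in front)
        for r in reference
    )


def igd_plus(front, reference):
    """Mean distance from each reference point to the front, counting only
    the objectives in which the front point is worse (minimization assumed)."""

    def d_plus(a, r):
        return math.sqrt(sum(max(a_i - r_i, 0.0) ** 2 for a_i, r_i in zip(a, r)))

    return sum(min(d_plus(a, r) for a in front) for r in reference) / len(reference)


# Toy bi-objective example (hypothetical data).
reference_front = [(0.0, 1.0), (1.0, 0.0)]
approximation = [(0.5, 1.0), (1.0, 0.5)]
print(additive_epsilon(approximation, reference_front))
print(igd_plus(approximation, reference_front))
```

Both indicators are zero when the approximation contains the reference front. Objectives that are maximized, as the MSA scores are, would have to be negated or otherwise transformed before applying this minimization-based sketch.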

Cite this article

Zambrano-Vega, C., Nebro, A.J., García-Nieto, J. et al. Comparing multi-objective metaheuristics for solving a three-objective formulation of multiple sequence alignment. Prog Artif Intell 6, 195–210 (2017). https://doi.org/10.1007/s13748-017-0116-6

  • DOI: https://doi.org/10.1007/s13748-017-0116-6
