Abstract
Novel genetic algorithm (GA)-based strategies, specifically aimed at multimodal optimization problems, have been developed by hybridizing the GA with alternative optimization heuristics, and used to search for a maximal number of minimum energy conformations (geometries) of complex molecules (conformational sampling). Intramolecular energy, the targeted function, describes a very complex nonlinear response hypersurface in the phase space of structural degrees of freedom. These are the torsional angles controlling the relative rotation of fragments connected by covalent bonds. The energy surface of cyclodextrin, a macrocyclic sugar molecule with N = 65 degrees of freedom, served as the model system for testing and tuning the multimodal optimization strategies proposed here. The success of GAs is known to depend on the particular hypotheses used to simulate Darwinian evolution. Therefore, the conformational sampling GA (CSGA) was designed to allow extensive control over the evolution process by means of tunable parameters. Some are classical GA controls (population size, mutation frequency, etc.), while others control the population diversity management tools designed here or the frequencies of calls to the alternative heuristics. Together they form a large set of operational parameters, and a (genetic) meta-optimization procedure was used to search for parameter configurations maximizing the efficiency of the CSGA process. The specific impact of disabling a given hybridizing heuristic was estimated relative to the default sampling behavior (with all the implemented heuristics on). Optimal sampling performance was obtained with a GA featuring a built-in tabu search mechanism, a “Lamarckian” (gradient-based) optimization tool, and, most notably, a “directed mutations” engine (a torsional angle driving procedure generating chromosomes that differ radically from their parents but have a good chance of being “fit”, unlike offspring from spontaneous mutations).
“Biasing” heuristics, which implement more elaborate random draw distribution laws instead of the ‘flat’ default rule for picking torsional angle values, were at best unconvincing or outright harmful. Naive Bayesian analysis was employed to estimate the impact of the operational parameters on the success of the CSGA. The study emphasized the importance of proper tuning of the CSGA. The meta-optimization procedure implicitly ensures the management, in the context of an evolving operational parameterization, of the repeated GA runs that are mandatory for the reproducibility of the sampling of such vast phase spaces. Therefore, it should be seen not only as a tuning tool, but as the strategy for actual problem solving, essentially advocating a parallel exploration of problem space and parameter space.
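As a purely illustrative aid, the hybridization idea summarized above can be sketched on a toy problem. The sketch below is not the CSGA itself: the energy function, the basin discretization, and every name and parameter value (`toy_energy`, `basin_key`, `lamarckian_refine`, `sample_conformers`, the mutation rate of 0.3, and so on) are assumptions introduced for illustration. It combines a flat random-draw mutation, a gradient-based "Lamarckian" refinement written back into the chromosome, and a simple tabu set of already visited energy wells. The "directed mutations" engine and the meta-optimization layer are not reproduced.

```python
import math
import random

TWO_PI = 2.0 * math.pi

def toy_energy(angles):
    """Hypothetical multimodal stand-in for intramolecular energy (not a force field).
    Each torsion contributes 1 + cos(3a), which has three wells per angle."""
    return sum(1.0 + math.cos(3.0 * a) for a in angles)

def toy_gradient(angles):
    """Analytical gradient of toy_energy with respect to each torsional angle."""
    return [-3.0 * math.sin(3.0 * a) for a in angles]

def lamarckian_refine(angles, steps=20, rate=0.05):
    """A few gradient descent steps; the refined angles replace the genes."""
    x = list(angles)
    for _ in range(steps):
        grad = toy_gradient(x)
        x = [(xi - rate * gi) % TWO_PI for xi, gi in zip(x, grad)]
    return x

def basin_key(angles):
    """Label the well occupied by each angle (wells are separated at 0, 2pi/3, 4pi/3)."""
    width = TWO_PI / 3.0
    return tuple(int(a // width) % 3 for a in angles)

def random_chromosome(rng, n_angles):
    """Flat ('unbiased') random draw of every torsional angle."""
    return [rng.uniform(0.0, TWO_PI) for _ in range(n_angles)]

def sample_conformers(n_angles=4, pop_size=20, generations=30, seed=0):
    """Return a dict mapping each newly visited basin to the energy found there."""
    rng = random.Random(seed)
    tabu = set()   # basins that were already sampled
    found = {}     # basin key -> energy of the first refined visitor
    pop = [random_chromosome(rng, n_angles) for _ in range(pop_size)]
    for _ in range(generations):
        offspring = []
        for _ in range(pop_size):
            mother, father = rng.sample(pop, 2)
            cut = rng.randrange(1, n_angles)          # one-point crossover
            child = mother[:cut] + father[cut:]
            if rng.random() < 0.3:                    # spontaneous flat mutation
                child[rng.randrange(n_angles)] = rng.uniform(0.0, TWO_PI)
            child = lamarckian_refine(child)
            key = basin_key(child)
            if key in tabu:
                # Already sampled: replace by a fresh random individual.
                child = random_chromosome(rng, n_angles)
            else:
                tabu.add(key)
                found[key] = toy_energy(child)
            offspring.append(child)
        # Keep the lowest-energy individuals among parents and offspring.
        pop = sorted(pop + offspring, key=toy_energy)[:pop_size]
    return found

if __name__ == "__main__":
    conformers = sample_conformers()
    print(len(conformers), "distinct basins visited out of", 3 ** 4)
```

In this toy setting the number of distinct basins returned plays the role of the sampling success measure. Re-running with different seeds gives a feel for why repeated runs are needed before any conclusion about a parameter setting can be drawn.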
Abbreviations
- GA: Genetic algorithm
- CSGA: Conformational sampling GA
- μGA: Meta-GA (used for parameter setup optimization)
- μF: Meta-fitness score (target function of the μGA), a measure of success of conformational sampling
Cite this article
Parent, B., Kökösy, A. & Horvath, D. Optimized Evolutionary Strategies in Conformational Sampling. Soft Comput 11, 63–79 (2007). https://doi.org/10.1007/s00500-006-0053-y
DOI: https://doi.org/10.1007/s00500-006-0053-y