
1 Introduction

Molecular Docking (MD) is a computer-aided approach used in Drug Discovery to predict the conformation of a small molecule (ligand) inside the binding site of a larger molecule (receptor), measuring the binding affinity between these molecules. Due to the massive number of possible conformations that a molecule can assume, the MD problem is considered NP-hard by computational complexity theory. Thus, the development of search methods capable of exploring the conformational search space is essential. Because of the complexity of this space, exact methods are not efficient, since they cannot retrieve a solution in a viable time. Metaheuristics, such as Memetic Algorithms (MA), which employ algorithms as global and local search procedures, have therefore become interesting approaches for finding good solutions in a feasible computational time. Despite the possibility of using metaheuristics to explore the search space, they are highly dependent on parameter settings and on the definition of the search mechanism. In this context, self-adaptive mechanisms are reliable techniques that give the algorithm the option to decide by itself which parameter, or search mechanism, should be used along the optimization process [1].

Besides the definition of the search mechanism, another important component of MD methods is the energy function responsible for describing the receptor-ligand interaction, evaluating different physicochemical aspects related to the binding process. The scoring function is also important for the search method, since it is used to distinguish and rank different solutions with respect to energy. Different scoring functions are available in the literature [2], but the RosettaLigand scoring function has achieved interesting results in recent years [3, 4] in MD predictions. In light of these facts, we propose an MD method based on the RosettaLigand scoring function. To explore the conformational search space, a novel self-adaptive multimeme algorithm based on the Biased Random Key Genetic Algorithm (BRKGA), enhanced with four local search algorithms, is evaluated and compared with state-of-the-art methods. Our contributions concern the self-adaptive mechanism used to select the local search strategy to be applied, as well as the radius perturbation, for each docking case. The results obtained show that our approach reaches better energy values than the non-adaptive BRKGA version and traditional methods such as AutoDock Vina [5] and jMetal [6]. The remainder of this paper is organized as follows: Sect. 2 describes the main concepts related to the molecular docking problem and related works. Section 3 presents the proposed self-adaptive approach. Section 4 analyzes the achieved results. Finally, Sect. 5 presents the conclusions and future work from this research.

2 Preliminaries

Molecule Flexibility: An important aspect of the MD problem is the flexibility associated with each molecule. MD approaches can be classified into rigid-ligand and rigid-protein; flexible-ligand and rigid-protein; and flexible-ligand and flexible-protein. These approaches dictate how the problem is computationally encoded. The molecule flexibility is therefore directly related to the problem’s dimensionality, where more flexible molecules have more degrees of freedom, leading to a more complex optimization search space. For the flexible-ligand with rigid-receptor scheme (used in this work), a possible solution for the problem can be described by 7 + n variables, where three of them are the ligand translation values (T\(_{x}\), T\(_{y}\), and T\(_{z}\)), followed by four values corresponding to the ligand’s orientation (Q\(_{x}\), Q\(_{y}\), Q\(_{z}\), and Q\(_{w}\)), and n varies according to the dihedral angles present in the ligand.
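As an illustration of this encoding, the sketch below (our own, with hypothetical names) builds a random candidate as a flat numeric vector of 7 + n values: three translations, a unit quaternion for the orientation, and n dihedral angles.

```python
import numpy as np

def random_solution(n_dihedrals, box_size=11.0, rng=None):
    """Illustrative 7 + n encoding: translation, orientation quaternion, dihedrals."""
    rng = rng or np.random.default_rng()
    translation = rng.uniform(-box_size / 2, box_size / 2, size=3)  # Tx, Ty, Tz
    orientation = rng.normal(size=4)
    orientation /= np.linalg.norm(orientation)                      # Qx, Qy, Qz, Qw (unit quaternion)
    dihedrals = rng.uniform(-180.0, 180.0, size=n_dihedrals)        # one angle per rotatable bond
    return np.concatenate([translation, orientation, dihedrals])
```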

Scoring Function: In order to compute the binding affinity between a ligand and a receptor, scoring functions are used to score and rank the steric and electrostatic interactions of both molecules. As the energy measurement is a computationally expensive operation [7], approximation methods have been used. These functions can be categorized into force-field based, empirical and semi-empirical, and knowledge-based functions [8]. In this paper, we have used a knowledge-based energy function known as RosettaLigand, which considers statistical analysis of the Protein Data Bank (PDB) in its composition [3] in addition to physicochemical terms [4]. The RosettaLigand scoring function has been tested in different comparative studies, showing good results when compared with well-known scoring functions such as AutoDock [3].

Related Works: The multidimensionality and complexity of the conformational search space prevent any known computational technique from solving the MD problem optimally. Many methods based on metaheuristics have been applied in an attempt to obtain optimal solutions to these problems. The algorithms most used in MD are Genetic Algorithms (GA), Differential Evolution (DE), and Particle Swarm Optimization (PSO). Hybrid strategies have also been applied, as in Rosin et al. [9], where a GA hybridized with Simulated Annealing (SA) and Solis-Wets as LS strategies was implemented to explore the binding conformational search space. In Tagle et al. [10], three variations of SA were employed as local search procedures in a Memetic Algorithm. Recently, Leonhart et al. [11] proposed a hybridization of BRKGA with three variations of Hill Climbing and the SA algorithm working as chained LS operators, applied to test cases of HIV-protease. The approach proposed by Krasnogor et al. [12] and applied to bioinformatics problems (structure prediction) is also relevant to highlight. The authors developed a simple, but efficient, inheritance mechanism (SIM) for a discrete combinatorial search. The strategy consists of encoding the memetic material in the individual’s representation, where this material indicates the LS to be applied. During the evolution, the crossover is responsible for choosing which method to attach to the offspring, according to the parent with better fitness. Jakob [13] proposed another self-adaptive approach that schedules local search methods through a probability function. In this idea, all LS methods have an equal chance of being selected at the beginning of the process. During the evolutionary process, the probabilities of applying each algorithm are updated according to the relative fitness gain and the number of evaluations required. Domínguez-Isidro et al. [14] proposed an adaptive local search coordination, based on a cost-benefit scheme, for a multimeme DE applied to constrained numerical optimization problems. Also, Jin et al. [1] proposed a MA combining GA, DE, and Hill Climbing, working under a strategy that uses weights to measure the contribution of the algorithms according to their improvement over the individuals of the population and the stage of evolution. In this study, we investigate the possibility of applying a self-adaptive memetic algorithm, based on a proposed probability function and inspired by the BRKGA, to coordinate the use of different local search methods and to explore different neighborhoods by adapting the radius perturbation.

Memetic Algorithms: A Memetic Algorithm (MA) is an evolutionary algorithm composed of global and local search procedures [15]. Global search algorithms can explore the whole search space, while the exploitation of a solution’s neighborhood is attributed to a local search method, which can obtain good precision [16]. They are inspired by Darwin’s principles of natural evolution and Dawkins’ notion of a meme, defined as a unit of cultural evolution that can perform local refinements.

Biased Random Key Genetic Algorithm: In this study, our global search strategy is inspired by the Biased Random Key Genetic Algorithm (BRKGA), originally proposed by Gonçalves et al. [17]. The algorithm combines aspects of GAs with random keys that encode the solutions as real values. In our approach, however, we adopt real-coded values to represent each gene, instead of the keys. In BRKGA, the set of solutions is ordered by fitness value and divided into castes, called elite and non-elite groups. The initial population is randomly generated and the individuals are ordered into the castes. The crossover operator takes one parent from each set to create the offspring, following a probabilistic parameter that favors genes from the elite group. The mutation procedure is responsible for generating new individuals instead of modifying existing ones. Thus, the next population is composed of caste ‘A’ (the elite set) from the current population; caste ‘B’, formed by all offspring; and caste ‘C’, which contains the mutant solutions. The new set is then ordered to update the individuals in the castes.
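For concreteness, a minimal sketch of one BRKGA generation under these rules is shown below; the function names and the real-coded representation are our own illustration, not the reference implementation of [17].

```python
import random

def biased_crossover(elite_parent, non_elite_parent, rho_e=0.6):
    """Each gene is inherited from the elite parent with probability rho_e."""
    return [e if random.random() < rho_e else n
            for e, n in zip(elite_parent, non_elite_parent)]

def next_generation(population, fitness, new_individual, p_elite=0.2, p_mutant=0.3):
    """Caste 'A': elite kept; caste 'B': biased offspring; caste 'C': fresh mutants."""
    ranked = sorted(population, key=fitness)            # lower energy = better
    n_elite = int(p_elite * len(population))
    n_mutant = int(p_mutant * len(population))
    elite, non_elite = ranked[:n_elite], ranked[n_elite:]
    offspring = [biased_crossover(random.choice(elite), random.choice(non_elite))
                 for _ in range(len(population) - n_elite - n_mutant)]
    mutants = [new_individual() for _ in range(n_mutant)]
    return elite + offspring + mutants
```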

Search Space Discretization: The MD problem requires the definition of a search space around the binding site of the receptor. In this work, we adopt the idea proposed by Leonhart et al. [11], representing this area as a cube formed by smaller cubes, where the central atom of the ligand defines the center of this box. With this point and the area’s volume, it is possible to set the size of the smaller cubes. This division aims to better explore the whole ligand-receptor binding search space. At the beginning of the memetic execution, the population is generated and equally distributed among the subareas of the cube. To keep diversity, at least one individual must remain in each area. This feature builds a local competition, where the better solutions dispute locations among themselves. Procedures such as crossover, mutation, and the local search operators may move solutions between regions as the evolutionary process advances, but always respecting the percentage of non-migrant individuals. The crossover can create offspring in positions different from those of their parents, according to the cube of each one. The mutation process may eventually fill empty cubes, generating solutions in any area of the search space. The local search methods can also modify solutions of a subcube, depending on the approach used to visit the neighborhood solutions.
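As a sketch of this discretization, the ligand-center translation can be mapped to the subcube that contains it; the number of divisions per axis below is illustrative, not a value taken from the paper.

```python
import numpy as np

def subcube_index(translation, box_center, box_size=11.0, divisions=3):
    """Return the (i, j, k) index of the subcube containing the ligand center."""
    cell = box_size / divisions
    offset = np.asarray(translation) - (np.asarray(box_center) - box_size / 2)
    return tuple(np.clip((offset // cell).astype(int), 0, divisions - 1))
```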

Local Search Methods: Local search is a heuristic method that moves from solution to solution in the candidate space by performing local changes, until a local optimum or another stopping limit is reached. This movement is only possible if a neighborhood relation is defined over the search space. The neighborhood generation takes three aspects into account: (i) the order of gene visitation, i.e., which position is considered first during exploration; (ii) the radius perturbation, which defines how much each gene must be modified; and (iii) the direction of search, which indicates whether the radius value is added to or subtracted from the value encoded in the individual.
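The three aspects above can be made concrete in a small neighbor generator (a sketch under our own naming): it yields the 2n neighbors of a solution, in either a fixed or a shuffled gene order.

```python
import random

def neighbors(solution, radius, stochastic_order=False):
    """Yield the 2n neighbors: each gene perturbed by +radius and -radius."""
    order = list(range(len(solution)))
    if stochastic_order:
        random.shuffle(order)                  # randomized visitation (used by SHD)
    for i in order:
        for direction in (+1.0, -1.0):         # direction of search
            neighbor = list(solution)
            neighbor[i] += direction * radius  # radius perturbation
            yield neighbor
```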

The Hill Climbing algorithm (HC) [18], also known as descent improvement, is an old and simple local search that starts with an initial solution and replaces it, at each iteration, by the best neighbor found that improves the objective function value. The procedure ends when there is no solution better than the current one, i.e., a local optimum is reached. This algorithm presents variations in the way the neighborhood is explored. The order of solution generation (deterministic or stochastic) is one of them, as well as the strategy for selecting the next solution, known as the pivoting rule. We highlight three HC variants (pivoting rules) for selecting the best neighbor: (i) Best Improvement (BI) performs a full and deterministic search over the whole neighborhood. This strategy can be time-consuming because it evaluates all the movements, although it ensures the selection of the best solution in every iteration. (ii) First Improvement (FI) swaps the current solution with the first improving neighbor found. The neighbors are evaluated in a deterministic way, following a pre-defined order of candidate generation. This approach is faster than BI because it does not visit the whole neighborhood, except in the worst case, which means no improvement was found. (iii) Stochastic Hill Descent (SHD) is similar to FI. The swap rule is the same, but the order of neighbor visitation is randomized at each iteration of the process. This characteristic ensures an equal chance of selecting candidates from all regions of the search space.
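Reusing the neighbors generator above, a first-improvement step (FI, or SHD when the visitation order is shuffled) can be sketched as follows; the energy callable stands for any scoring function and is an assumption of this illustration.

```python
def first_improvement_step(solution, energy, radius, stochastic_order=False):
    """Move to the first neighbor that lowers the energy; FI uses a fixed
    visitation order, SHD shuffles it at every iteration."""
    current = energy(solution)
    for candidate in neighbors(solution, radius, stochastic_order):
        if energy(candidate) < current:
            return candidate, True            # improving move found
    return solution, False                    # local optimum for this radius
```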

Another local search method is Simulated Annealing (SA), a stochastic algorithm that sometimes accepts worse solutions [19]. The objective is to escape local optima and delay the search convergence. The process starts with an initial solution and, at each iteration, generates a random neighbor. The candidate is accepted if it is better than the current solution; if it is worse, a rule with a decreasing probability may still allow it to be accepted. Thus, the procedure initially behaves like a random walk, accepting many solutions, but as the probability decreases, the method becomes more similar to the HC algorithm.
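A minimal sketch of the acceptance rule, assuming the classical Metropolis criterion with a temperature parameter (the paper does not fix the exact probability schedule):

```python
import math
import random

def sa_accept(current_energy, candidate_energy, temperature):
    """Accept improvements always; accept worse neighbors with a probability
    that decays as the temperature is lowered."""
    if candidate_energy < current_energy:     # lower energy is better
        return True
    delta = candidate_energy - current_energy
    return random.random() < math.exp(-delta / temperature)
```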

A recent feature in LS is the concept of Local Search Chains, proposed by Molina et al. [20]. The idea is to share the state of the search between different local search applications. A single LS call may not explore the whole neighborhood of a solution, so the final strategy parameter values reached by the candidate become the initial state of a possible subsequent local search application on this individual. This strategy allows the LS operators to be extended in promising search zones, avoiding different algorithms re-evaluating the same candidates by chaining the searches. According to [20], there are some aspects to consider in the management of LS chains. One of them is a fixed search intensity, the LS intensity stretch (\(I_{str}\)), which ensures that every local search algorithm applies the same computational effort. Another is to save the configuration (visitation order) that guided the search and the current state at the end of the call. In the HC methods, for instance, the last neighbor generated is saved in all variants; in the FI and SHD approaches the visitation order of the neighborhood is also kept.
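The state shared between chained LS calls can be represented, for instance, by a small record kept per individual; the field names below are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class LSChainState:
    """State saved after a fixed-intensity LS call (I_str evaluations) so that
    a later call on the same individual resumes where the search stopped."""
    last_gene: int = 0                                          # last neighbor position generated
    visitation_order: List[int] = field(default_factory=list)  # kept for FI and SHD
    radius: float = 0.5                                         # perturbation used in this chain
```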

Adaptive Memetic Algorithms (MAs): Recent studies have combined hybridization with adaptation in MAs. The adjustment of parameters and operators has become a promising area in this class of evolutionary algorithms. Such an approach can self-adjust to a given problem without previous knowledge, by using the information acquired during the search to adapt itself as the search progresses. Designing a robust and efficient MA raises some questions to be considered, such as: (i) where and when the local search should be applied; (ii) which individuals should be improved and how they must be chosen; (iii) the computational effort required in each LS call; and (iv) the balance between global and local search. Several adaptive memetic algorithms have addressed these questions [1, 12, 13].

Multimeme Memetic Algorithm (MMA): The Multimeme Memetic Algorithm (MMA) was originally proposed by Krasnogor and Smith [21]. While a MA uses only one LS method, the MMA employs a set of local searchers. The idea of the algorithm is to self-adaptively choose from this set which method to use for different instances, phases of the search, or individuals in the population. In their approach, the individual is represented by its genetic and memetic material, where the latter specifies which meme will be used to perform local search.

3 The Proposed Method

In this paper, we propose and test an MMA for the MD problem. Our algorithm self-adaptively chooses which local search operators to apply to different individuals. We represent a solution only by its genetic material, i.e., the values encoding the conformational operations applied to the ligand structure, which include the translation and rotation of the molecule and the internal angle rotations. From this solution representation, we have a neighborhood of size 2n, where n is the number of genes in an individual. We have adapted and implemented the three variations of the HC algorithm and the Simulated Annealing (for more details see Sect. 2). We also incorporate LS chaining to best exploit the benefits of each algorithm applied. The main contribution of this study is the self-adaptive coordination of the local search methods and the radius perturbation, which works under a probabilistic rule composed of two main elements, inspired by the studies of Jakob [13, 22] and Domínguez-Isidro et al. [14]. Equation 1 shows these two terms: the ratio of fitness gain (rfg) and the ratio of successful applications (\(rs_{app}\)). The first element is a measure of how much each LS operator improves the fitness of solutions, considering the history of values reached and the effort applied in each call. The idea is to favor LS operators (LSOs) that perform few iterations while still improving the individuals. The second term is simpler and represents the benefit of the LS, counting how many times solutions were improved.

$$\begin{aligned} \begin{aligned} rfg = \left| \frac{f_{original}-f_{LS}}{f_{original}-f_{mean}}\right| \quad \quad rs_{app} = \frac{n_{improvements}}{n_{app}} \end{aligned} \end{aligned}$$
(1)

The rfg is obtained from the application of an LS to one individual, where \(f_{original}\) is the individual’s fitness before the local improvement, \(f_{LS}\) is the fitness reached by the algorithm, and \(f_{mean}\) is the average fitness of all solutions of caste ‘A’ at the moment of the LS call. The objective is to measure the effectiveness of the local operator over a solution, comparing the improvement with the best individuals of the population at that moment. The \(rs_{app}\) is the ratio between the number of improvements (\(n_{improvements}\)) and the number of applications (\(n_{app}\)) of a given LSO. To keep the methods competitive and cooperative, inspired by the study of Jin et al. [1], we adopt weights to measure the contribution of each algorithm, as Eq. 2 shows:

$$\begin{aligned} \begin{aligned} weight_{rfg} = \bigg (\frac{\sum _{i=1}^{n_{app}}rfg_{i}}{n_{app}}\bigg )*w_{rfg}&\quad \quad weight_{rs_{app}} = rs_{app}*w_{rs_{app}} \\&weight_{X} = weight_{rfg}*weight_{rs_{app}} \end{aligned} \end{aligned}$$
(2)

To measure the effectiveness of an LS, we consider its history of improvements by computing the average rfg and multiplying it by a fixed value, \(w_{rfg}\). In the same way, the success ratio is weighted by another fixed value, \(w_{rs_{app}}\). These two weight values sum to one. Finally, by multiplying both terms we obtain \(weight_{X}\), where X is any LS operator. After a local search application, the respective weight is recalculated, and the probability of each operator is then obtained as Eq. 3 shows:

$$\begin{aligned} \begin{aligned} prob_{X} = \frac{weight_{X}}{\sum _{j=1}^{n_{LS}}weight_{j}} \end{aligned} \end{aligned}$$
(3)

The probability of a given local search is obtained by computing its percentage contribution to the sum of the weights of all LS methods (\(n_{LS}\)). Thus, each operator has a value between 0 and 1, allowing a roulette wheel to be built to draw the algorithm to be applied. It is important to highlight that at the beginning of the MMA execution all LSOs receive the same number of evaluations, to fairly calibrate the initial weights. Along the execution, the best method gets a higher probability, and thus more opportunities to be applied. Algorithm 1 shows the pseudo-code of our approach. The input values P, \(P_{e}\), \(P_{m}\), n, \(I_{ls}\) represent the population size, the elite size, the mutant population size, the number of genes per individual, and the minimum number of individuals required to run an LS call, respectively. During the crossover and mutation processes, if the intensity stretch of the global search is reached, the local search function is called. The individuals to be improved are selected from the most populated cubes (see space discretization in Sect. 2). After each LS application, its contribution is calculated and the probabilities are adjusted for the next call. The output of the execution is the best solution found when the maximum number of energy evaluations is reached.

Algorithm 1. Pseudo-code of the proposed self-adaptive multimeme algorithm (figure omitted).
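A compact sketch of this self-adaptive coordination (Eqs. 1–3) is given below; the class and function names are ours, and the fixed weights \(w_{rfg}\) and \(w_{rs_{app}}\) follow the values reported in Sect. 4.

```python
import random

class OperatorStats:
    """Bookkeeping for one LS operator (or radius value), following Eqs. 1-3."""
    def __init__(self, w_rfg=0.6, w_rs=0.4):
        self.rfg_history, self.n_app, self.n_improvements = [], 0, 0
        self.w_rfg, self.w_rs = w_rfg, w_rs

    def update(self, f_original, f_ls, f_mean_elite):
        """Record one application: the Eq. 1 terms rfg and rs_app."""
        self.n_app += 1
        if f_ls < f_original:
            self.n_improvements += 1
        self.rfg_history.append(abs((f_original - f_ls) / (f_original - f_mean_elite)))

    def weight(self):
        """Eq. 2: weight_X = weight_rfg * weight_rs_app."""
        weight_rfg = (sum(self.rfg_history) / self.n_app) * self.w_rfg
        weight_rs = (self.n_improvements / self.n_app) * self.w_rs
        return weight_rfg * weight_rs

def select_operator(stats_by_name):
    """Eq. 3 plus roulette wheel: draw an operator proportionally to its weight."""
    names = list(stats_by_name)
    weights = [stats_by_name[n].weight() for n in names]
    return random.choices(names, weights=weights, k=1)[0]
```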

4 Experimental Results

To test our MA approach, we use a test set of 16 complexes containing receptor and ligand molecules. We perform the tests in the following stages: (i) comparison of the proposed MMA with two other self-adaptive algorithms, considering a fixed radius perturbation; (ii) the MMA approach with three variations of the search radius; (iii) finally, comparison of the self-adaptive algorithm with other state-of-the-art methods.

Benchmark: The 16 selected structures are based on the HIV-protease receptor and were previously classified in [23]. The set was obtained from the PDB database and divided into four groups according to the ligand structure size. The PDB codes and the range of crystallographic resolution in Ångströms (Å) are: small (1AAQ, 1B6L, 1HEG, 1KZK, 1.09–2.50); medium (1B6J, 1HEF, 1K6P, 1MUI, 1.85–2.80); large (1HIV, 1HPX, 1VIK, 9HVP, 2.00–2.80) size inhibitors; as well as cyclic urea (1AJX, 1BV9, 1G2K, 1HVH, 1.80–2.00) inhibitors. PyMOL was used to remove molecules such as solvent, non-interacting ions, and water from the target structures.

AutoDockTools was used to add partial charges and hydrogens to the structures, as well as to define the maximum number of torsional angles in the ligand. This maximum number of torsions was set to 10, but it can be lower depending on the ligand structure/size; their selection considers the angles that move fewer atoms, keeping the ligand center frozen [23]. Afterwards, the Open Babel tool [24] was used to convert and generate the files needed by our algorithm, which are then loaded by PyRosetta to evaluate the energy score of the complexes. As already mentioned, the search space is represented by a cube enclosing the binding site of the receptor; the size of this box was set to 11 Å on each axis, and the grid spacing was defined as 0.375 Å. From this definition, the initial ligand conformation and position inside the cube are randomized for each algorithm run, so that no advantage is taken of the known crystal structure.

Parametrization: Given the random initialization of the ligand in each run, it is fair that every algorithm starts from the same point. Thus, for each instance in each run, the initial population and the ligand structure were the same. For BRKGA we adopted the parameter values recommended in [17]. The population size is 150 individuals, where \(20\%\) form the elite group, \(30\%\) the mutant set, and the remainder are crossover offspring. The elite allele inheritance probability in the random choice of the crossover operation is 0.5–0.7, and the percentage of non-migrant individuals is also set to \(30\%\). The LS radius perturbation was set to 0.5 in the first experiments; afterwards, this value was also varied to 0.1 and 0.25. The intensity stretch was set to 500, and the global/local search ratio to 0.5, following the values adopted in [20]. Finally, the MMA weights were defined as 0.6 for the historical ratio of fitness gain and 0.4 for the ratio of successful LS applications. Since the method is stochastic, we executed 11 (first and second stages) and 31 (third stage) independent runs per instance, with a stopping criterion of 1,000,000 energy evaluations, thus acquiring statistical confidence in the presented results.
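For reference, the parameter setting described above can be gathered in a single configuration; the key names below are illustrative, not taken from our implementation.

```python
# Parameter values as described in the text; key names are illustrative.
PARAMS = {
    "population_size": 150,
    "elite_fraction": 0.20,                 # caste 'A'
    "mutant_fraction": 0.30,                # caste 'C'
    "elite_inheritance_prob": (0.5, 0.7),   # range used in the crossover
    "non_migrant_fraction": 0.30,
    "ls_radius_pool": [0.10, 0.25, 0.50],
    "ls_intensity_stretch": 500,            # I_str
    "global_local_ratio": 0.5,
    "w_rfg": 0.6,
    "w_rs_app": 0.4,
    "max_energy_evaluations": 1_000_000,
}
```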

Achieved Results: We divide our tests into three stages, described as follows. Firstly, we ran the MMA approach, working under the proposed probability function, and compared it with two other simple self-adaptation methods. One of them is the SIM technique [21]; the other is a random selection of the LS operator at the moment of application. In the SIM approach, we start the population by equally distributing the LS methods among the solutions; during the iterations, the crossover and mutation procedures change this memetic material. The random approach simply draws a method whenever the local search must be applied. All algorithms were applied to half of the instances, with 11 runs each. In this first test, we are interested in verifying the behavior of the probability-based approach against the inheritance and random methods. It is important to note that the radius perturbation is fixed at 0.5 for every gene. This value was chosen empirically in preliminary tests with the generation and evaluation of random individuals. We evaluated the results by their energy in kcal/mol and by the Root Mean Square Deviation (RMSD), which measures the structural deviation of the ligand conformation obtained by the algorithm from the crystallographic experimental structure of the molecule. Table 1 presents the best energy found and the corresponding RMSD, as well as the average and standard deviation of each one, based on the best solutions generated in each configuration.
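The RMSD metric used here follows the usual definition over matched ligand atom coordinates; a minimal sketch, assuming both coordinate arrays list the same atoms in the same order:

```python
import numpy as np

def rmsd(predicted_coords, reference_coords):
    """RMSD (Å) between the predicted pose and the crystallographic reference."""
    diff = np.asarray(predicted_coords) - np.asarray(reference_coords)
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))
```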

Table 1. Achieved results from the MMA, based on the probability function, compared with the inheritance (SIM) and random (RAND) approaches. Cells highlighted in gray show the best solution obtained, and cells in blue the best solution averages, for each instance.

The SIM method has a slight advantage over our proposal when considering only the best energies found, but looking at the averages, we obtain the best values in half of the instances. Regarding RMSD, the probability approach obtains the best solutions in five test cases, and with respect to the averages, the best results are divided between the algorithms. We applied a nonparametric test for multiple comparisons, the Dunn test [25], to analyze the statistical significance of the achieved results. The test indicated that there is no difference between the compared approaches. In the second stage, we applied our MMA algorithm varying the radius perturbation of the LS over the values 0.10, 0.25, and 0.50. The idea of this stage is to verify whether different modifications of the genes might improve the potential of the applied local search method. Table 2 summarizes the results achieved over the same benchmark used in the first test. In the same way, we evaluate the energy and RMSD of the best solutions, as well as the averages and standard deviations. The results show an almost equal distribution of the best values among the three variations, whether comparing only the best energy reached or the average. Thus, we conclude that all variations of the search radius can improve the LS process; for example, a larger value can be more adequate at the beginning of the search, while a smaller one is better in the refinement stage of the algorithm.

Table 2. Achieved results from the MMA approach with three variations of the LS radius perturbation. The third and fourth columns contain the lowest energy (kcal/mol) and its RMSD values for each version. Cells shaded in gray highlight the best solution obtained, and cells shaded in blue the best solution averages, for each instance.
Table 3. Comparison of MMA with BRKGA, two MA methods, AutoDock Vina, DockThor, and jMetal. The lowest energy (kcal/mol) and its RMSD values in Å are shown (best values highlighted in gray), as well as their respective averages and standard deviations (cells shaded in blue for the best values) for 31 runs of each algorithm.

In the same way, the Dunn test showed that only in three instances (1B6L, 1G2K, and 1HEF) is there a statistical difference between the three variations of the MMA algorithm. With these preliminary results, we decided to extend our proposed probability function to also choose which radius perturbation to use during the optimization process, combining it with the previous variation of the LS methods. In this third stage, the weights and probabilities of the radius pool are updated after each local search execution. Likewise, the LS algorithms are applied and evaluated with whichever radius perturbation is selected. We then compare the MMA approach against a BRKGA version without LS and two other MA versions using SHD and SA. Additionally, we compare these results with AutoDock Vina [5], DockThor [26], and jMetal [6], a multi-objective docking approach. Table 3 shows the achieved results. The comparison of energy values was made only among our implemented methods because of the energy function utilized, which differs from that of the other methods. The RMSD comparison was made among all algorithms. The values show that the MMA is better than BRKGA when we compare the averages of energy and RMSD. Comparing the SHD and SA approaches with the self-adaptive one, the values are very similar. Considering the average RMSD values, the MMA obtained better results than the state-of-the-art algorithms only for the 1AAQ complex. The best method overall was DockThor, with better RMSD values in \(81\%\) of the instances used in this work.

In the same way, we applied the Dunn test in this stage. Table 4 shows the significance levels for energy and RMSD, adopting \(\alpha < 0.05\), which guarantees a confidence of \(95\%\) in the analysis. Cells above the main diagonal (highlighted) show the p-values for the energy results, and the remaining cells show the RMSD comparisons. Comparing MMA with BRKGA, we find a significant difference in all instances regarding energy and in \(50\%\) of them regarding RMSD. Looking at the differences between the self-adaptive approach and SHD and SA, there are significant energy differences only for the complexes 9HVP and 1AAQ, respectively. Thus, the MMA shows itself to be better than the BRKGA method, but equivalent to the MA versions.

Table 4. Analysis of results with a significance level of \(p < 0.05\); cells above the main diagonal show the p-values for energy, and the entries below the diagonal show the p-values for RMSD.

5 Conclusion and Future Work

The statistical tests performed show that the multimeme approach improves the results for the MD problem compared with using only an evolutionary algorithm. However, the comparisons with the memetic implementations showed the techniques to be equivalent. This can be explained by the fact that the MMA version uses four LS algorithms in a distributed way, instead of a single one throughout the full search process. In this case, we may be wasting computational effort on methods that do not contribute much to finding local optima. Also, the MA adopts a fixed radius, whereas the multimeme version self-adapts among three values. This feature is interesting because the methods can explore different neighborhoods in different stages of the evolution and thus compensate for a possible loss of processing. Hence, this pool of local search methods combined with the pool of radius perturbations proves to be an interesting initial approach for reaching reasonable solutions to this problem, although it still needs improvement.

This work contributes to the use of a self-adaptive memetic computational technique, as well as the proposal and application of a function to evaluate, at runtime, the impact of the local search, in order to better adapt the parameters and guide the search process. Further investigations might consider different global search algorithms, e.g., Differential Evolution and Particle Swarm Optimization, as well as other methods to run as local searches, such as Solis and Wets and Nelder-Mead. Besides that, the use of another objective function to evaluate the binding energy, such as that of AutoDock Vina, would be interesting to confirm the performance of the algorithm. The minimum energy results would be different, but the capability of the MMA would remain the same.

Regarding the self-adaptive approach, it is possible to change the pool of LSOs, using only the SHD and SA methods, for instance. The values for the radius perturbation can also be modified, and different values can be applied to each gene, i.e., the translation operation may have a specific value, different from those of the rotation and dihedral angles. Also, our probability function still has issues to be explored, from the weights used in both terms to other details. The option of considering the average fitness of caste ‘A’ can be replaced by the fitness value of the best solution only, the average of all solutions, or the average of the best individuals from each subcube. Another possibility is to implement a time window to evaluate the LS contributions, considering, for example, the last one hundred thousand evaluations. Finally, future work can explore many features to make this approach better at finding solutions to the molecular docking problem.