
Using selfish gene theory to construct mutual information and entropy based clusters for bivariate optimizations

Soft Computing

Abstract

This paper proposes a new approach named SGMIEC in the field of estimation of distribution algorithms (EDAs). Current EDAs spend much time in the statistical learning process because the relationships among the variables are complicated; this approach therefore deploys the selfish gene theory (SG) together with a mutual information and entropy based cluster (MIEC) model with an incremental learning and resampling scheme to optimize the probability distribution of the virtual population. Experimental results on several benchmark problems demonstrate that, compared with BMDA, COMIT and MIMIC, SGMIEC often performs better in convergence reliability, convergence speed and the convergence process.


References

  • Ahn CW, Ramakrishna RS (2008) On the scalability of real-coded Bayesian optimization algorithm. IEEE Trans Evol Comput 12(3):307–322


  • Baluja S (1994) Population-based incremental learning: a method for integrating genetic search based function optimization and competitive learning. Technical Report CMU-CS-94-163, Carnegie Mellon University, Pittsburgh

  • Baluja S, Davies S (1997) Using optimal dependency-trees for combinatorial optimization. In: ICML ’97: Proceedings of the fourteenth international conference on machine learning, San Francisco, pp 30–38

  • Baluja S, Davies S (1998) Fast probabilistic modeling for combinatorial optimization. In: Proceedings of 15th national conference on artificial intelligence (AAAI), pp 469–476

  • Bonet J, Isbell CL, Viola P (1997) MIMIC: finding optima by estimating probability densities. In: Advances in neural information processing systems, vol 9. MIT Press, Cambridge, pp 424–430

  • Corno F, Reorda MS, Squillero G (1998a) A new evolutionary algorithm inspired by the selfish gene theory. In: ICEC’98: IEEE international conference on evolutionary computation, pp 575–580

  • Corno F, Reorda MS, Squillero G (1998b) The selfish gene algorithm: a new evolutionary optimization strategy. In: SAC’98: 13th annual ACM symposium on applied computing, Atlanta, pp 349–355

  • Cover TM, Thomas JA (2006) Elements of information theory, 2nd edn. Wiley series in telecommunications and signal processing. Wiley, New York

  • Dawkins R (1989) The selfish gene—new edition. Oxford University Press, Oxford

  • Harik G (1999) Linkage learning via probabilistic modeling in the ECGA. Technical report, University of Illinois at Urbana-Champaign

  • Harik GR, Lobo FG, Goldberg DE (1999) The compact genetic algorithm. IEEE Trans Evol Comput 3(4):287–297


  • Harik GR, Lobo FG, Sastry K (2006) Linkage learning via probabilistic modeling in the extended compact genetic algorithm (ECGA). In: Scalable optimization via probabilistic modeling, pp 39–61

  • Hong Y, Kwong S, Wang H, Xie ZH, Ren Q (2008) SVPCGA: selection on virtual population based compact genetic algorithm. In: IEEE congress on evolutionary computation, pp 265–272

  • Larranaga P, Lozano J (2002) Estimation of distribution algorithms: a new tool for evolutionary computation. Kluwer, Boston


  • Muhlenbein H, Paass G (1996) From recombination of genes to the estimation of distributions I. Binary parameters. In: PPSN IV: Proceedings of the 4th international conference on parallel problem solving from nature, London, pp 178–187

  • Pelikan M, Muhlenbein H (1998) Marginal distributions in evolutionary algorithms. In: Proceedings of the international conference on genetic algorithms Mendel’98, pp 90–95

  • Pelikan M, Muhlenbein H (1999) The bivariate marginal distribution algorithm. In: Advances in soft computing: engineering design and manufacturing. Springer, London, pp 521–535

  • Pelikan M, Goldberg DE, Cantu-Paz E (1999) BOA: the Bayesian optimization algorithm. In: Proceedings of the genetic and evolutionary computation conference (GECCO-99). Morgan Kaufmann, pp 525–532

  • Yang S, Yao X (2008) Population-based incremental learning with associative memory for dynamic environments. IEEE Trans Evol Comput 12(5):542–561


  • Yang SY, Ho SL, Ni GZ, Machado JM, Wong KF (2007) A new implementation of population based incremental learning method for optimizations in electromagnetics. IEEE Trans Magn 43(4):1601–1604


  • Yu T-L, Goldberg DE (2004) Dependency structure matrix analysis: offline utility of the dependency structure matrix genetic algorithm. In: GECCO (2), pp 355–366

  • Yu T-L, Sastry K, Goldberg DE, Pelikan M (2007) Population sizing for entropy-based model building in discrete estimation of distribution algorithms. In: GECCO, pp 601–608

  • Zhang Q, Zhou A, Jin Y (2008) RM-MEDA: a regularity model-based multiobjective estimation of distribution algorithm. IEEE Trans Evol Comput 12(1):41–63


  • Zhou SD, Sun ZQ (2007) A survey on estimation of distribution algorithms. Acta Autom Sin 33(2):113–124


  • Zhou A, Zhang Q, Jin Y, Sendhoff B (2008) Combination of EDA and DE for continuous biobjective optimization. In: IEEE congress on evolutionary computation, pp 1447–1454


Acknowledgments

This work was supported by the Research Project of Wuhan University under Grant 6082018.

Author information

Correspondence to Feng Wang.

Appendix

The first benchmark problem is the One-Max:

$$ \max \left( \sum\limits_{i = 1}^n {x_i } \right)\quad x_i \in \left\{0,1 \right\} $$

The second benchmark problem is the Weighted One-Max:

$$ \max \left( {\sum\limits_{i = 1}^n {i x_i } } \right)\quad x_i \in \left\{ 0,1 \right\} $$
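For concreteness, here is a minimal Python sketch of these first two fitness functions; the function names and the bit-list representation are illustrative choices of ours, not from the paper:

```python
def one_max(x):
    """One-Max: the fitness is simply the number of ones in x."""
    return sum(x)

def weighted_one_max(x):
    """Weighted One-Max: bit i (1-based) contributes i when set,
    so the optimum is still the all-ones string."""
    return sum(i * xi for i, xi in enumerate(x, start=1))
```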

The third benchmark problem is the four peaks:

Given an N-dimensional input vector X, the four peaks evaluation function is defined as:

$$ f\left({\mathbf{X}},T \right) = \max \left[ {\hbox{tail}\left( 0,{\mathbf{X}} \right), \hbox{head}\left( 1,{\mathbf{X}} \right)} \right] + R\left( {\mathbf{X}},T \right) $$

where

$$\begin{aligned} &\hbox{tail}(b,{\mathbf{X}}) = \hbox{number\;of\;trailing\;b's\;in}\;{\mathbf{X}}\\ &\hbox{head}(b,{\mathbf{X}}) = \hbox{number\;of\;leading\;b's\;in}\;{\mathbf{X}}\\ &R({\mathbf{X}},T) = \left\{ \begin{array}{ll} N & \hbox{if}\;\hbox{tail}(0,{\mathbf{X}}) > T\;\hbox{and}\;\hbox{head}(1,{\mathbf{X}}) > T \\ 0 & \hbox{otherwise} \\ \end{array} \right. \end{aligned}$$

In all trials, T was set to 10% of N, the size of the problem.
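A straightforward Python rendering of these definitions (again with illustrative names, and bitstrings as lists of 0/1 integers):

```python
def head(b, x):
    """Number of leading b's in the bit list x."""
    n = 0
    for bit in x:
        if bit != b:
            break
        n += 1
    return n

def tail(b, x):
    """Number of trailing b's in the bit list x."""
    return head(b, list(reversed(x)))

def four_peaks(x, t):
    """f(X, T) = max(tail(0, X), head(1, X)) + R(X, T), where the
    bonus R is N when both runs exceed the threshold T, else 0."""
    bonus = len(x) if tail(0, x) > t and head(1, x) > t else 0
    return max(tail(0, x), head(1, x)) + bonus
```

The bonus term is what couples the two ends of the string: a solver must keep a long run of leading ones and a long run of trailing zeros at the same time to collect it.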

The fourth benchmark problem is the trap problem. The general k-bit trap functions are defined as

$$ F_k ( {b_1,\ldots,b_k }) = \left\{ \begin{array}{ll} f_{\rm high} \hfill & \hbox{if}\ u = k \hfill \\ f_{\rm low} - (u \times f_{\rm low})/(k - 1) \hfill & \hbox{otherwise} \hfill \\ \end{array} \right. $$

where \( b_i \in \{0,1\} \), \( u = \sum\nolimits_{i = 1}^k {b_i} \) and \( f_{\rm high} > f_{\rm low} \). Usually, \( f_{\rm high} \) is set at k and \( f_{\rm low} \) is set at k − 1. The trap functions denoted by \( F_{m \times k} \) are defined as

$$ F_{m \times k} \left( {k_1,\ldots,k_m } \right) = \sum\limits_{i = 1}^m{F_k \left( {k_i } \right)} ,k_i \in \left\{ 0,1 \right\}^k $$

The m and k are varied to produce a number of test functions. In all trials, k was set to 5.
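A sketch in the same style, using the usual settings f_high = k and f_low = k − 1 from the text:

```python
def trap_k(block):
    """k-bit trap: f_high at the all-ones block; otherwise the slope
    f_low - u * f_low / (k - 1), which rewards removing ones."""
    k = len(block)
    f_high, f_low = k, k - 1
    u = sum(block)
    return f_high if u == k else f_low - (u * f_low) / (k - 1)

def trap_m_by_k(x, k=5):
    """F_{m x k}: sum of k-bit traps over m disjoint blocks of x."""
    return sum(trap_k(x[i:i + k]) for i in range(0, len(x), k))
```

Within each block the gradient points toward all zeros, while the global optimum of the block is all ones, which is exactly what defeats algorithms that treat variables independently.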

The fifth benchmark problem is the Satisfaction problem:

$$ \max \left({\sum\limits_{i = 1}^{\frac{n}{5}} {f\left({x_{5i - 4} ,x_{5i - 3} ,x_{5i - 2}, x_{5i - 1}, x_{5i} } \right)}}\right)\quad x_i \in \left\{ 0,1 \right\} $$

where \( f(x_{5i - 4}, x_{5i - 3}, x_{5i - 2}, x_{5i - 1}, x_{5i}) \) equals 5 if and only if all five variables equal 1; otherwise it equals 0.
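As a minimal sketch, assuming n is a multiple of 5:

```python
def satisfaction(x):
    """Each disjoint 5-bit block scores 5 if all its bits are 1,
    and 0 otherwise; the optimum is the all-ones string."""
    return sum(5 if all(x[i:i + 5]) else 0 for i in range(0, len(x), 5))
```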

The sixth benchmark problem is the deceptive-3 problem:

$$ \max \sum\limits_{i = 1}^{\frac{n}{3}}{f\left( {x_{3i - 2} ,x_{3i - 1} ,x_{3i}} \right)} \quad x_i \in \left\{ 0,1 \right\} $$

where

$$ u = x_{3i - 2} + x_{3i - 1} + x_{3i} $$

and

$$ f(u) = \left\{ \begin{array}{ll} 0.9& \hbox{if} \quad u = 0; \\ 0.8& \hbox{if} \quad u = 1; \\ 0.0& \hbox{if} \quad u = 2; \\ 1.0 & \hbox{otherwise}; \\ \end{array} \right. $$

This is a hard deceptive problem with a large number of local optima.
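A direct Python transcription of f over disjoint 3-bit blocks (assuming n is a multiple of 3):

```python
def deceptive3(x):
    """Each disjoint 3-bit block scores by its number of ones u:
    0.9, 0.8, 0.0, 1.0 for u = 0, 1, 2, 3. The gradient points
    toward all zeros while the block optimum is all ones."""
    score = {0: 0.9, 1: 0.8, 2: 0.0, 3: 1.0}
    return sum(score[sum(x[i:i + 3])] for i in range(0, len(x), 3))
```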


Cite this article

Wang, F., Lin, Z., Yang, C. et al. Using selfish gene theory to construct mutual information and entropy based clusters for bivariate optimizations. Soft Comput 15, 907–915 (2011). https://doi.org/10.1007/s00500-010-0557-3
