DOI: 10.1145/3319619.3326899
research-article

Towards better estimation of statistical significance when comparing evolutionary algorithms

Published: 13 July 2019

ABSTRACT

Well-established statistical testing procedures, when used to compare the performance of evolutionary algorithms, often yield pessimistic results. Obtaining results with the necessary precision then requires more independent samples, and thus more computation time.

We aim to improve this situation by developing statistical tests that are well suited to answering the typical questions that arise when benchmarking evolutionary algorithms. Our first step, presented in this paper, is a procedure that determines whether the performance distributions of two given algorithms are identical on each of the benchmarks. Our experimental study shows that this procedure can detect very small differences in algorithm performance while requiring computational budgets that are an order of magnitude smaller (e.g. 15x) than those of existing approaches.
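For orientation, the kind of conventional pipeline the abstract contrasts itself with can be sketched as follows. This is not the procedure proposed in the paper, which the abstract does not describe. It is a minimal illustration, assuming a per-benchmark two-sided Mann-Whitney U test (normal approximation, midranks for ties) combined with a Bonferroni correction across benchmarks. The function names `mann_whitney_p` and `compare_per_benchmark` and the input layout (a dict mapping benchmark name to a list of run results) are hypothetical.

```python
import math
from itertools import chain


def mann_whitney_p(xs, ys):
    """Two-sided Mann-Whitney U p-value via the normal approximation.

    Uses midranks for ties and the tie-corrected variance.
    No continuity correction is applied (a simplifying assumption).
    """
    n, m = len(xs), len(ys)
    total = n + m
    # Pool both samples, tagging each value with its origin (0 = xs, 1 = ys).
    pooled = sorted(chain(((v, 0) for v in xs), ((v, 1) for v in ys)))

    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < total:
        # Find the block of equal values starting at position i.
        j = i
        while j < total and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j

    u = rank_sum_x - n * (n + 1) / 2.0
    mean = n * m / 2.0
    var = n * m / 12.0 * ((total + 1) - tie_term / (total * (total - 1)))
    if var <= 0:
        return 1.0  # all values identical: no evidence of a difference
    z = (u - mean) / math.sqrt(var)
    return math.erfc(abs(z) / math.sqrt(2))


def compare_per_benchmark(runs_a, runs_b, alpha=0.05):
    """Return {benchmark: (p_value, rejected)} using a Bonferroni threshold."""
    k = len(runs_a)
    results = {}
    for name in runs_a:
        p = mann_whitney_p(runs_a[name], runs_b[name])
        results[name] = (p, p < alpha / k)
    return results
```

Because the Bonferroni threshold `alpha / k` shrinks as the number of benchmarks grows, a pipeline of this shape tends to need more independent runs per benchmark before a small difference is declared significant.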

Published in

GECCO '19: Proceedings of the Genetic and Evolutionary Computation Conference Companion
July 2019
2161 pages
ISBN: 9781450367486
DOI: 10.1145/3319619

Copyright © 2019 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
