How Many Times Should One Run a Computational Simulation?

Seri, Raffaello; Secchi, Davide

doi:10.1007/978-3-319-66948-9_11

Part of the book series: Understanding Complex Systems (UCS)

Abstract

This chapter attempts to answer the question “how many runs of a computational simulation should one do,” and it does so by means of statistical analysis. After defining the nature of the problem and identifying which types of simulation are most affected by it, the chapter introduces statistical power analysis as a way to determine the appropriate number of runs. Two examples are then produced using results from an agent-based model. The reader is guided through the application of this statistical technique and exposed to its limits and potential.

Notes

1.
Note, moreover, that the researcher should not test a hypothesis on the data that have been used to generate it.
2.
The metaphor of the trial has been introduced in Neyman and Pearson (1933, p. 296) but has been criticized as misleading in Liu and Stone (2007).
3.
Here ∈ means “belongs to,” so that $T\in \mathcal {A}$ means “T belongs to $\mathcal {A}$.”
4.
We note that Neyman (1950, p. 259) used the term “accept” where most modern treatments propose to use “fail to reject” or “do not reject.” The author’s original choice is in line with his idea of testing as leading to a decision, while the modern usage appears to be incorrectly borrowed from Fisher’s approach (Fisher 1955, p. 73). However, Pearson was more cautious (Pearson 1955, p. 206), and this even suggested to some authors that he had rejected the approach pioneered with Neyman (Mayo 1992).
5.
More generally, the effect size d measures the distance of the true distribution from the distribution under the null hypothesis, and is generally a function of the parameters.
6.
This also explains why in some cases it is possible to increase the power of a test by designing an experiment in which it is expected that the effect size d, if not null, is large. As an example, in ABM this could be done by setting some of the quantities entering the model to their extreme values.
7.
See also van der Vaart (2000, p. 213) or Choirat and Seri (2012, Proposition 7, p. 285).
8.
The authors say: “The use of these statistical tools in any given case, in determining just how the balance should be struck, must be left to the investigator” (Neyman and Pearson 1933, p. 296).
9.
The number of citations of the original paper (Cohen et al. 1972) amounts to 9196 in Google Scholar and to 1864 in Thomson’s Web of Science.
10.
Even though we use this method for ABM, it may prove useful for any simulation with emergent properties derived from a relevant stochastic component.
11.
In an interesting exchange with Bruce Edmonds, we came to realize that this approach might raise some important issues. One concern is that thresholds do not usually adjust just because an experiment is so well planned that its results turn out to be extremely clear; that is, good experimental work still accepts or rejects hypotheses at the level α < 0.05 with 1 − β ≈ 0.80. This implies that adjusting these levels for simulation work appears arbitrary. Our position on this critique is that thresholds do change in practice, as happens in some medical studies, where 1 − β rises to 0.90 (Lakatos 2005), and as urged by the calls from both social scientists (Gigerenzer 2004) and statisticians (Wasserstein and Lazar 2016) not to interpret the traditional choices of α levels as absolute. While a complete review of the reasons leading to the traditional choices of α and β is in Secchi and Seri (2017), the introduction to testing theory above should have made clear that the fathers of this theory thought of α and β as quantities to be chosen according to the problem at hand. This justifies our proposals insofar as artificial computational experiments cannot be compared to real-life experiments, because of the different variability of observations, the observer’s control and role, and the usual difficulty of increasing sample size for empirical experiments.
12.
This formula can be used in R with an ad hoc function taken from one of our previous publications (Secchi and Seri 2017). See the Appendix for the code for both formulas.
13.
A possibility is to choose, as SESOI, the lower bound of a confidence interval on the effect size with a specified confidence probability, e.g., 0.95 or 0.90.
14.
See the Appendix for details on how the effect sizes of ANOVA and OLS regression map onto each other.
15.
Over-power reduces β well below the chosen value of α. This is a problem because Type-I errors are generally perceived as more serious than Type-II errors, and when β ≪ α we expect exactly a higher incidence of serious errors and a lower incidence of less serious ones. That is the reason why, at least in the intentions of Neyman and Pearson, α and β should have been chosen in a balanced way.

References

Anderson, P. (1972). More is different. Science, 177(4047), 393–396.
Bardone, E. (2016). Intervening via chance-seeking. In D. Secchi & M. Neumann (Eds.), Agent-based simulation of organizational behavior. New frontiers of social science research (pp. 203–220). New York: Springer.
Bland, J. M. (2009). The tyranny of power: Is there a better way to calculate sample size? BMJ, 339, b3985.
Champely, S., Ekstrom, C., Dalgaard, P., Gill, J., Weibelzahl, S., & Rosario, H. D. (2016). Pwr: Basic functions for power analysis.
Choirat, C., & Seri, R. (2012). Estimation in discrete parameter models. Statistical Science, 27(2), 278–293.
Coen, C. (2009). Simple but not simpler. Introduction CMOT special issue–simple or realistic. Computational and Mathematical Organization Theory, 15, 1–4.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale: LEA.
Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155–159.
Cohen, M. D., March, J. G., & Olsen, J. P. (1972). A garbage can model of organizational choice. Administrative Science Quarterly, 17(1), 1–25.
Davidsson, P., & Verhagen, H. (2017). Types of simulation. doi: https://doi.org/10.1007/978-3-319-66948-9_3.
de Marchi, S., & Page, S. E. (2014). Agent-based models. Annual Review of Political Science, 17(1), 1–20.
Edmonds, B., & Meyer, R. (2017). Introduction to the handbook. doi: https://doi.org/10.1007/978-3-319-66948-9_1.
Edmonds, B., & Moss, S. (2005). From KISS to KIDS — an ‘anti-simplistic’ modelling approach. In P. Davidson (Ed.), Multi agent based simulation. Lecture Notes in Artificial Intelligence (Vol. 3415, pp. 130–144). New York: Springer.
Erdfelder, E. (1984). Zur Bedeutung und Kontrolle des β-Fehlers bei der inferenzstatistischen Prüfung log-linearer Modelle [The significance and control of the β-error during the inference-statistical examination of the log-linear models]. Zeitschrift für Sozialpsychologie, 15(1), 18–32.
Fioretti, G. (2016). Emergent organizations. In D. Secchi & M. Neumann (Eds.), Agent-based simulation of organizational behavior. New frontiers of social science research (pp. 19–41). New York: Springer.
Fioretti, G., & Lomi, A. (2008). An agent-based representation of the garbage can model of organizational choice. Journal of Artificial Societies and Social Simulation, 11(1).
Fioretti, G., & Lomi, A. (2010). Passing the buck in the garbage can model of organizational choice. Computational and Mathematical Organization Theory, 16(2), 113–143.
Fisher, R. (1955). Statistical methods and scientific induction. Journal of the Royal Statistical Society. Series B (Methodological), 17(1), 69–78.
Gigerenzer, G. (2004). Mindless statistics. Journal of Socio-Economics, 33, 587–606.
Gilbert, N., & Terna, P. (2000). How to build and use agent-based models in social science. Mind and Society, 1, 57–72.
Hahn, G. J., & Meeker, W. Q. (2011). Statistical intervals: A guide for practitioners. Hoboken: Wiley.
Heckbert, S. (2013). MayaSim: An agent-based model of the ancient Maya social-ecological system. Journal of Artificial Societies and Social Simulation, 16(4), 11.
Herath, D., Secchi, D., & Homberg, F. (2015). Simulating the effects of disorganisation on employee goal setting and task performance. In D. Secchi & M. Neumann (Eds.), Agent-based simulation of organizational behavior. New frontiers of social science research (pp. 63–84). New York: Springer.
Herath, D., Costello, J., & Homberg, F. (2017). Team problem solving and motivation under disorganization – an agent-based modeling approach. Team Performance Management, 23(1/2), 46–65.
Hoenig, J. M., & Heisey, D. M. (2001). The abuse of power. The American Statistician, 55(1), 19–24.
Kollman, K., Miller, J. H., & Page, S. E. (1992). Adaptive parties in spatial elections. The American Political Science Review, 86(4), 929–937.
Korn, E. L. (1990). Projecting power from a previous study: Maximum likelihood estimation. The American Statistician, 44(4), 290–292.
Lakatos, E. (2005). Sample size determination for clinical trials. In Encyclopedia of biostatistics. Hoboken: Wiley.
Lakens, D. (2014). Performing high-powered studies efficiently with sequential analyses. European Journal of Social Psychology, 44(7), 701–710.
Lakens, D., & Evers, E. R. K. (2014). Sailing from the seas of chaos into the corridor of stability: Practical recommendations to increase the informational value of studies. Perspectives on Psychological Science, 9(3), 278–292.
Lamperti, F. (2015). An Information Theoretic Criterion for Empirical Validation of Time Series Models. LEM Papers Series 2015/02, Laboratory of Economics and Management (LEM), Sant’Anna School of Advanced Studies, Pisa, Italy.
Liu, X. S. (2014). Statistical power analysis for the social and behavioral sciences. New York: Routledge.
Liu, T., & Stone, C. C. (2007). Law and statistical disorder: Statistical hypothesis test procedures and the criminal trial analogy. SSRN Scholarly Paper ID 887964, Social Science Research Network, Rochester, NY.
Maggi, E., & Vallino, E. (2016). Understanding urban mobility and the impact of public policies: The role of the agent-based models. Research in Transportation Economics, 55, 50–59.
Maxwell, S. E., Kelley, K., & Rausch, J. R. (2008). Sample size planning for statistical power and accuracy in parameter estimation. Annual Review of Psychology, 59(1), 537–563.
Mayo, D. G. (1992). Did Pearson reject the Neyman-Pearson philosophy of statistics? Synthese, 90(2), 233–262.
Mungovan, D., Howley, E., & Duggan, J. (2011). The influence of random interactions and decision heuristics on norm evolution in social networks. Computational and Mathematical Organization Theory, 17(2), 152–178.
Neyman, J. (1950). First course in probability and statistics. New York: Henry Holt and Company.
Neyman, J., & Pearson, E. S. (1928). On the use and interpretation of certain test criteria for purposes of statistical inference: Part I. Biometrika, 20A(1/2), 175–240.
Neyman, J., & Pearson, E. S. (1933). On the problem of the most efficient tests of statistical hypotheses. Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character, 231, 289–337.
Pearson, E. S. (1955). Statistical concepts in their relation to reality. Journal of the Royal Statistical Society. Series B (Methodological), 17(2), 204–207.
Railsback, S. F., & Grimm, V. (2011). Agent-based and individual-based modeling: A practical introduction. Princeton: Princeton University Press.
Ritter, F. E., Schoelles, M. J., Quigley, K. S., & Cousino-Klein, L. (2011). Determining the numbers of simulation runs: Treating simulations as theories by not sampling their behavior. In L. Rothrock & S. Narayanan (Eds.), Human-in-the-loop simulations: Methods and practice (pp. 97–116). London: Springer.
Robinson, S. (2014). Simulation. The practice of model development and use (2nd ed.). New York: Palgrave.
Schönbrodt, F. D., & Perugini, M. (2013). At what sample size do correlations stabilize? Journal of Research in Personality, 47(5), 609–612.
Secchi, D. (2015). A case for agent-based model in organizational behavior and team research. Team Performance Management, 21(1/2), 37–50.
Secchi, D., & Gullekson, N. (2016). Individual and organizational conditions for the emergence and evolution of bandwagons. Computational and Mathematical Organization Theory, 22(1), 88–133.
Secchi, D., & Seri, R. (2014). ‘How many times should my simulation run?’ Power analysis for agent-based modeling. In European Academy of Management Annual Conference, Valencia, Spain.
Secchi, D., & Seri, R. (2017). Controlling for ‘false negatives’ in agent-based models: A review of power analysis in organizational research. Computational and Mathematical Organization Theory, 23(1), 94–121.
Shimazoe, J., & Burton, R. M. (2013). Justification shift and uncertainty: Why are low-probability near misses underrated against organizational routines? Computational and Mathematical Organization Theory, 19(1), 78–100.
Simon, H. A. (1976). How complex are complex systems. In PSA: Proceedings of the Biennial Meeting of the Philosophy of Science Association (Vol. 2, pp. 507–522). Baltimore: Philosophy of Science Association.
Simon, H. A. (1978). Rationality as process and as product of thought. American Economic Review, 68, 1–14.
Simon, H. A. (1997). Administrative behavior (4th ed.). New York: The Free Press.
Thiele, J., Kurth, W., & Grimm, V. (2015). Facilitating parameter estimation and sensitivity analysis of agent-based models: A cookbook using NetLogo and R. Journal of Artificial Societies and Social Simulation, 17(3), 11.
Thomsen, S. E. (2016). How docility impacts team efficiency. An agent-based modeling approach. In D. Secchi & M. Neumann (Eds.), Agent-based simulation of organizational behavior. New frontiers of social science research (pp. 159–173). New York: Springer.
Troitzsch, K. G. (2017). Historical introduction. doi: https://doi.org/10.1007/978-3-319-66948-9_2.
van der Vaart, A. W. (2000). Asymptotic statistics. Cambridge: Cambridge University Press.
Wasserstein, R. L., & Lazar, N. A. (2016). The ASA’s statement on p-values: Context, process, and purpose. American Statistician, 70(2), 129–133.
Wilensky, U. (1999). NetLogo. Center for Connected Learning and Computer-Based Modeling, Northwestern University, Evanston, IL.

Author information

Authors and Affiliations

University of Insubria, Varese, Italy
Raffaello Seri
University of Southern Denmark, Slagelse, Denmark
Davide Secchi

Corresponding author

Correspondence to Davide Secchi.

Editor information

Editors and Affiliations

Centre for Policy Modelling, Manchester Metropolitan University, Business School, Manchester, United Kingdom
Bruce Edmonds
Centre for Policy Modelling, Manchester Metropolitan University, Business School, Manchester, United Kingdom
Ruth Meyer

Appendices

Appendix

11.2.1 Number of Runs Calculations

The following is the R code for a function that calculates the number of runs from the number of parameter configurations (G, called G in the code) and the effect size (f, called ES in the code), given 1 − β = 0.95 and α = 0.01:

n.runs <- function(G, ES) { return(14.091 * G^(-0.640) * ES^(-1.986)) }

In the case discussed in Exercise 1 above, the numbers are:

n.runs(3, 0.25)
[1] 109.465

The same analysis using the exact function of the package pwr on power analysis (see Champely et al. 2016) is:

library(pwr)
pwr.anova.test(f = 0.25, k = 3, power = 0.95, sig.level = 0.01)

and yields n = 111.677.
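The comparison above can also be reproduced without the pwr package. The following is a minimal sketch in base R, assuming that the exact calculation amounts to solving for n in the power function of the one-way ANOVA F test with a noncentral F distribution; the helper names anova.power and n.exact and the search interval are illustrative assumptions, not part of the chapter or of pwr.

```r
# Approximation from the chapter (calibrated for power = 0.95, alpha = 0.01).
n.runs <- function(G, ES) {
  14.091 * G^(-0.640) * ES^(-1.986)
}

# Hypothetical helper: power of a one-way ANOVA F test with G groups,
# n runs per group and effect size f, from the noncentral F distribution.
anova.power <- function(n, G, f, alpha) {
  df1 <- G - 1
  df2 <- G * (n - 1)
  ncp <- G * n * f^2                        # noncentrality parameter
  crit <- qf(alpha, df1, df2, lower.tail = FALSE)
  pf(crit, df1, df2, ncp = ncp, lower.tail = FALSE)
}

# Hypothetical helper: (possibly fractional) n per group at which the
# power equals the target. The interval c(2, 1e4) is an assumption.
n.exact <- function(G, f, power = 0.95, alpha = 0.01) {
  uniroot(function(n) anova.power(n, G, f, alpha) - power,
          interval = c(2, 1e4), tol = 1e-8)$root
}

n.approx <- n.runs(3, 0.25)   # approximate number of runs
n.root   <- n.exact(3, 0.25)  # numerically solved number of runs
print(c(approx = n.approx, exact = n.root))
```

In practice one would round the result up, since the number of runs per configuration must be an integer.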

11.2.2 Effect Size for ANOVA vs OLS Regression

In the text we have used a one-way ANOVA test to estimate the number of runs, taking 1 − β = 0.95, α = 0.01 and a given effect size f. However, we then used regression analysis to study the differences between under-, correctly-, and over-powered models.

Since there is a transformation between the parameters of ANOVA and OLS regression, it is possible to connect the way effect size is calculated in the former to the way it is calculated in the latter.

As mentioned in the text of the chapter, the effect size for ANOVA is:

$$\displaystyle \begin{aligned}f = \sqrt{\frac{ n \sum_{j=1}^G \left(\bar{x}_{j}-\bar{x}\right)^2 }{ \sum_{j=1}^G \sum_{i=1}^n \left(x_{ij}-\bar{x}_{j}\right)^2 }} \end{aligned}$$

The quantity under the square root is the Sum of Squares Between (SSB) divided by the Sum of Squares Within (SSW) or, in Cohen’s terms, $f = \frac {\sigma _m}{\sigma }$ (Cohen 1992). The effect size for regression is, according to Cohen (1992), $f^2 = \frac {R^2}{1 - R^2}$. It is easy to demonstrate that:

$$\displaystyle \begin{aligned}f^2 = \frac{R^2}{1 - R^2} = \frac{\mathrm{SSB}}{\mathrm{SSR}} \end{aligned}$$

where the SSW in a one-way ANOVA is comparable to the Sum of Squares of Residuals (SSR) in an OLS regression with exactly the same dependent and independent variables.
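
The identity above can be checked in a few lines. The following is a sketch, assuming the regression includes an intercept and a full set of dummy variables for the G configurations, so that its fitted values are the group means; SST (the total sum of squares) is a symbol introduced here for the derivation and does not appear in the chapter's text. Under that assumption the explained sum of squares equals SSB and the residual sum of squares SSR equals SSW, so that:

$$\displaystyle \begin{aligned}\mathrm{SST} = \mathrm{SSB} + \mathrm{SSR}, \qquad R^2 = \frac{\mathrm{SSB}}{\mathrm{SST}} = 1 - \frac{\mathrm{SSR}}{\mathrm{SST}} \end{aligned}$$

and therefore:

$$\displaystyle \begin{aligned}\frac{R^2}{1 - R^2} = \frac{\mathrm{SSB}/\mathrm{SST}}{\mathrm{SSR}/\mathrm{SST}} = \frac{\mathrm{SSB}}{\mathrm{SSR}} = \frac{\mathrm{SSB}}{\mathrm{SSW}} \end{aligned}$$

which is the square of the ANOVA effect size f given above.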

About this chapter

Cite this chapter

Seri, R., Secchi, D. (2017). How Many Times Should One Run a Computational Simulation?. In: Edmonds, B., Meyer, R. (eds) Simulating Social Complexity. Understanding Complex Systems. Springer, Cham. https://doi.org/10.1007/978-3-319-66948-9_11

DOI: https://doi.org/10.1007/978-3-319-66948-9_11
Published: 26 November 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-66947-2
Online ISBN: 978-3-319-66948-9
eBook Packages: Computer Science, Computer Science (R0)
