Simulation input data modeling

Annals of Operations Research

Abstract

Input data modeling is a critical component of a successful simulation application. A perspective of the area is given with an emphasis on available probability distributions as models, estimation methods, model selection and discrimination, and goodness of fit. Three specific distribution classes (lambda, S_B, TES processes) are discussed in some detail to illustrate characteristics that favor input models. Regarding estimation, we argue for maximum likelihood estimation over method of moments and other matching schemes due to intrinsic superior properties (presuming a specific model) and the capability of accommodating messy data types. We conclude with a list of specific research problems and areas warranting additional attention.
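
The estimation argument can be illustrated with a small sketch. It is not taken from the article: it assumes an exponential input model and right-censored observations as one example of a "messy" data type, and the function names, sample size, rate, and cutoff are all hypothetical choices. The likelihood handles censoring directly, while a naive moment match on the recorded times does not.

```python
import random


def simulate_censored_exponential(n, rate, cutoff, seed=0):
    """Draw n exponential times and right-censor them at `cutoff`.

    Returns (recorded_time, fully_observed) pairs. Illustrative only.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.expovariate(rate)
        data.append((min(x, cutoff), x <= cutoff))
    return data


def mle_rate(data):
    """Maximum likelihood rate for exponential data with right censoring:
    number of fully observed events divided by total recorded time."""
    events = sum(1 for _, observed in data if observed)
    total_time = sum(t for t, _ in data)
    return events / total_time


def naive_moment_rate(data):
    """Moment match that ignores censoring: 1 / mean of recorded times."""
    mean_time = sum(t for t, _ in data) / len(data)
    return 1.0 / mean_time


# Assumed settings for the illustration: true rate 1.0, censoring at time 1.0.
data = simulate_censored_exponential(n=20000, rate=1.0, cutoff=1.0)
print("MLE rate estimate:         ", round(mle_rate(data), 3))
print("Naive moment rate estimate:", round(naive_moment_rate(data), 3))
```

Under these assumed settings the likelihood-based estimate should land near the true rate, while the naive moment match overstates it because censored times are treated as complete observations.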




Cite this article

Johnson, M.E., Mollaghasemi, M. Simulation input data modeling. Ann Oper Res 53, 47–75 (1994). https://doi.org/10.1007/BF02136826


  • DOI: https://doi.org/10.1007/BF02136826
