Abstract
Input data modeling is a critical component of a successful simulation application. A perspective on the area is given with an emphasis on available probability distributions as models, estimation methods, model selection and discrimination, and goodness of fit. Three specific distribution classes (lambda, S_B, and TES processes) are discussed in some detail to illustrate characteristics that favor input models. Regarding estimation, we argue for maximum likelihood estimation over the method of moments and other matching schemes because of its intrinsically superior properties (presuming a specific model) and its capability of accommodating messy data types. We conclude with a list of specific research problems and areas warranting additional attention.
Cite this article
Johnson, M.E., Mollaghasemi, M. Simulation input data modeling. Ann Oper Res 53, 47–75 (1994). https://doi.org/10.1007/BF02136826
DOI: https://doi.org/10.1007/BF02136826