Elsevier

Performance Evaluation

Volume 54, Issue 1, September 2003, Pages 1-32
Acyclic discrete phase type distributions: properties and a parameter estimation algorithm

https://doi.org/10.1016/S0166-5316(03)00044-0

Abstract

This paper provides a detailed study of discrete phase type (DPH) distributions and their acyclic subclass, referred to as acyclic-DPH (ADPH). Similarities and differences between DPH and continuous phase type (CPH) distributions that have not previously been considered are investigated, and minimal representations, called canonical forms, are provided for the subclass of ADPH distributions. We investigate the consequences of the recent result about the minimal coefficient of variation of the DPH class [The minimal coefficient of variation of discrete phase type distributions, in: Proceedings of the Third International Conference on Matrix-analytic Methods in Stochastic Models, July 2000] and show that below a given order (which is a function of the expected value) the minimal coefficient of variation of the DPH class is always less than the minimal coefficient of variation of the CPH class. Since all previously introduced phase type fitting methods were designed for fitting over the CPH class, we provide a DPH fitting method for the first time. The implementation of the DPH fitting algorithm is found to be simple and stable. The algorithm is tested on a benchmark consisting of 10 different continuous distributions. The error that results when a continuous distribution sampled at discrete points is fitted by a DPH is also considered.

Introduction

Discrete phase type (DPH) distributions were introduced and formalized in [10], but they have received little attention in applied stochastic modeling since then, because most research and application-oriented work has focused on continuous phase type (CPH) distributions [11].

However, in recent years renewed attention has been devoted to discrete models, since it has been observed that they can be utilized in the numerical solution of non-Markovian processes and that they are more closely related to physical observations [15], [16]. Moreover, new emphasis has been put on discrete stochastic Petri nets [5], [6], [17]. Finally, DPHs may have a wide range of applicability in stochastic models in which random times must be combined with constant durations. In fact, one of the most interesting properties of DPH distributions is that they can represent exactly a number of distributions with finite support, such as the deterministic and the (discrete) uniform, and hence distributions with finite and infinite support can be mixed within the same formalism.

In particular, while it is known that the minimal coefficient of variation for the CPH family depends only on the order n and is attained by the Erlang distribution [1] (cv² = 1/n), it is trivial to show, for the DPH family, that for any order n the deterministic distribution with cv=0 is a member of the family. Since the range of applicability of the PH distributions may depend on the range of variability of the coefficient of variation given the order n, it is interesting to investigate, for the DPH family, how the coefficient of variation depends on the model parameters.
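The contrast can be illustrated with a minimal sketch. It assumes a DPH built as a chain of n identical geometric phases (each phase ends at every step with probability p); this particular structure and the function names are illustrative choices, not taken from the paper. For the continuous Erlang distribution of order n the squared coefficient of variation is 1/n, whereas the discrete chain gives (1 − p)/n, which drops to 0 when p = 1 (the deterministic distribution).

```python
from fractions import Fraction


def geometric_chain_cv2(n, p):
    """Squared coefficient of variation of a sum of n i.i.d. geometric phases.

    Each phase lasts k >= 1 steps with probability (1 - p)**(k - 1) * p,
    so it has mean 1/p and variance (1 - p)/p**2 (hypothetical helper).
    """
    p = Fraction(p)
    mean = n / p
    variance = n * (1 - p) / p**2
    return variance / mean**2


def erlang_cv2(n):
    """Squared coefficient of variation of a continuous Erlang-n distribution."""
    return Fraction(1, n)


if __name__ == "__main__":
    n = 4
    for p in (Fraction(1, 4), Fraction(1, 2), Fraction(1)):
        print(f"p={p}: discrete chain cv^2={geometric_chain_cv2(n, p)}, "
              f"Erlang cv^2={erlang_cv2(n)}")
```

Exact rational arithmetic is used so that the deterministic case evaluates to exactly zero rather than to a rounding residue.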

The convenience of using the DPH family in applied stochastic modeling has motivated the present paper whose aim is to investigate more closely the properties of the DPH family and to provide results that can be profitably exploited for the implementation of an algorithm to estimate the model parameters given an assigned distribution or a set of experimental points [4].

The DPH representation of a given distribution function is, in general, non-unique [13] and non-minimal. Hence, we first explore a subclass of the DPH class for which the representation is an acyclic graph (acyclic-DPH, ADPH) and we show that, similarly to the continuous case [8], the ADPH class admits a unique minimal representation, called canonical form.

We recall the theorem about the minimal coefficient of variation of the DPH class as a function of the order and of the mean [18]. This theorem shows that below a given order (which is a function of the mean) the minimal coefficient of variation of the DPH class is always less than the minimal coefficient of variation of the CPH class. This result, combined with the well-known result of [1] (the minimal cv² for an n-phase CPH distribution is 1/n, independent of its mean), makes it possible to compare the applicability of the CPH and DPH families for fitting distributions with low cv.

An algorithm is presented for estimating the ADPH model parameters to fit distributions or a set of experimental data. The algorithm is based on the maximum likelihood (ML) principle. A z-transform version of the algorithm is derived from the continuous case [3], while a novel time domain version is provided. It is shown that the time domain algorithm is easier to implement and more stable. The algorithm is then tested on a benchmark of 10 different continuous distributions that have already been used for a similar study in the continuous case [4]. However, since a continuous distribution needs to be discretized in order to feed the fitting algorithm, the role of the discretization interval in the performance of the algorithm and in the goodness of the fit is extensively discussed.

The structure of the paper is as follows. Section 2 introduces the basic definitions and notation, and provides a simple example to emphasize some differences between the CPH and DPH classes, differences that are not evident from a comparative analysis reported for instance in [9]. Section 3 derives the canonical forms (and their main properties) for the class of acyclic-DPH (ADPH). Section 4 gives the theorem that describes the minimal coefficient of variation for the DPH class as a function of the order and of the mean, and shows the shape of the structures that realize the minimal coefficient of variation. Section 5 presents the ML estimation algorithm, both in the z-transform domain and in the time domain. Section 6 discusses the role of the discretization interval in the accuracy of the obtainable approximation, while Section 7 presents the results of the benchmark analysis. Finally, Section 8 concludes the paper.

Definition and notation

A DPH distribution [10], [11] is the distribution of the time until absorption in a discrete-state discrete-time Markov chain (DTMC) with n transient states and one absorbing state. (The case when n=∞ is not considered in this paper.) If the transient states are numbered 1,2,…,n and the absorbing state is numbered (n+1), the one-step transition probability matrix of the corresponding DTMC can be partitioned as $\hat{B} = \begin{bmatrix} B & b \\ 0 & 1 \end{bmatrix}$, where $B=[b_{ij}]$ is the (n×n) matrix grouping the transition probabilities
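As an illustration of this definition, the following sketch computes the probability mass function of the time to absorption directly from the partitioned matrix. It assumes an initial probability vector over the transient states (called `alpha` here) that sums to 1, and recovers the exit vector b from the row sums of B; the function name `dph_pmf` is hypothetical.

```python
def dph_pmf(alpha, B, k_max):
    """Return [P(X=1), ..., P(X=k_max)] for a DPH given alpha and B.

    alpha: initial probabilities of the n transient states (assumed to sum to 1).
    B:     n x n list of lists with the transient-to-transient probabilities.
    """
    n = len(alpha)
    # Exit vector b: probability of jumping to the absorbing state from state i.
    b = [1.0 - sum(B[i]) for i in range(n)]
    state = list(alpha)  # distribution over transient states after k-1 steps
    pmf = []
    for _ in range(k_max):
        # Absorption exactly at this step.
        pmf.append(sum(state[i] * b[i] for i in range(n)))
        # Advance one step among the transient states.
        state = [sum(state[i] * B[i][j] for i in range(n)) for j in range(n)]
    return pmf


if __name__ == "__main__":
    # A chain 1 -> 2 -> 3 -> absorption with all probabilities equal to 1
    # represents the deterministic distribution concentrated at k = 3.
    deterministic = dph_pmf([1.0, 0.0, 0.0],
                            [[0.0, 1.0, 0.0],
                             [0.0, 0.0, 1.0],
                             [0.0, 0.0, 0.0]], 4)
    print(deterministic)
    # A single state with a self-loop of probability 0.5 is geometric.
    print(dph_pmf([1.0], [[0.5]], 3))
```

The first usage example shows how a finite-support distribution is represented exactly with n = 3 phases.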

Acyclic-DPHs

Definition 1

A DPH is called acyclic-DPH (ADPH) if its states can be ordered in such a way that matrix B is an upper triangular matrix.
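Definition 1 can be checked mechanically. The sketch below is one possible way to do it, not the paper's procedure: it treats the positive off-diagonal entries of B as directed edges and applies a standard topological sort (Kahn's algorithm). Self-loops on the diagonal are ignored because they remain on the diagonal under any reordering. The function name `acyclic_order` and the tolerance parameter are assumptions.

```python
def acyclic_order(B, tol=0.0):
    """Return an ordering of the states that makes B upper triangular,
    or None if no such ordering exists (i.e. the DPH is not acyclic)."""
    n = len(B)
    # Count incoming edges i -> j (i != j) with positive probability.
    indegree = [0] * n
    for i in range(n):
        for j in range(n):
            if i != j and B[i][j] > tol:
                indegree[j] += 1
    ready = [i for i in range(n) if indegree[i] == 0]
    order = []
    while ready:
        i = ready.pop()
        order.append(i)
        for j in range(n):
            if i != j and B[i][j] > tol:
                indegree[j] -= 1
                if indegree[j] == 0:
                    ready.append(j)
    # If a cycle exists, some states never reach indegree 0.
    return order if len(order) == n else None


if __name__ == "__main__":
    # State 1 feeds state 0, so the order (1, 0) gives an upper triangular B.
    print(acyclic_order([[0.2, 0.0], [0.5, 0.3]]))
    # States 0 and 1 feed each other: not acyclic.
    print(acyclic_order([[0.0, 0.5], [0.5, 0.0]]))
```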

By Definition 1, a generic ADPH of order n is characterized by $N_F=(n^2+3n-2)/2$ free parameters (n(n+1)/2 in the upper triangular matrix B and n−1 in the initial probability vector α).
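The count is the sum of the two contributions, written out here as a short worked equation. The n−1 term is read as reflecting the normalization of α, assuming its n entries sum to 1.

```latex
N_F = \underbrace{\frac{n(n+1)}{2}}_{\text{upper triangle of } B}
    + \underbrace{(n-1)}_{\text{initial vector } \alpha}
    = \frac{n^2 + n + 2n - 2}{2}
    = \frac{n^2 + 3n - 2}{2}
```

For example, n = 3 gives 6 + 2 = 8 free parameters.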

Definition 1 implies that a state i can be directly connected to a state j only if j ≥ i. In an ADPH, each state is visited only once before absorption. We define an absorbing path, or simply a path

Comparing the minimal coefficient of variation for CPH and DPH

It has been shown in Section 2.1 that a deterministic distribution with cv=0 is a member of the DPH as well as the ADPH class (4), and moreover that the minimal cv depends on the mean. Since the flexibility in approximating a given distribution function may depend on the range of variability of the coefficient of variation, in this section we compare the CPH and DPH families from the point of view of the minimal coefficient of variation. For this purpose we recall the theorem that describes the minimal

A fitting algorithm for parameter estimation

We describe a fitting algorithm for estimating the parameters of an ADPH in CF1 form, based on the ML principle [3], [4]. We first derive the closed form expression for the pmf both in the z-transform domain and in the time domain, and for its derivatives with respect to the model parameters, then the implemented ML estimation algorithm is briefly sketched. The range of applicability of both techniques is finally discussed.
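The quantity that an ML procedure maximizes can be sketched as follows. This is only a time domain evaluation of the log-likelihood for a given parameter set, assuming a generic representation (α, B) rather than the CF1 form, and it does not reproduce the derivatives or the optimization loop of the paper's algorithm. The function name `dph_log_likelihood` is hypothetical.

```python
import math


def dph_log_likelihood(alpha, B, samples):
    """Log-likelihood of integer samples (each >= 1) under the DPH (alpha, B)."""
    n = len(alpha)
    # Exit probabilities to the absorbing state, from the row sums of B.
    b = [1.0 - sum(row) for row in B]
    state = list(alpha)
    pmf = {}
    for k in range(1, max(samples) + 1):
        pmf[k] = sum(state[i] * b[i] for i in range(n))
        state = [sum(state[i] * B[i][j] for i in range(n)) for j in range(n)]
    # A sample with zero probability makes the likelihood vanish.
    if any(pmf[k] <= 0.0 for k in samples):
        return float("-inf")
    return sum(math.log(pmf[k]) for k in samples)


if __name__ == "__main__":
    data = [1, 2, 3]
    # Order-1 DPH (geometric): the self-loop probability is the only parameter.
    for exit_prob in (0.5, 0.9):
        ll = dph_log_likelihood([1.0], [[1.0 - exit_prob]], data)
        print(f"exit probability {exit_prob}: log-likelihood {ll:.4f}")
```

In the usage example the sample mean is 2, so the exit probability 0.5 yields the larger log-likelihood of the two candidates, as expected for a geometric distribution.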

Approximating continuous distributions

When using ADPH distributions to approximate random variables arising in practical problems, there are cases in which a discrete sample of data points is directly derived from the application. But there are also cases in which the distributions to be approximated are not discrete. For example, ADPH distributions can be utilized to approximate continuous distributions.

The ADPH approximation of a continuous distribution requires two steps:

  • (1) The distribution is discretized according to a given
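The discretization step can be sketched as below. The rule used here, assigning to step k the probability mass of the interval ((k−1)δ, kδ], is one common choice and is an assumption; the snippet above does not state which rule the paper adopts. The names `discretize_cdf`, `exponential_cdf` and `delta` are likewise illustrative.

```python
import math


def discretize_cdf(cdf, delta, k_max):
    """Return [p_1, ..., p_k_max] with p_k = F(k*delta) - F((k-1)*delta)."""
    return [cdf(k * delta) - cdf((k - 1) * delta) for k in range(1, k_max + 1)]


def exponential_cdf(rate):
    """CDF of an exponential distribution with the given rate."""
    return lambda t: 1.0 - math.exp(-rate * t)


if __name__ == "__main__":
    delta = 0.5
    pmf = discretize_cdf(exponential_cdf(1.0), delta, 50)
    print("mass captured:", sum(pmf))
    # Successive masses shrink by the constant factor exp(-rate * delta),
    # so the discretized exponential is geometric (an order-1 DPH).
    print("ratio p_2/p_1:", pmf[1] / pmf[0], "expected:", math.exp(-delta))
```

A smaller `delta` follows the continuous shape more closely but requires more steps, `k_max`, to cover the same time range.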

Examples for the estimation process

This section reports the results of the numerical experiments that have been carried out to test the goodness of fit of the proposed ML fitting algorithm. The experiments are based on a benchmark (composed of continuous distributions only) already proposed in [4] to test the goodness of fit of algorithms for CPH distributions (the origin and the motivations behind the proposed benchmark are discussed in [4]). Hence, the present results allow us to compare the features of the discrete and the

Conclusion

Some previously not considered properties of the DPH distributions, which are essential for DPH fitting, are investigated and compared with the known properties of the CPH distributions. Similarly to the continuous family, acyclic-DPH distributions admit a minimal representation called canonical form. Resorting to the canonical form, we have investigated the dependence of the minimal squared coefficient of variation on the mean and on the order, and we have established the conditions for which

Acknowledgements

This work was partially supported by Hungarian Scientific Research Fund (OTKA) under Grant No. T-34972.

References (18)

  • A. Cumani, On the canonical representation of homogeneous Markov processes modelling failure-time distributions, Microelectr. Reliab. (1982)
  • D. Aldous et al., The least variable phase type distribution is Erlang, Stoch. Mod. (1987)
  • D. Assaf et al., Closure of phase type distributions under operations arising in reliability theory, Ann. Probab. (1982)
  • A. Bobbio, A. Cumani, ML estimation of the parameters of a PH distribution in triangular canonical form, in: G. Balbo,...
  • A. Bobbio et al., A benchmark for PH estimation algorithms: results for acyclic-PH, Stoch. Mod. (1994)
  • G. Ciardo, Discrete-time markovian stochastic Petri nets, in: Proceedings of the Second International Workshop on...
  • G. Ciardo, R. Zijal, Discrete deterministic and stochastic Petri nets, ICASE Technical Report, No. 96-72, NASA, 1996,...
  • C. Commault et al., An invariant of representations of phase-type distributions and some applications, J. Appl. Probab. (1996)
  • R.S. Maier, The algebraic construction of phase-type distributions, Stoch. Mod. (1991)
There are more references available in the full text version of this article.

A. Bobbio graduated in nuclear engineering from Politecnico di Torino in 1969. In 1971 he joined the Istituto Elettrotecnico Nazionale Galileo Ferraris di Torino, where he was involved in activities connected with the organization of a “Reliability Club” (Circolo dell’Affidabilità). In 1992, he became Associate Professor at the Department of “Elettronica per l’Automazione” of the University of Brescia, and in 1995 he moved to the Department of Informatica of the University of Torino. In 2000, he became full professor at the Università del Piemonte Orientale of Alessandria, Italy. His activity is mainly focused on the modeling and analysis of the performance and reliability of stochastic systems, with particular emphasis on Markov chains and stochastic Petri nets. He has spent various research periods at the Department of Computer Science of Duke University (Durham NC, USA), at the Technical University of Budapest and at the Department of Computer Science and Engineering at the Indian Institute of Technology in Kanpur (India). He has been principal investigator and leader of research groups in various research projects with public and private institutions. He is the author of several papers in international journals as well as communications to international conferences and workshops.

A. Horváth was born in 1974 in Budapest, where he received the M.Sc. degree in computer science from the University of Technology and Economics. From 1998 to 2002 he was a Ph.D. student supervised by Miklós Telek at the same university. Since 2003 he has been a researcher at the University of Turin (Italy). His research interests are in the area of stochastic processes, including performance analysis of non-Markovian systems and modeling issues of communication networks.

M. Scarpa received his degree in computer engineering in 1994 from the University of Catania, Italy, and the Ph.D. degree in computer science in 2000 from the University of Turin, Italy. His main research interests during the Ph.D. period included the study of stochastic Petri nets with generally distributed firing times in the presence of different firing policies, and algorithms for their numerical solution. During 2000–2001 he was with the Faculty of Engineering of Catania University as Assistant Professor in “Fundamentals of Computer Science”. Currently he is Assistant Professor of computer engineering at Messina University. He coordinated the development of the software package WEBSPN, a Web-accessible tool to solve stochastic Petri nets with non-exponentially distributed firing time transitions. His interests include performance and reliability modelling of distributed and real-time systems, phase type distributions, communication protocol modelling and parallel algorithms for the solution of stochastic models. He has been involved in several national projects.

M. Telek received the M.Sc. degree in electrical engineering from the Technical University of Budapest in 1987. After graduation he joined the Hungarian Post Research Institute, where he studied the modelling, analysis and planning aspects of communication networks. Since 1990 he has been with the Department of Telecommunications of the Technical University of Budapest, where he is now an Associate Professor. He received the candidate of science degree from the Hungarian Academy of Science in 1995. His current research interests include stochastic performance modeling and analysis of computer and communication systems.

This work has been performed under a cooperative visiting exchange program supported by the Italian Ministry of Foreign Affairs and the Hungarian Ministry of Education (TÉT).
