Classification using sequential order statistics

  • Regular Article
Advances in Data Analysis and Classification

Abstract

Whereas discrimination methods and their error probabilities have been broadly investigated for common data distributions such as the multivariate normal or t-distributions, this paper considers the case when the recorded data are assumed to be observations from sequential order statistics. Random vectors of sequential order statistics describe, e.g., successive failures in a k-out-of-n system or in other coherent and load-sharing systems, allowing for changes of the underlying lifetime distributions caused by component failures. Within this framework, the Bayesian two-class discrimination approach with known prior probabilities and class parameters is considered, and exact and asymptotic formulas for the error probabilities in terms of Erlang and hypoexponential distributions are derived. Since the Bayesian classifier is closely related to Kullback–Leibler’s information distance, this approach is extended by invoking other divergence measures such as Jeffreys’ and Rényi’s distances. While exact formulas for the misclassification rates of the resulting distance-based classifiers are not available, inequalities among the corresponding error probabilities are derived. The performance of the applied classifiers is illustrated by some simulation results.

References

  • Amari SV, Misra RB (1997) Closed-form expressions for distribution of sum of exponential random variables. IEEE Trans Reliab 46(4):519–522

  • Anderson TW (2003) An introduction to multivariate statistical analysis, 3rd edn. Wiley, Hoboken

  • Balakrishnan N, Beutner E, Kamps U (2011) Modeling parameters of a load-sharing system through link functions in sequential order statistics models and associated inference. IEEE Trans Reliab 60:605–611

  • Balakrishnan N, Beutner E, Kamps U (2012) A sequential order statistics approach to step-stress testing. Ann Inst Stat Math 64:303–318

  • Bedbur S (2011) Models of ordered random variables and exponential families. PhD Thesis, RWTH Aachen University, Germany

  • Bedbur S, Beutner E, Kamps U (2012) Generalized order statistics: an exponential family in model parameters. Statistics 46(2):159–166

  • Bedbur S, Beutner E, Kamps U (2014) Multivariate testing and model-checking for generalized order statistics with applications. Statistics 48(6):1297–1310

  • Bedbur S, Johnen M, Kamps U (2019) Inference from multiple samples of Weibull sequential order statistics. J Multivar Anal 169:381–399

  • Beutner E (2008) Nonparametric inference for sequential \(k\)-out-of-\(n\) systems. Ann Inst Stat Math 60:605–626

  • Beutner E, Kamps U (2009) Order restricted statistical inference for scale parameters based on sequential order statistics. J Stat Plan Inference 139:2963–2969

  • Bickel PJ, Doksum KA (2001) Mathematical statistics: basic ideas and selected topics, vol 1, 2nd edn. Prentice Hall, Upper Saddle River

  • Burkschat M (2009) Systems with failure-dependent lifetimes of components. J Appl Probab 46:1052–1072

  • Burkschat M, Navarro J (2013) Dynamic signatures of coherent systems based on sequential order statistics. J Appl Probab 50:272–287

  • Cacoullos T (1965) Comparing Mahalanobis distances I: comparing distances between \(k\) known normal populations and another unknown. Sankhyā: Indian J Stat Ser A 27:1–22

  • Cacoullos T, Koutras M (1985) Minimum-distance discrimination for spherical distributions. In: Matusita K (ed) Statistical Theory and Data Analysis. North-Holland, Amsterdam, pp 91–102

  • Cacoullos T, Koutras M (1997) On the performance of minimum-distance classification rules for Kotz-type elliptical distributions. In: Advances in the theory and practice of statistics: a volume in Honour of Samuel Kotz, pp 209–224

  • Chen J, Rubin H (1986) Bounds for the difference between median and mean of gamma and Poisson distributions. Stat Probab Lett 4(6):281–283

  • Cox DR (1962) Renewal theory. Wiley, New York

  • Cramer E, Kamps U (1996) Sequential order statistics and \(k\)-out-of-\(n\) systems with sequentially adjusted failure rates. Ann Inst Stat Math 48(3):535–549

  • Cramer E, Kamps U (2001) Sequential \(k\)-out-of-\(n\) systems. In: Balakrishnan N, Rao CR (eds) Advances in reliability, handbook of statistics, vol 20. Elsevier, Amsterdam, pp 301–372

  • Cuadras CM, Fortiana J, Oliva F (1997) The proximity of an individual to a population with applications in discriminant analysis. J Classif 14(1):117–136

  • Das Gupta S (1973) Theories and methods in classification: a review. In: Cacoullos T (ed) Discriminant analysis and applications. Academic Press, New York, pp 77–137

  • Huzurbazar VS (1955) Exact forms of some invariants for distributions admitting sufficient statistics. Biometrika 42(3):533–537

  • Kailath T (1967) The divergence and Bhattacharyya distance measures in signal selection. IEEE Trans Commun Technol 15(1):52–60

  • Kamps U (1995) A concept of generalized order statistics. J Stat Plan Inference 48(1):1–23

  • Kamps U (2016) Generalized order statistics. In: Balakrishnan N, Brandimarte P, Everitt B, Molenberghs G, Piegorsch W, Ruggeri F (eds) Wiley statsref: statistics reference online. Wiley, Chichester, pp 1–12

  • Katzur A (2015) Classification and discrimination in models for ordered data. PhD Thesis, RWTH Aachen University, Germany

  • Katzur A, Kamps U (2016) Classification into Kullback–Leibler balls in exponential families. J Multivar Anal 150:75–90

  • Koutras M (1992) Minimum distance discrimination rules and success rates for elliptical normal mixtures. Stat Probab Lett 13(4):259–268

  • Kullback S (1959) Information theory and statistics. Wiley, New York

  • Kupperman M (1957) Further applications of information theory to multivariate analysis and statistical inference. PhD thesis, Graduate Council of George Washington University

  • Kupperman M (1958) Probabilities of hypotheses and information-statistics in sampling from exponential-class populations. Ann Math Stat 29(2):571–575

  • Matusita K (1966) A distance and related statistics in multivariate analysis. In: Krishnaiah PR (ed) Multivariate analysis. Academic Press, New York, pp 187–200

  • Matusita K (1971) Some properties of affinity and applications. Ann Inst Stat Math 23(1):137–155

  • Matusita K (1973) Discrimination and the affinity of distributions. In: Cacoullos T (ed) Discriminant analysis and applications. Academic Press, New York, pp 213–223

  • McLachlan GJ (1992) Discriminant analysis and statistical pattern recognition. Wiley, New York

  • Mitrinović DS, Vasić PM (1970) Analytic inequalities. Springer, Berlin

  • Navarro J, Burkschat M (2011) Coherent systems based on sequential order statistics. Naval Res Logist 58:123–135

  • Pardo L (2006) Statistical inference based on divergence measures. Chapman & Hall/CRC, Boca Raton

  • Rényi A (1961) On measures of entropy and information. In: Proceedings 4th Berkeley symposium on mathematical statistics and probability, pp 547–561

  • Ross SM (2011) Introduction to probability models, 10th edn. Academic Press, San Diego

  • Salicrú M, Morales D, Menéndez ML, Pardo L (1994) On the applications of divergence type measures in testing statistical hypotheses. J Multivar Anal 51(2):372–391

  • Scheuer EM (1988) Reliability of an \(m\)-out-of-\(n\) system when component failure induces higher failure rates in survivors. IEEE Trans Reliab 37(1):73–74

  • Shaked M, Shanthikumar J (2007) Stochastic orders. Springer, New York

  • Shao J (2007) Mathematical statistics, 2nd edn. Springer, New York (Corr. printing as of 4th printing)

  • Van Belle G, Ahmad IA (1974) Measuring affinity of distributions. In: Proschan F, Serfling RJ (eds) Reliability and biometry: statistical analysis of lifelength. SIAM, Philadelphia, pp 651–668

  • Vuong QN, Bedbur S, Kamps U (2013) Distances between models of generalized order statistics. J Multivar Anal 118:24–36

  • Welch BL (1939) Note on discriminant functions. Biometrika 31:218–220

  • Zhuang W, Hu T (2007) Multivariate stochastic comparisons of sequential order statistics. Probab Eng Inf Sci 21:47–66

Acknowledgements

The research of the first author was supported by an RFwN grant of the RWTH Aachen University. The authors would like to express their sincere thanks to Hans-Hermann Bock and an associate editor for their valuable comments and suggestions as well as to the reviewers for their careful reading and their helpful comments.

Author information

Corresponding author

Correspondence to Udo Kamps.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Hypoexponential Distribution

In the literature, the distribution of a sum of independent exponentially distributed random variables is considered in a variety of applications. The name hypoexponential distribution is used, e.g., in Ross (2011).

Definition A.1

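(Hypoexponential Distribution) The hypoexponential distribution with shape parameters \(s_1, \dots , s_m \in {\mathbb {N}}\) and pairwise distinct rate parameters \(\beta _1, \dots , \beta _m > 0\) (notation: \(HExp(s_1, \dots , s_m, \beta _1, \dots , \beta _m ) \)) has Lebesgue density function:

$$\begin{aligned} f(x) = {\left\{ \begin{array}{ll} 0, \quad x \le 0 \\ \prod \nolimits _{j=1}^m \beta _j^{s_j} \sum \limits _{k=1}^m \sum \limits _{l=1}^{s_k} \frac{ \varPhi _{k,l}( - \beta _k ) }{ ( s_k - l )! ~ ( l - 1 )! } x^{ s_k - l } \exp ( - \beta _k x ), \quad x>0 \end{array}\right. } \end{aligned}$$

and distribution function: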

$$\begin{aligned} F(x) = {\left\{ \begin{array}{ll} 0, \quad x \le 0 \\ 1 - \prod \nolimits _{j=1}^m \beta _j^{s_j} \sum \limits _{k=1}^m \sum \limits _{l=1}^{s_k} \frac{ \varPhi _{k,l}( - \beta _k ) }{ ( l - 1 )! ~ \beta _k^{ s_k - l + 1 } } \sum \nolimits _{ j = 0 }^{ s_k - l } \frac{ \exp ( - \beta _k x ) (\beta _k x) ^j }{ j! }, \quad x>0 \end{array}\right. } \end{aligned}$$

where \(\varPhi _{k,l} (t) := \frac{ d^{ l-1 } }{ dt^{ l-1 } } \underset{ j \ne k }{ \prod \limits _{ j=1 } ^ m } ( \beta _j + t )^{ -s_j }\).
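As a minimal sketch (not part of the original article), the distribution function can be evaluated directly in the special case where all shape parameters equal 1. Then only \(l = 1\) occurs, \(\varPhi _{k,1}(-\beta _k) = \prod _{j \ne k} (\beta _j - \beta _k)^{-1}\), and the formula reduces to \(F(x) = 1 - \sum _{k=1}^m \prod _{j \ne k} \frac{\beta _j}{\beta _j - \beta _k} \exp (-\beta _k x)\) for \(x > 0\). The function name `hypoexp_cdf_unit_shapes` is a hypothetical helper chosen for illustration.

```python
from math import exp, prod


def hypoexp_cdf_unit_shapes(x, rates):
    """CDF of HExp(1, ..., 1, beta_1, ..., beta_m) at x (all shapes equal to 1).

    Assumes pairwise distinct, positive rates, as in Definition A.1.
    """
    if len(set(rates)) != len(rates):
        raise ValueError("rates must be pairwise distinct")
    if x <= 0:
        return 0.0
    total = 0.0
    for k, beta_k in enumerate(rates):
        # beta_k-independent part of prod_j beta_j * Phi_{k,1}(-beta_k) / beta_k
        weight = prod(
            beta_j / (beta_j - beta_k)
            for j, beta_j in enumerate(rates)
            if j != k
        )
        total += weight * exp(-beta_k * x)
    return 1.0 - total
```

For rates 1 and 2 the result coincides with \((1 - e^{-x})^2\), the distribution function of the maximum of two independent standard exponential variables, which gives a quick plausibility check.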

Theorem A.6

Let \(X_1, \dots , X_m\) be independent Erlang distributed random variables with shape parameters \(s_i \in {\mathbb {N}}\), \(1 \le i \le m\), and pairwise distinct rate parameters \(\beta _1 , \dots , \beta _m > 0\), respectively. Then the sum \( \mathcal {X} := \sum \nolimits _{i=1}^m X_i\) follows a hypoexponential distribution with shape parameters \(s_1, \dots , s_m\) and rate parameters \(\beta _1, \dots , \beta _m\).

Proof

See Scheuer (1988). \(\square \)
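The statement of the theorem can be checked numerically. The following sketch, which is an illustration and not taken from the article, simulates \(X_1 + X_2\) with \(X_1\) Erlang distributed with shape 2 and rate 1 and \(X_2\) exponentially distributed with rate 2, and compares the empirical distribution function with \(F(x) = 1 - 2x e^{-x} - e^{-2x}\), which is what Definition A.1 yields for \(HExp(2, 1, 1, 2)\) when worked out by hand. All function names, the sample size and the seed are assumptions made for this example.

```python
import random
from math import exp


def sample_sum(shapes, rates, rng):
    """Draw one realisation of X_1 + ... + X_m with X_i ~ Erlang(s_i, beta_i)."""
    # random.gammavariate takes (shape, scale); the scale is 1 / rate.
    return sum(rng.gammavariate(s, 1.0 / b) for s, b in zip(shapes, rates))


def empirical_cdf(x, shapes, rates, n=100_000, seed=0):
    """Fraction of n simulated sums that are at most x."""
    rng = random.Random(seed)
    hits = sum(sample_sum(shapes, rates, rng) <= x for _ in range(n))
    return hits / n


def closed_form_cdf(x):
    """CDF of HExp(2, 1, 1, 2), derived by hand from Definition A.1."""
    if x <= 0:
        return 0.0
    return 1.0 - 2.0 * x * exp(-x) - exp(-2.0 * x)
```

With 100,000 draws the Monte Carlo standard error is of order 0.002, so the two values should agree to roughly two decimal places.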

Lemma A.8

  (i) The cdf of a hypoexponential distribution with shape parameters \(s_1, \dots , s_m \in {\mathbb {N}}\) and pairwise distinct rate parameters \(\beta _1, \dots , \beta _m > 0\) is given by:

    $$\begin{aligned} F(x) = {\left\{ \begin{array}{ll} 0, \quad x \le 0 \\ 1 - \prod \nolimits _{j=1}^m \beta _j^{s_j} \sum \nolimits _{k=1}^m \sum \nolimits _{l=1}^{s_k} \frac{ \varPsi _{k,l}( - \beta _k ) x^{s_k - l } }{ (s_k - l)! ( l - 1 )! } \exp ( - \beta _k x ), \quad x>0 \end{array}\right. } \end{aligned}$$

    where \(\varPsi _{k,l} ( t ) = - \frac{ d^{l-1} }{ dt^{l-1} } \underset{ j \ne k }{ \prod \nolimits _{j=0}^m } ( \beta _j + t ) ^{ -s_j } \), \(s_0 := 1\) and \(\beta _0 := 0\).

  (ii) \(\varPhi _{k,l} ( t ) = ( -1 )^{l-1} ( l-1 )! \sum \nolimits _{(i_1, \dots , i_m) \in \varOmega _{\varPhi ,k}} \underset{ j \ne k }{\prod \nolimits _{j=1}^m } \left( {\begin{array}{c} i_j + s_j - 1 \\ i_j \end{array}}\right) \tau _j\)

    where \(\tau _j := ( \beta _j + t )^{ -( s_j + i_j ) }\) and \(\varOmega _{\varPhi ,k} := \bigg \{ \varvec{i} \in {\mathbb {N}}_0^m:\underset{ j \ne k }{ \sum \nolimits _{j = 1} ^m } i_j = l - 1,~ i_k = 0\bigg \}\).

  (iii) \(\varPsi _{k,l} ( t ) = ( -1 )^{l} ( l-1 )! \sum \nolimits _{(i_0, \dots , i_m) \in \varOmega _{\varPsi ,k }} \underset{ j \ne k }{\prod \nolimits _{j=0}^m} \left( {\begin{array}{c} i_j + s_j - 1 \\ i_j \end{array}}\right) \tau _j\)

    where \(\tau _j := ( \beta _j + t )^{ -( s_j + i_j ) }\) and \(\varOmega _{\varPsi ,k} := \bigg \{ \varvec{i} \in {\mathbb {N}}_0^{m+1}:\underset{ j \ne k }{ \sum \nolimits _{j = 0} ^m } i_j = l - 1,~ i_k = 0\bigg \}\).

Proof

See Amari and Misra (1997). \(\square \)
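The lemma lends itself to a direct implementation. The sketch below is an illustration under the stated definitions, not the authors' code: it computes the derivative of the product by the multi-index sum of part (ii), obtains \(\varPsi _{k,l}\) from its definition in part (i) by prepending the extra factor with \(s_0 = 1\) and \(\beta _0 = 0\), and evaluates the distribution function once in the \(\varPhi \)-form of Definition A.1 and once in the \(\varPsi \)-form of part (i). All function names are hypothetical, and the index \(k\) is 0-based in the code.

```python
from math import comb, exp, factorial, prod


def compositions(total, parts):
    """Yield all tuples of `parts` non-negative integers that sum to `total`."""
    if parts == 0:
        if total == 0:
            yield ()
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def product_derivative(k, l, t, shapes, rates):
    """(l-1)-th derivative at t of prod_{j != k} (rates[j] + t) ** (-shapes[j])."""
    others = [j for j in range(len(rates)) if j != k]
    total = 0.0
    for idx in compositions(l - 1, len(others)):
        term = 1.0
        for i_j, j in zip(idx, others):
            term *= comb(i_j + shapes[j] - 1, i_j) * (rates[j] + t) ** (-(shapes[j] + i_j))
        total += term
    return (-1) ** (l - 1) * factorial(l - 1) * total


def phi(k, l, t, shapes, rates):
    """Phi_{k,l}(t) as in Definition A.1."""
    return product_derivative(k, l, t, shapes, rates)


def psi(k, l, t, shapes, rates):
    """Psi_{k,l}(t): minus the derivative, with the extra factor s_0 = 1, beta_0 = 0."""
    # Prepending the j = 0 factor shifts the excluded index k by one.
    return -product_derivative(k + 1, l, t, [1] + list(shapes), [0.0] + list(rates))


def cdf_phi_form(x, shapes, rates):
    """Distribution function in the form of Definition A.1."""
    if x <= 0:
        return 0.0
    acc = 0.0
    for k, (s_k, b_k) in enumerate(zip(shapes, rates)):
        for l in range(1, s_k + 1):
            coeff = phi(k, l, -b_k, shapes, rates) / (factorial(l - 1) * b_k ** (s_k - l + 1))
            tail = sum(exp(-b_k * x) * (b_k * x) ** j / factorial(j) for j in range(s_k - l + 1))
            acc += coeff * tail
    return 1.0 - prod(b ** s for s, b in zip(shapes, rates)) * acc


def cdf_psi_form(x, shapes, rates):
    """Distribution function in the form of Lemma A.8 (i)."""
    if x <= 0:
        return 0.0
    acc = 0.0
    for k, (s_k, b_k) in enumerate(zip(shapes, rates)):
        for l in range(1, s_k + 1):
            coeff = psi(k, l, -b_k, shapes, rates) / (factorial(s_k - l) * factorial(l - 1))
            acc += coeff * x ** (s_k - l) * exp(-b_k * x)
    return 1.0 - prod(b ** s for s, b in zip(shapes, rates)) * acc
```

Both forms should return the same value up to rounding. The sums involve terms of alternating sign, so for rates that lie close together a floating-point evaluation of this kind can lose accuracy.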

Lemma A.9

Let \(\mathcal {Z}^+\) and \(\mathcal {Z}^-\) be independent hypoexponential random variables with shape parameters \(s_1^+, \dots , s_m^+\) and \(s_1^-, \dots , s_n^-\) and rate parameters \(\beta _1^+, \dots , \beta _m^+\) and \(\beta _1^-, \dots , \beta _n^-\), respectively. Then the pdf of \(\mathcal {Z}^+ - \mathcal {Z}^-\) is given by

$$\begin{aligned}&f^{\mathcal {Z}^+ - \mathcal {Z}^-} ( t ) = {\left\{ \begin{array}{ll} B^+ B^- \sum \limits _{i = 1}^m \sum \limits _{ j = 1 }^{ s_i^+ } \sum \limits _{k = 1}^n \sum \limits _{ l = 1 }^{ s_k^- } \frac{ \varPhi ^+_{i,j}( - \beta _i^+ ) \varPhi ^-_{k,l}( - \beta _k^- ) }{ ( j - 1 )! ( l - 1 )! } e^{ - \beta _i^+ t }\\ \times \sum \limits _{ r = 0 } ^{ s_i^+ - j } \left( {\begin{array}{c} s_k^- - l + r \\ r \end{array}}\right) \frac{ 1 }{ ( s_i^+ - j - r )! } t^{ s_i ^+ - j - r } ( \beta _i^+ + \beta _k^- )^{ l - s_k^- - r - 1 } , ~&{}t \ge 0 \\ B^+ B^- \sum \limits _{i = 1}^m \sum \limits _{ j = 1 }^{ s_i^+ } \sum \limits _{k = 1}^n \sum \limits _{ l = 1 }^{ s_k^- } \frac{ \varPhi ^+_{i,j}( - \beta _i^+ ) \varPhi ^-_{k,l}( - \beta _k^- ) }{ ( s_i^+ - j )! ( j - 1 )! ( s_k^- - l )! ( l - 1 )! } e^{ \beta _k^- t }\\ \times \sum \limits _{ r = 0 } ^{ s_i^+ - j } \left( {\begin{array}{c} s_i^+ - j \\ r \end{array}}\right) t^{ s_i ^+ - j - r } \\ \times \sum \limits _{ p = 0 }^{ s_k^- - l + r } \frac{ ( s_k^- - l + r )! }{ ( s_k^- - l + r - p )! } ( \beta _i^+ + \beta _k^- )^{ - ( p + 1 ) } ( - t ) ^ {s_k^- - l + r - p} ,~&{} t < 0 \\ \end{array}\right. } \end{aligned}$$

where \(B^+ = \prod \nolimits _{j=1}^m { \left( \beta _j^+ \right) }^{s_j^+}\) and \(B^- = \prod \nolimits _{j=1}^n { \left( \beta _j^- \right) }^{s_j^-}\).

Proof

See Katzur (2015), Lemma A.9. \(\square \)
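A term-by-term evaluation of this density might look as follows. This is a hedged sketch rather than the authors' implementation: \(\varPhi ^{\pm }_{k,l}\) is computed with the multi-index sum of Lemma A.8 (ii), the four outer sums are plain loops, and the names `compositions`, `phi` and `diff_pdf` are hypothetical. Indices \(i\) and \(k\) are 0-based in the code.

```python
from math import comb, exp, factorial, prod


def compositions(total, parts):
    """Yield all tuples of `parts` non-negative integers that sum to `total`."""
    if parts == 0:
        if total == 0:
            yield ()
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def phi(k, l, t, shapes, rates):
    """Phi_{k,l}(t), evaluated via the sum in Lemma A.8 (ii)."""
    others = [j for j in range(len(rates)) if j != k]
    total = 0.0
    for idx in compositions(l - 1, len(others)):
        term = 1.0
        for i_j, j in zip(idx, others):
            term *= comb(i_j + shapes[j] - 1, i_j) * (rates[j] + t) ** (-(shapes[j] + i_j))
        total += term
    return (-1) ** (l - 1) * factorial(l - 1) * total


def diff_pdf(t, shapes_p, rates_p, shapes_m, rates_m):
    """Density of Z+ - Z- at t, summed term by term as in Lemma A.9."""
    b_plus = prod(rate ** s for s, rate in zip(shapes_p, rates_p))
    b_minus = prod(rate ** s for s, rate in zip(shapes_m, rates_m))
    total = 0.0
    for i, (s_i, beta_i) in enumerate(zip(shapes_p, rates_p)):
        for j in range(1, s_i + 1):
            for k, (s_k, beta_k) in enumerate(zip(shapes_m, rates_m)):
                for l in range(1, s_k + 1):
                    a, b = s_i - j, s_k - l  # s_i^+ - j and s_k^- - l
                    c = beta_i + beta_k
                    weight = (phi(i, j, -beta_i, shapes_p, rates_p)
                              * phi(k, l, -beta_k, shapes_m, rates_m)
                              / (factorial(j - 1) * factorial(l - 1)))
                    if t >= 0:
                        inner = sum(comb(b + r, r) / factorial(a - r)
                                    * t ** (a - r) * c ** (-(b + r + 1))
                                    for r in range(a + 1))
                        total += weight * exp(-beta_i * t) * inner
                    else:
                        inner = sum(comb(a, r) * t ** (a - r)
                                    * sum(factorial(b + r) / factorial(b + r - p)
                                          * c ** (-(p + 1)) * (-t) ** (b + r - p)
                                          for p in range(b + r + 1))
                                    for r in range(a + 1))
                        total += (weight / (factorial(a) * factorial(b))
                                  * exp(beta_k * t) * inner)
    return b_plus * b_minus * total
```

For two exponential variables with rates \(a\) and \(b\) the function reduces to the asymmetric Laplace density \(\frac{ab}{a+b} e^{-at}\) for \(t \ge 0\) and \(\frac{ab}{a+b} e^{bt}\) for \(t < 0\), which serves as a simple consistency check.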

Cite this article

Katzur, A., Kamps, U. Classification using sequential order statistics. Adv Data Anal Classif 14, 201–230 (2020). https://doi.org/10.1007/s11634-019-00368-5

  • DOI: https://doi.org/10.1007/s11634-019-00368-5
