A zero-inflated Poisson mixed model to analyze diagnosis related groups with majority of same-day hospital stays

https://doi.org/10.1016/S0169-2607(01)00171-7

Abstract

With the increasing trend toward same-day procedures and operations for hospital admissions, it is important to analyze those Diagnosis Related Groups (DRGs) that consist mainly of same-day separations. A zero-inflated Poisson (ZIP) mixed model is presented to identify health- and patient-related characteristics associated with length of stay (LOS) and to model variations in LOS within such DRGs. Random effects are introduced to account for inter-hospital variations and the dependence of clustered LOS observations via the generalized linear mixed models (GLMM) approach. Parameter estimation is achieved by maximizing an appropriate log-likelihood function using the EM algorithm to obtain approximate residual maximum likelihood (REML) estimates. An S-Plus macro is developed to provide a unified ZIP modeling approach. Identifying the pertinent factors would help hospital administrators and clinicians manage LOS and expenditures efficiently.

Introduction

As in many countries, the number of patients seeking treatment in Australian hospitals continues to rise, with 5.7 million episodes of admitted patient care recorded in 1998/99, up 3.1% on the previous year. The increase in patient throughput compensates for the continuing decline in the average length of stay (ALOS), from 4.3 days in 1993/94 to 3.9 days in 1998/99 [1]. According to findings published in the Australian Hospital Statistics, a major contributor to the shorter ALOS was an increased number of admitted patients treated on a same-day basis, that is, admitted and separated on the same date. Indeed, the proportion of same-day separations almost doubled in 10 years, from 25% in 1989/90 to 48% in 1998/99. However, for patients staying at least one night, ALOS has fallen more slowly over recent years.

Advances in medical and communication technologies and clinical practice have generally enabled health services to be provided more effectively and have improved outcomes for patients: earlier diagnosis, less pain, more care in the community and faster recovery times. In particular, increases in non-invasive procedures, improved diagnostic technology and improved anaesthetics and drugs have contributed to the decline in length of hospital stay and an increasing proportion of same-day services. The focus of this paper is therefore on Diagnosis Related Groups (DRGs) that comprise mainly same-day separations.

DRGs are a patient classification scheme that provides a clinically meaningful way of relating the number and type of patients treated in a hospital (i.e. its casemix) to the resources required by the hospital. The classification categorizes acute admitted patient episodes of care into groups with similar conditions and similar usage of hospital resources, using information in the hospital morbidity record such as the diagnoses, procedures and demographic characteristics of the patients.

Inpatient length of stay (LOS) is often used as an indicator of hospital efficiency. It is also considered to be a reasonable proxy for resource consumption. But the heterogeneity of LOS within DRGs poses a problem for statistical analysis. For example, Marazzi et al. [2] assessed the adequacy of conventional parametric models for describing the LOS distribution, but none appeared to fit satisfactorily across a variety of samples. Further, the statistical significance of LOS differences (e.g. for assessing interventions) can be meaningless if the underlying distribution is neglected [3]. A finite Poisson mixture distribution appears to be a suitable alternative to account for the heterogeneity of LOS [4].

Section snippets

Motivation of study

A limitation of the Poisson mixture regression model [4] is that it does not account for the correlation that often exists among LOS data collected from the same hospital. The dependence of clustered data (patients nested within hospitals) can lead to imprecise coefficient estimates, which directly affects the statistical significance of risk factors. Ignoring such intra-cluster correlations may result in overlooking the importance of certain cluster effects and call into question the validity of traditional statistical techniques

Zero-inflated Poisson mixed regression model

Zero-inflated Poisson (ZIP) regression is a model for count data with excess zeros. Consider a discrete random variable Y with ZIP distribution [6]:

$$\Pr(Y = 0) = \phi + (1 - \phi)\,e^{-\theta},$$

$$\Pr(Y = y) = (1 - \phi)\,\frac{e^{-\theta}\,\theta^{y}}{y!}, \qquad y = 1, 2, \ldots,$$

where $0 < \phi < 1$, so that it incorporates more zeros than those allowed by the Poisson. A graphical representation of this distribution is given by Böhning et al. [7]. The ZIP distribution may also be regarded as a mixture of a Poisson($\theta$) and a degenerate component putting all its mass at zero. A plausible
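
As a rough numerical illustration of the ZIP probabilities defined above, the following Python sketch evaluates the mass function and checks that it sums to one. The function names `zip_pmf` and `zip_mean` and the values φ = 0.6, θ = 1.5 are assumptions chosen for illustration only; they do not come from the paper, whose own software is an S-Plus macro.

```python
import math


def zip_pmf(y: int, phi: float, theta: float) -> float:
    """Probability that a ZIP(phi, theta) variable equals the count y."""
    if y < 0:
        return 0.0
    # Ordinary Poisson(theta) probability of observing y.
    poisson = math.exp(-theta) * theta ** y / math.factorial(y)
    if y == 0:
        # A zero arises from the degenerate component (probability phi)
        # or from the Poisson component (probability 1 - phi).
        return phi + (1.0 - phi) * poisson
    return (1.0 - phi) * poisson


def zip_mean(phi: float, theta: float) -> float:
    """Mean of the mixture: only the Poisson component contributes."""
    return (1.0 - phi) * theta


# Illustrative parameter values (assumed, not estimates from the paper).
phi, theta = 0.6, 1.5
probs = [zip_pmf(y, phi, theta) for y in range(60)]
print("P(Y = 0) under ZIP:    ", probs[0])
print("P(Y = 0) under Poisson:", math.exp(-theta))
print("Total probability:     ", sum(probs))
```

With any φ between 0 and 1, the ZIP probability of a zero exceeds the plain Poisson value exp(−θ), which is the "excess zeros" property that makes the model attractive for DRGs dominated by same-day stays.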

An EM algorithm for estimation

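Instead of using a Newton-Raphson type algorithm for parameter estimation, an EM algorithm is proposed to ensure convergence. The complete-data log-likelihood is constructed as

$$l_C = l_\xi + l_\eta,$$

where

$$l_\xi = \sum_{ij}\left[z_{ij}\,\xi_{ij} - \log\left(1 + \exp(\xi_{ij})\right)\right] - \frac{1}{2}\left[m\log(2\pi\sigma_u^2) + \frac{1}{\sigma_u^2}\,u'u\right],$$

$$l_\eta = \sum_{ij}(1 - z_{ij})\left[y_{ij}\,\eta_{ij} - \exp(\eta_{ij}) - \log(y_{ij}!)\right] - \frac{1}{2}\left[m\log(2\pi\sigma_v^2) + \frac{1}{\sigma_v^2}\,v'v\right],$$

and $z_{ij}$ is an unobserved binary variable indicating whether $y_{ij}$ comes from the latent zero class ($z_{ij} = 1$) or the non-zero class ($z_{ij} = 0$). Treating the realization of occurrence of the extra zeros as a missing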
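
To make the role of the unobserved indicator concrete, here is a minimal Python sketch of EM for a much simpler special case: a ZIP model with a single φ and a single θ, with no covariates, no random effects and no REML step. It is not the authors' algorithm, only the standard EM iteration for an intercept-only ZIP, and the function name `em_zip`, the starting values and the toy data are all assumptions made for illustration.

```python
import math


def em_zip(ys, phi=0.5, theta=1.0, n_iter=500):
    """EM for an intercept-only ZIP model (simplified illustration)."""
    n = len(ys)
    for _ in range(n_iter):
        # E-step: expected value of the indicator z for each observation.
        # Only observed zeros can belong to the extra-zero class.
        p_extra = phi / (phi + (1.0 - phi) * math.exp(-theta))
        z = [p_extra if y == 0 else 0.0 for y in ys]

        # M-step: phi is the average membership in the extra-zero class;
        # theta is a weighted Poisson mean with weights 1 - z.
        phi = sum(z) / n
        weight = sum(1.0 - zi for zi in z)
        theta = sum((1.0 - zi) * y for zi, y in zip(z, ys)) / weight
    return phi, theta


# Toy data (assumed): 70 same-day stays and 30 stays of one to four days.
ys = [0] * 70 + [1] * 10 + [2] * 10 + [3] * 6 + [4] * 4
phi_hat, theta_hat = em_zip(ys)
print("phi estimate:  ", phi_hat)
print("theta estimate:", theta_hat)
```

In the mixed model described above, the two M-step updates would instead be penalized fits for the logistic and Poisson parts, with the random effects u and v entering through the quadratic penalty terms; the E-step keeps the same form, with φ and θ replaced by their observation-specific values.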

Data source

Australian National Diagnosis Related Groups (AN-DRGs), the Australian patient classification system, was adapted from the United States DRGs to reflect Australian clinical standards and practice. It is based on a description of body systems, a partition into medical, surgical and other groupings, and a hierarchy of procedures, medical problems and other factors that differentiate processes of care. The classification is partly hierarchical, with 23 Major Diagnostic Categories (MDCs) into which

Discussion

The trend of increasing same-day hospital separations will inevitably place increased demands on families and carers at a time when families tend to be smaller and women are increasingly in the workforce. This will require health care reforms to recognize the need for community resources for post-acute care. In summary, developing appropriate risk-adjusted models for health care outcomes such as inpatient LOS is statistically complex but essential for understanding variation. This study has

Acknowledgements

The authors are grateful to the Health Information Centre, Health Department of Western Australia, for provision of the hospital length of stay data. The computer program (S-Plus macro) is available from the second author's web page: http://fbstaff.cityu.edu.hk/mskyau/. The authors would like to thank the referee for helpful comments. This research is supported in part by grants from Curtin University and the Research Grants Council of Hong Kong.

References (19)

  • J. Xiao et al., Mixture distribution analysis of length of hospital stay for efficient funding, Socio-Econ. Plann. Sci. (1999)
  • E. Dietz et al., On estimation of the Poisson parameter in zero-modified Poisson models, Comput. Stat. Data Anal. (2000)
  • Australian Institute of Health & Welfare, Australian Hospital Statistics 1998–99, AIHW Cat. No. HSE-11,...
  • A. Marazzi et al., Fitting the distributions of length of stay by parametric models, Med. Care (1998)
  • A.M. Bernard et al., The integrated inpatient management model: lessons from managed care, Med. Care (1995)
  • C. Beaver et al., Casemix-based funding of Northern Territory public hospitals: adjusting for severity and socio-economic variations, Health Econ. (1998)
  • N.L. Johnson, S. Kotz, A.W. Kemp, Univariate Discrete Distributions, second ed., Wiley, New York,...
  • D. Böhning et al., The zero-inflated Poisson model and the decayed, missing and filled teeth index in dental epidemiology, J. R. Stat. Soc. A (1999)
  • D. Lambert, Zero-inflated Poisson regression, with an application to defects in manufacturing, Technometrics (1992)
There are more references available in the full text version of this article.

Cited by (47)

  • Weighted Score test based EWMA control charts for Zero-Inflated Poisson Models

    2021, Computers and Industrial Engineering
    Citation Excerpt :

    And they gave an example from horticulture to illustrate this mixed model. When analyzing the diagnosis-related group with a majority of same-day hospital stays (zeros), a popular approach to modeling inpatient length of stay (LOS) is the mixed model, which can not only identify health- and patient-related characteristics but also account for inter-hospital variation, as discussed by (Wang, Yau, & Lee, 2002). However, prior to applying the mixed model, it is quite significant to test whether the assumptions of the model, such as risk factors (covariates), zero inflation and dependency, are necessary in order to avoid drawing false conclusions.

  • Modeling repeated count measures with excess zeros in an epidemiological study

    2015, Annals of Epidemiology
    Citation Excerpt :

    Yau and Lee [8] used the expectation-maximization algorithm to incorporate two distinct random effects for the Poisson and logistic parts with separate parameter estimation for each part. An additional application can be found in Wang et al. [9], who described a ZIP regression model with cluster-specific random effects. We applied a longitudinal ZIP mixed effects model (hereafter ZIP-mixed) in the analysis of repeated assessments of problems encountered with using the female condom.

  • Score tests for zero-inflation and overdispersion in two-level count data

    2013, Computational Statistics and Data Analysis
    Citation Excerpt :

    Dependency among responses can be explained by hierarchical structures through the use of random effects. Hall (2000), Yau and Lee (2001), Hur et al. (2002), and Wang et al. (2002) consider ZIP regression models with cluster-specific random effects to address the heterogeneous variances among clusters. Xiang et al. (2006) propose a score test for zero-inflation in correlated count data, and Lee et al. (2006) extend the ZIP regression model to a multilevel ZIP regression model with random effects.

  • Gender and gambling preference

    2024, Applied Economics