Unified generalized iterative scaling and its applications
Introduction
Let $p = (p_1, \ldots, p_k)$ and $q = (q_1, \ldots, q_k)$ be two probability vectors in $\mathbb{R}^k$. The I-divergence of $p$ with respect to $q$ (also known as the Kullback–Leibler information number, cross entropy and information for discrimination) is defined by
$$I(p \mid q) = \sum_{i=1}^{k} p_i \ln \frac{p_i}{q_i}.$$
For any given $q$, it is often of interest to find $p^*$ in some set $\mathcal{E}$ such that
$$I(p^* \mid q) = \inf_{p \in \mathcal{E}} I(p \mid q).$$
The above minimization problem is usually called the I-projection problem for $q$ onto $\mathcal{E}$, and the associated solution $p^*$ is called the I-projection of $q$ onto $\mathcal{E}$. It has long been known that I-projection plays a key role in the information theoretic approach to statistics (Kullback, 1959; Good, 1963; Bishop et al., 1975).
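As a small numerical illustration of the I-divergence defined above, the following Python sketch evaluates it for two probability vectors. The function name `i_divergence` and the example vectors are assumptions made for illustration only; the sketch uses the usual conventions that a zero entry of the first vector contributes nothing and that the divergence is infinite when the first vector puts mass where the second has none.

```python
import math


def i_divergence(p, q):
    """Return sum_i p_i * ln(p_i / q_i) for two probability vectors.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # convention: 0 * ln(0 / q_i) = 0
        if q_i == 0.0:
            return math.inf  # p has mass where q has none
        total += p_i * math.log(p_i / q_i)
    return total


# Illustrative vectors (assumed, not from the paper).
uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.7, 0.1, 0.1, 0.1]

print(i_divergence(skewed, uniform))
print(i_divergence(uniform, skewed))
```

The two printed values differ, which reflects the fact that the I-divergence is not symmetric in its arguments; this is why the direction of the projection matters.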
In some applications, the constraint set is assumed to have a specific form determined by a given vector, a matrix and a convex cone. This form often occurs in statistics. For a special case of this form, Deming and Stephan (1940) formally introduced the iterative proportional fitting procedure (IPFP) to adjust cell frequencies of contingency tables when all elements of the matrix are equal to 0 or 1. Ireland and Kullback (1968) showed the convergence of IPFP to the I-projection when marginals of contingency tables are given. Darroch and Ratcliff (1972) proposed generalized iterative scaling (GIS) to obtain the I-projection for a general matrix and established the relations between maximum likelihood estimation for log-linear models and I-projections. Dykstra and Lemke (1988) demonstrated that maximum likelihood estimation for discrete distributions is closely related to I-projections onto sets of this kind. Dykstra (1985) proposed an iterative procedure for obtaining I-projections onto the intersection of convex sets, and Dykstra and Wollan (1987) devised a computer program based on the iterative procedure. Winkler (1990), Bhattacharya and Dykstra (1995, 1997) and Kuroda and Geng (1999) considered similar problems.
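To make the IPFP mentioned above concrete, here is a minimal Python sketch for a two-way table with prescribed row and column marginals. It is an illustration under stated assumptions rather than the procedure as written by Deming and Stephan: the function name `ipfp`, the fixed number of sweeps and the 2 x 2 starting table are all hypothetical, and the starting table is assumed to be strictly positive.

```python
def ipfp(table, row_targets, col_targets, sweeps=200):
    """Alternately rescale rows and columns of a positive table
    until its marginals match the given targets (illustrative sketch)."""
    fitted = [row[:] for row in table]  # work on a copy
    n_rows, n_cols = len(fitted), len(fitted[0])
    for _ in range(sweeps):
        # Row step: scale each row so that it sums to its target.
        for i in range(n_rows):
            scale = row_targets[i] / sum(fitted[i])
            fitted[i] = [cell * scale for cell in fitted[i]]
        # Column step: scale each column so that it sums to its target.
        for j in range(n_cols):
            col_sum = sum(fitted[i][j] for i in range(n_rows))
            scale = col_targets[j] / col_sum
            for i in range(n_rows):
                fitted[i][j] *= scale
    return fitted


# Assumed starting table and target marginals (each sums to 1).
start = [[0.4, 0.1], [0.2, 0.3]]
fit = ipfp(start, row_targets=[0.5, 0.5], col_targets=[0.3, 0.7])
for row in fit:
    print(row)
```

Because every step multiplies a whole row or a whole column by a constant, the odds ratio of the starting table is preserved while the marginals are adjusted, which is the sense in which the fitted table stays as close as possible to the starting one.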
Kullback (1968), Csiszár (1975, 1989), Haberman (1984) and Rüschendorf and Thomsen (1993) considered I-projection problems for probability measures. Rüschendorf (1995) demonstrated that the IPFP for probability measures also converges to the I-projection with given marginals. Bhattacharya (2006) considered an iterative procedure for probability measures to obtain I-projections onto the intersection of convex sets.
Gao and Shi (2003) considered the I-projection problem when the constraint set is defined by inequality constraints. They proposed an iterative algorithm for finding the solutions and proved that it converges to an I-projection. The algorithm partly generalizes GIS. They also established the relationship between I-projections and maximum likelihood estimation for log-linear models with ordered parameters.
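For reference, the classical GIS of Darroch and Ratcliff, which the algorithms discussed here generalize, can be sketched in a few lines of Python. This is the standard equality-constrained scheme, not the inequality-constrained algorithm of Gao and Shi or the UGIS of this paper. The sketch assumes a nonnegative feature matrix whose columns each sum to 1 and positive target values that sum to 1; the names `gis`, `features` and `targets`, as well as the numbers, are hypothetical.

```python
def gis(q, features, targets, max_iter=10000, tol=1e-12):
    """Classical generalized iterative scaling (illustrative sketch).

    features[r][i] >= 0 with sum over r equal to 1 for every i;
    targets[r] > 0 with sum equal to 1. Starts from q and rescales
    until sum_i features[r][i] * p[i] matches targets[r] for all r.
    """
    p = list(q)
    for _ in range(max_iter):
        # Current values of the constrained linear functionals.
        current = [sum(a * p_i for a, p_i in zip(row, p)) for row in features]
        if max(abs(c - h) for c, h in zip(current, targets)) < tol:
            break
        ratios = [h / c for h, c in zip(targets, current)]
        # Multiplicative update: p_i <- p_i * prod_r ratio_r ** a_ri.
        for i in range(len(p)):
            factor = 1.0
            for row, ratio in zip(features, ratios):
                factor *= ratio ** row[i]
            p[i] *= factor
    return p


# Assumed example: three cells, two constraints.
q = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]
features = [[1.0, 0.5, 0.0],
            [0.0, 0.5, 1.0]]
targets = [0.3, 0.7]
p = gis(q, features, targets)
print(p)
```

The iterates keep the exponential-family form of the starting vector times a product of powers, so with a uniform starting vector the fitted values in this example satisfy `p[1] ** 2 == p[0] * p[2]` up to rounding.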
Analysis of ordinal data is a challenging problem. One of the popular models for ordinal data is the log-linear model, whose parameters are often restricted by some ordering, such as odds ratios increasing with ordinal categories (Agresti and Coull, 2002). The aim of this paper is to provide new algorithms for analyzing ordinal data. An iterative algorithm is proposed for computing I-projections when the cone in the constraint set is the Fenchel dual cone of an isotonic cone. The algorithm reduces to the well-known GIS in a special case. The new method is called unified generalized iterative scaling (UGIS). Relationships between I-projections and maximum likelihood estimation of restricted parameters for log-linear models are demonstrated.
This paper is organized as follows. In Section 2, an I-projection problem is considered and relations between I-projections and log-linear models are given. The UGIS is introduced and the related algorithms are proposed in Section 3, where the relationships between UGIS and maximum likelihood estimation of constrained parameters for log-linear models are also established. Poisson regression modeling and marginal stochastic order are used to demonstrate the proposed algorithms in Section 4.
I-projections and log-linear models
In this section, we describe the relation between the I-projection problem on the Fenchel dual cone of an isotonic cone and log-linear models with restricted ordered parameters. For this purpose, we begin with some necessary definitions. A binary relation $\preceq$ on a finite set $X = \{x_1, x_2, \ldots, x_k\}$ is a quasi-order if it is reflexive (i.e., $x_i \preceq x_i$ for all $x_i \in X$) and transitive (i.e., $x_i \preceq x_j$ and $x_j \preceq x_l$ imply $x_i \preceq x_l$ for all $x_i, x_j, x_l \in X$); no further property, such as antisymmetry, is required.
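The two defining properties of a quasi-order can be checked mechanically on a small finite set. The following Python sketch does so by brute force; the function name `is_quasi_order` and the example relations are assumptions chosen for illustration.

```python
from itertools import product


def is_quasi_order(elements, related):
    """Check that related(a, b) is reflexive and transitive on elements."""
    reflexive = all(related(x, x) for x in elements)
    transitive = all(
        related(x, z)
        for x, y, z in product(elements, repeat=3)
        if related(x, y) and related(y, z)
    )
    return reflexive and transitive


items = [1, 2, 3, 4]

# The usual order is a quasi-order.
print(is_quasi_order(items, lambda a, b: a <= b))          # True
# "Same parity" is reflexive and transitive but not antisymmetric.
print(is_quasi_order(items, lambda a, b: a % 2 == b % 2))  # True
# Strict inequality fails reflexivity.
print(is_quasi_order(items, lambda a, b: a < b))           # False
```

The second relation shows why antisymmetry is not part of the definition: two distinct elements may each precede the other and the relation still qualifies as a quasi-order.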
Definition 2.1 Let be a quasi-order defined on . Then,
Unified generalized iterative scaling algorithms
Unified generalized iterative scaling (UGIS) is a method for finding the optimal solution of (3). In this section, we propose UGIS algorithms for the kinds of (3) and prove that they converge to the optimal solutions of the corresponding I-projection problems. According to Lemma 2.2, without loss of generality, suppose that the matrix of given in (3) is and for . For any and weight , denote the projection of onto by
Examples
The following examples were chosen to demonstrate the applications of the proposed algorithms for log-linear modeling. The first example concerns a Poisson regression model with the regression coefficient being restricted to an isotonic cone. In the second example, we illustrate how the problem of marginal stochastic ordering in a square contingency table is transformed into an I-projection problem on (14).
Poisson regression modeling. Suppose that the given the covariates () are
Acknowledgements
The authors thank the associate editor and two anonymous reviewers for helpful comments and suggestions on an earlier version of this article. This work was supported by NSFC:10701021, NSFC:10931002, NSFC:10828102 and NENU-STC07001. M.L. Tang’s research was fully supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region (Project No. KBU261508) and the Hong Kong Baptist University Grant FRG2/08-09/066.
References (28)
- Agresti and Coull (2002). The analysis of contingency tables under inequality constraints. J. Statist. Plann. Inference.
- Bhattacharya and Dykstra (1995). A general duality approach to I-projections. J. Statist. Plann. Inference.
- Rüschendorf and Thomsen (1993). Note on the Schrödinger equation and I-projections. Statist. Probab. Lett.
- Agresti (2002). Categorical Data Analysis.
- Bhattacharya and Dykstra (1997). A Fenchel duality aspect of iterative I-projection procedures. Ann. Inst. Statist. Math.
- Bhattacharya (2006). An iterative procedure for general probability measures to obtain I-projection onto intersection of convex sets. Ann. Statist.
- Bishop et al. (1975). Discrete Multivariate Analysis: Theory and Practice.
- Csiszár (1967). Information-type measures of difference of probability distributions and indirect observations. Studia Sci. Math. Hungar.
- Csiszár (1975). I-divergence geometry of probability distributions and minimization problems. Ann. Probab.
- Csiszár (1989). A geometric interpretation of Darroch and Ratcliff’s generalized iterative scaling. Ann. Statist.