Maximum likelihood estimation for incomplete multinomial data via the weaver algorithm

Dong, Fanghu; Yin, Guosheng

doi:10.1007/s11222-017-9782-2

Published: 27 October 2017

Volume 28, pages 1095–1117, (2018)

Statistics and Computing

Abstract

In a multinomial model, the sample space is partitioned into a disjoint union of cells. The partition is usually immutable during sampling of the cell counts. In this paper, we extend the multinomial model to the incomplete multinomial model by relaxing the constant partition assumption to allow the cells to be variable and the counts collected from non-disjoint cells to be modeled in an integrated manner for inference on the common underlying probability. The incomplete multinomial likelihood is parameterized by the complete-cell probabilities from the most refined partition. Its sufficient statistics include the variable-cell formation observed as an indicator matrix and all cell counts. With externally imposed structures on the cell formation process, it reduces to special models including the Bradley–Terry model, the Plackett–Luce model, etc. Since the conventional method, which solves for the zeros of the score functions, is unfruitful, we develop a new approach to establishing a simpler set of estimating equations to obtain the maximum likelihood estimate (MLE), which seeks the simultaneous maximization of all multiplicative components of the likelihood by fitting each component into an inequality. As a consequence, our estimation amounts to solving a system of the equality attainment conditions to the inequalities. The resultant MLE equations are simple and immediately invite a fixed-point iteration algorithm for solution, which is referred to as the weaver algorithm. The weaver algorithm is short and amenable to parallel implementation. We also derive the asymptotic covariance of the MLE, verify main results with simulations, and compare the weaver algorithm with an MM/EM algorithm based on fitting a Plackett–Luce model to a benchmark data set.

Acknowledgements

The authors are grateful to the two referees, Associate Editor, and Editor for their insightful comments that have significantly improved the article. Yin’s research was supported in part by a grant (17326316) from the Research Grants Council of Hong Kong.

Author information

Authors and Affiliations

Department of Statistics and Actuarial Science, The University of Hong Kong, Pokfulam, Hong Kong
Fanghu Dong & Guosheng Yin

Corresponding author

Correspondence to Fanghu Dong.

Appendices

Appendix A: Proof of Lemma 1

Proof

(Work with $x_{i}/a_{i}$ and connect to the weighted AM–GM inequality, with its equality condition). Rewrite the target inequality as

$$\begin{aligned} \prod \limits _{i=1}^{n}{a_{i}^{{a_{i}}}}{\prod \limits _{i=1}^{n}{\left( {\frac{{x_{i}}}{{a_{i}}}}\right) }^{{a_{i}}}}\leqslant \frac{{\prod \limits _{i=1}^{n}{a_{i}^{{a_{i}}}}}}{{{\left( {\sum \limits _{i=1}^{n}{a_{i}}}\right) }^{\sum \limits _{i=1}^{n}{a_{i}}}}}{\left( {\sum \limits _{i=1}^{n}{a_{i}\frac{{x_{i}}}{{a_{i}}}}}\right) ^{\sum \limits _{i=1}^{n}{a_{i}}}}. \end{aligned}$$

By substituting $y_{i}$ for $x_{i}/a_{i}$ and taking the $\left( {\sum \limits _{i=1}^{n}{a_{i}}}\right) $-th root on both sides, we have

$$\begin{aligned} \prod \limits _{i=1}^{n}{y_{i}^{\frac{{a_{i}}}{{\sum \limits _{i=1}^{n}{a_{i}}}}}}\leqslant \sum \limits _{i=1}^{n}{\frac{{a_{i}}}{{\sum \limits _{i=1}^{n}{a_{i}}}}{y_{i}}}. \end{aligned}$$

After a further substitution of $w_{i}=a_{i}/\sum _{j=1}^{n}a_{j}$, we arrive at

$$\begin{aligned} \prod \limits _{i=1}^{n}{y_{i}^{{w_{i}}}}\leqslant \sum \limits _{i=1}^{n}{{w_{i}}{y_{i}}}, \end{aligned}$$

which is the weighted AM–GM inequality with positive weights $w_{i}$ that sum to one. Equality holds if and only if all the $y_{i}$ are equal, that is, if and only if $x_{i}/a_{i}=\tau $ for all i and some common constant $\tau $, which must be positive. $\square $
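The inequality and its equality condition can be checked numerically. The sketch below assumes that Lemma 1 has the form $\prod x_{i}^{a_{i}}\leqslant \prod a_{i}^{a_{i}}\left( \sum a_{i}\right) ^{-\sum a_{i}}\left( \sum x_{i}\right) ^{\sum a_{i}}$, which is what the rewritten target inequality above reduces to; the function name `lemma1_sides` and the exponent values are illustrative choices, not taken from the article.

```python
import math
import random


def lemma1_sides(x, a):
    """Return (lhs, rhs) of prod x_i^a_i <= prod a_i^a_i / A^A * (sum x_i)^A, A = sum a_i."""
    total_a = sum(a)
    lhs = math.prod(xi ** ai for xi, ai in zip(x, a))
    rhs = math.prod(ai ** ai for ai in a) / total_a ** total_a * sum(x) ** total_a
    return lhs, rhs


random.seed(0)
a = [1.5, 2.0, 0.5]  # arbitrary positive exponents (illustrative values)

# The inequality should hold for any vector of positive reals x.
holds_everywhere = True
for _ in range(1000):
    x = [random.uniform(0.1, 5.0) for _ in a]
    lhs, rhs = lemma1_sides(x, a)
    holds_everywhere = holds_everywhere and lhs <= rhs * (1 + 1e-12)

# Equality case: x_i / a_i equals a common positive constant tau for all i.
tau = 0.7
lhs_eq, rhs_eq = lemma1_sides([tau * ai for ai in a], a)
print(holds_everywhere, math.isclose(lhs_eq, rhs_eq, rel_tol=1e-9))
```

A random search of this kind cannot prove the inequality; it only illustrates that the bound is never violated on the sampled points and is tight when $x_{i}/a_{i}$ is constant.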

Appendix B: Examples and Corollaries of Lemma 1

Example 5

$\left( x_{1}+x_{2}\right) ^{5}\geqslant \frac{5^{5}}{3^{3}2^{2}}x_{1}^{3}x_{2}^{2}$. This inequality holds because

$$\begin{aligned} x_{1}^{3}x_{2}^{2}= & {} \frac{{x_{1}}}{3}\frac{{x_{1}}}{3}\frac{{x_{1}}}{3}\frac{{x_{2}}}{2}\frac{{x_{2}}}{2}{3^{3}}{2^{2}} \\\leqslant & {} {3^{3}}{2^{2}}{\left( {\frac{{3\frac{{x_{1}}}{3}+2\frac{{x_{2}}}{2}}}{{3+2}}}\right) ^{3+2}} \\= & {} {3^{3}}{2^{2}}{\left( {\frac{{{x_{1}}+{x_{2}}}}{5}}\right) ^{5}}, \end{aligned}$$

where the equality is attained if and only if $(x_{1},x_{2})$ is colinear with (3, 2).

Example 6

$\left( x_{1}+x_{2}\right) ^{7}x_{3}^{3}x_{4}^{5} \leqslant \frac{{3^{3}}{5^{5}}{7^{7}}}{{15}^{15}} \left( {x_{1}}+{x_{2}}+{x_{3}}+{x_{4}}\right) ^{15}$. This inequality holds because

$$\begin{aligned} \left( x_{1}+x_{2}\right) ^{7}x_{3}^{3}x_{4}^{5} \leqslant {{7^{7}}3^{3}}{5^{5}}{\left( {\frac{7\frac{{{x_{1}}+{x_{2}}}}{7}+{3\frac{{x_{3}}}{3}+5\frac{{x_{4}}}{5}}}{{7+3+5}}}\right) ^{7+3+5}}, \end{aligned}$$

where the equality is attained if and only if $(x_{1}+x_{2},\,x_{3},\,x_{4})$ is colinear with (7, 3, 5). More importantly, together with the inequality in the previous example, the two equalities are jointly attained if and only if $(x_{1},\,x_{2},\,x_{3},\,x_{4})$ is colinear with (21, 14, 15, 25).
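As a quick arithmetic check of Examples 5 and 6, the following sketch evaluates, for each inequality, the smaller side divided by the larger side, so a value of one means equality. The helper names `ex5_ratio` and `ex6_ratio` are hypothetical and introduced only for this illustration.

```python
import math


def ex5_ratio(x1, x2):
    """(5^5 / (3^3 2^2)) x1^3 x2^2 divided by (x1 + x2)^5; at most 1 by Example 5."""
    return 5**5 / (3**3 * 2**2) * x1**3 * x2**2 / (x1 + x2) ** 5


def ex6_ratio(x1, x2, x3, x4):
    """(x1 + x2)^7 x3^3 x4^5 divided by the Example 6 upper bound; at most 1."""
    bound = 3**3 * 5**5 * 7**7 / 15**15 * (x1 + x2 + x3 + x4) ** 15
    return (x1 + x2) ** 7 * x3**3 * x4**5 / bound


# Joint equality point stated in Example 6.
joint = (21, 14, 15, 25)
print(ex5_ratio(*joint[:2]), ex6_ratio(*joint))

# Same x1 + x2 = 35, so Example 6 is still tight, but (20, 15) is not
# proportional to (3, 2), so Example 5 is strict.
partial = (20, 15, 15, 25)
print(ex5_ratio(*partial[:2]), ex6_ratio(*partial))
```

The second point shows why the joint condition matters: the outer inequality only sees the sum $x_{1}+x_{2}$, so it cannot by itself pin down how that sum is split.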

Corollary 1

If we require $\sum _{i=1}^{n}{x_{i}}=\sum _{i=1}^{n}{a_{i}}=1$ in Lemma 1, then

$$\begin{aligned}&\prod \limits _{i=1}^{n}{x_{i}^{{a_{i}}}} \leqslant \prod \limits _{i=1}^{n}{a_{i}^{{a_{i}}}},\\&\sum \limits _{i=1}^{n}{{a_{i}}\ln {x_{i}}} \leqslant \sum \limits _{i=1}^{n}{{a_{i}}\ln {a_{i}}}, \end{aligned} \tag{18}$$

and the equalities are attained if and only if $x_{i}=a_{i}$ for $i=1,\ldots ,n$.

Corollary 2

Let $\varvec{x}\in (0,+\infty )^{n}$ be a vector of n positive reals. Let $\varvec{\delta }\in \{0,1\}^{n}$ be a vector of n bits. Let $\varvec{\beta }\in [0,+\infty )^{n}$ be a nonzero vector of n nonnegative reals such that $\beta _{j}=0$ if $\delta _{j}=0$. Let $b=\sum _{i=1}^{n}{\beta _{i}}>0$. Define $0^{0}=1$. Then

$$\begin{aligned} \left( \varvec{\delta }^{\intercal }\varvec{x}\right) ^{b} \geqslant \frac{b^{b}}{\prod \limits _{i=1}^{n}\beta _{i}^{{\beta _{i}}}}\prod \limits _{i=1}^{n}x_{i}^{{\beta _{i}}}, \end{aligned}$$

where the equality is attained if and only if there exists a positive k such that $x_{i}/\beta _{i}=k$ for each of the i’s having $\delta _{i}=1$.

Example 7

Let $n=5$, $\varvec{\delta }=(1,0,1,0,1)^{\intercal }$, $\varvec{\beta }=(3,0,4,0,6)^{\intercal }$, $b=3+0+4+0+6=13$. Then $\forall \varvec{x}\in (0,+\infty )^{n}$, we have

$$\begin{aligned}&(1x_{1}+0x_{2}+1x_{3}+0x_{4}+1x_{5})^{13}\\&\quad \geqslant \frac{13^{13}}{3^{3}0^{0}4^{4}0^{0}6^{6}}x_{1}^{3}x_{2}^{0}x_{3}^{4}x_{4}^{0}x_{5}^{6}, \end{aligned}$$

which attains the equality if and only if $x_{1}:x_{3}:x_{5}=3:4:6$.
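Example 7 can be reproduced with a few lines of code. This is a minimal sketch that relies on Python evaluating `0 ** 0` as `1`, which matches the convention $0^{0}=1$ in Corollary 2; the function name `cor2_sides` is an illustrative choice.

```python
import math
import random

delta = [1, 0, 1, 0, 1]   # indicator vector from Example 7
beta = [3, 0, 4, 0, 6]    # exponents, zero wherever delta is zero
b = sum(beta)             # 13


def cor2_sides(x):
    """Return (lhs, rhs) of (delta^T x)^b >= b^b / prod(beta_i^beta_i) * prod(x_i^beta_i)."""
    lhs = sum(d * xi for d, xi in zip(delta, x)) ** b
    const = b**b / math.prod(bi**bi for bi in beta)  # 0 ** 0 == 1 in Python
    rhs = const * math.prod(xi**bi for xi, bi in zip(x, beta))
    return lhs, rhs


random.seed(1)
never_violated = True
for _ in range(1000):
    x = [random.uniform(0.1, 3.0) for _ in delta]
    lhs, rhs = cor2_sides(x)
    never_violated = never_violated and lhs >= rhs * (1 - 1e-12)

# Equality when x1 : x3 : x5 = 3 : 4 : 6; x2 and x4 are free.
k = 0.25
lhs_eq, rhs_eq = cor2_sides([3 * k, 7.0, 4 * k, 0.3, 6 * k])
print(never_violated, math.isclose(lhs_eq, rhs_eq, rel_tol=1e-9))
```

The arbitrary values placed in positions 2 and 4 of the equality point have no effect, because those coordinates enter neither the sum nor the product.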

Corollary 3

If we rescale each $x_{i}$ by an independent positive constant $c_{i}$, then we have a seemingly more general but in fact equivalent formulation of Lemma 1,

$$\begin{aligned} \prod \limits _{i=1}^{n}{x_{i}^{{a_{i}}}}\leqslant \frac{{\prod \limits _{i=1}^{n}{a_{i}^{{a_{i}}}}}}{{\prod \limits _{i=1}^{n}{c_{i}^{{a_{i}}}}{{\left( {\sum \limits _{i=1}^{n}{a_{i}}}\right) }^{\sum \limits _{i=1}^{n}{a_{i}}}}}}{\left( {\sum \limits _{i=1}^{n}{{c_{i}}{x_{i}}}}\right) ^{\sum \limits _{i=1}^{n}{a_{i}}}}, \end{aligned}$$

which attains the equality if and only if there exists some positive constant k such that ${{c_{i}}{x_{i}}}/{a_{i}}=k$ for all i.

Example 8

Let $n=3$, $a=(1,2,3)$, $c=(4,5,6)$. Applying the AM–GM inequality to the six terms obtained by repeating each ${c_{i}}{x_{i}}/{a_{i}}$ exactly $a_{i}$ times, we have

$$\begin{aligned}&\left( {4{x_{1}}}\right) {\left( {\frac{{5{x_{2}}}}{2}}\right) ^{2}}{\left( {\frac{{6{x_{3}}}}{3}}\right) ^{3}}\\&\quad \leqslant {\left( {\frac{{4{x_{1}}+\frac{{5{x_{2}}}}{2}+\frac{{5{x_{2}}}}{2}+\frac{{6{x_{3}}}}{3}+\frac{{6{x_{3}}}}{3}+\frac{{6{x_{3}}}}{3}}}{6}}\right) ^{6}}. \end{aligned}$$

The numerator on the right sums to $4{x_{1}}+5{x_{2}}+6{x_{3}}$. Multiplying both sides by ${1^{1}}{2^{2}}{3^{3}}$ and dividing by ${4^{1}}{5^{2}}{6^{3}}$ therefore gives

$$\begin{aligned} {x_{1}}x_{2}^{2}x_{3}^{3}\leqslant \frac{1}{{{4^{1}}{5^{2}}{6^{3}}}}\frac{{{1^{1}}{2^{2}}{3^{3}}}}{{6^{6}}}{\left( {4{x_{1}}+5{x_{2}}+6{x_{3}}}\right) ^{6}}, \end{aligned}$$

which attains equality if and only if $4{x_{1}}={5{x_{2}}}/2={6{x_{3}}}/3$, that is, ${x_{1}}:{x_{2}}:{x_{3}}=5:8:10$.
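The bound of Example 8 and its equality point can be verified directly. The sketch below is illustrative only; `ex8_bound` is a hypothetical helper that evaluates the right-hand side of the final inequality.

```python
import math


def ex8_bound(x1, x2, x3):
    """Right-hand side of Example 8: (1^1 2^2 3^3) / (4^1 5^2 6^3 6^6) * (4 x1 + 5 x2 + 6 x3)^6."""
    const = (1**1 * 2**2 * 3**3) / (4**1 * 5**2 * 6**3 * 6**6)
    return const * (4 * x1 + 5 * x2 + 6 * x3) ** 6


def ex8_product(x1, x2, x3):
    """Left-hand side of Example 8: x1 * x2^2 * x3^3."""
    return x1 * x2**2 * x3**3


# Equality point x1 : x2 : x3 = 5 : 8 : 10.
print(ex8_product(5, 8, 10), ex8_bound(5, 8, 10))

# A point off the equality ray gives a strict inequality.
print(ex8_product(1, 1, 1), ex8_bound(1, 1, 1))
```

At $(5,8,10)$ the three rescaled terms $4x_{1}$, $5x_{2}/2$ and $6x_{3}/3$ all equal 20, which is the constant k in Corollary 3.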

Corollary 4

Generalizing Corollary 3 to a linear transform $\varvec{U}$ on vector $\varvec{x}$,

$$\begin{aligned} \prod \limits _{i=1}^{n}{{\left( {\varvec{u}_{i}^{\intercal }\varvec{x}}\right) }^{{a_{i}}}}\le \left\{ {\prod \limits _{i=1}^{n}{{\left( {\frac{{a_{i}}}{{\theta _{i}}}}\right) }^{{a_{i}}}}}\right\} {\left( {\frac{{{\varvec{\theta }^{\intercal }}\varvec{U}\varvec{x}}}{{\sum \limits _{i=1}^{n}{a_{i}}}}}\right) ^{\sum \limits _{i=1}^{n}{a_{i}}}}, \end{aligned}$$

which attains the equality if and only if

$$\begin{aligned} \left[ {\begin{array}{ccc} {\frac{{\theta _{1}}}{{a_{1}}}} &{} &{} 0\\ &{} \ddots \\ 0 &{} &{} {\frac{{\theta _{n}}}{{a_{n}}}} \end{array}}\right] \varvec{U}\varvec{x}=k\mathbf {1}_{n}, \end{aligned}$$

where k is a constant and can be solved explicitly under an extra constraint such as an affine constraint on $\varvec{x}$.

Example 9

Letting $x_{1}=2y_{1}+y_{2}$ and $x_{2}=y_{1}+2y_{2}$ in the first case of Example 5, we have

$$\begin{aligned} {\left( {2{y_{1}}+{y_{2}}}\right) ^{3}}{\left( {{y_{1}} +2{y_{2}}}\right) ^{2}}\le \frac{{{2^{2}}{3^{8}}}}{{5^{5}}}{\left( {{y_{1}}+{y_{2}}}\right) ^{5}}, \end{aligned}$$

which attains equality if and only if $y_{1}=4y_{2}$. Imposing the constraint $y_{1}+y_{2}=1$ on the solution, it follows that

$$\begin{aligned} \left[ {\begin{array}{c} {y_{1}}\\ {y_{2}} \end{array}}\right] =\left[ {\begin{array}{c} {0.8}\\ {0.2} \end{array}}\right] , \end{aligned}$$

and the unique maximum of ${\left( {2{y_{1}}+{y_{2}}}\right) ^{3}} {\left( {{y_{1}}+2{y_{2}}}\right) ^{2}}$ attained under this constraint is ${{2^{2}}{3^{8}}}/{5^{5}}\approx 8.398$.
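The maximizer in Example 9 can be cross-checked by brute force. The following sketch scans a grid on the segment $y_{1}+y_{2}=1$ and compares the best grid value with the closed-form bound; the grid size and the name `objective` are arbitrary choices made for this illustration.

```python
def objective(y1):
    """(2 y1 + y2)^3 (y1 + 2 y2)^2 restricted to the constraint y2 = 1 - y1."""
    y2 = 1.0 - y1
    return (2 * y1 + y2) ** 3 * (y1 + 2 * y2) ** 2


# Evaluate the objective on an evenly spaced grid over [0, 1].
n_grid = 10_000
grid = [i / n_grid for i in range(n_grid + 1)]
best_y1 = max(grid, key=objective)
best_value = objective(best_y1)

closed_form = 2**2 * 3**8 / 5**5  # value of the bound when y1 + y2 = 1
print(best_y1, best_value, closed_form)
```

The grid search is only a sanity check. The point of the example is that the equality condition $y_{1}=4y_{2}$ yields the maximizer without any search.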

We recursively apply the inequality to the objective because it transforms the maximization problem into a set of equality attainment conditions, which form a system of simple equations.

Appendix C: Proof of the ascent property and the linear rate of convergence of the weaver algorithm when s is sufficiently large

We instead maximize the log-likelihood with a Lagrange multiplier term to incorporate the equality constraint,

$$\begin{aligned} \ell (\varvec{p})={\varvec{a}^{\intercal }}\ln \varvec{p}+{\varvec{b}^{\intercal }}\ln {\varvec{\varDelta }^{\intercal }}\varvec{p}-s\left( {{\varvec{1}^{\intercal }}{\varvec{p}}- 1}\right) , \end{aligned}$$

where the Lagrange multiplier is the known constant

$$\begin{aligned} s=\varvec{1}^{\intercal }\varvec{a}+\varvec{1}^{\intercal }\varvec{b}, \end{aligned}$$

so no extra unknown is introduced.

The derivative of $\ell (\varvec{p})$ with respect to $p_{i}$ at iteration k is given by

$$\begin{aligned} \frac{{\partial \ell (\varvec{p})}}{{\partial p_{i}^{\left( k\right) }}}=\frac{{a_{i}}}{{p_{i}^{\left( k\right) }}}+\sum \limits _{j=1}^{q}{\frac{{{\varDelta _{ij}}{b_{j}}}}{\sum _{h=1}^{d}\varDelta _{hj}p_{h}^{\left( k\right) }}}-s. \end{aligned}$$

Combining the weaver steps 1 and 2, $p_{i}^{(k)}$ is updated according to

$$\begin{aligned} p_{i}^{\left( {k+1}\right) }=\frac{{a_{i}}}{{s-\sum \limits _{j=1}^{q}{\frac{{{\varDelta _{ij}}{b_{j}}}}{\sum _{h=1}^{d}\varDelta _{hj}p_{h}^{\left( k\right) }}}}}. \end{aligned}$$

We seek to establish the positivity of the quantity

$$\begin{aligned} \left( {p_{i}^{\left( {k+1}\right) }-p_{i}^{\left( k\right) }}\right) \frac{{\partial \ell (\varvec{p})}}{{\partial p_{i}^{\left( k\right) }}}= & {} \left\{ {\frac{{a_{i}}}{{p_{i}^{\left( k\right) }}}+\sum \limits _{j=1}^{q}{\frac{{{\varDelta _{ij}}{b_{j}}}}{\sum _{h=1}^{d}\varDelta _{hj}p_{h}^{\left( k\right) }}}-s}\right\} \\&\times \left\{ {\frac{{a_{i}}}{{s-\sum \limits _{j=1}^{q}{\frac{{{\varDelta _{ij}}{b_{j}}}}{\sum _{h=1}^{d}\varDelta _{hj}p_{h}^{\left( k\right) }}}}}-p_{i}^{\left( k\right) }}\right\} \\= & {} \frac{{{\left( {{a_{i}}-p_{i}^{\left( k\right) }{v^{\left( k\right) }}}\right) }^{2}}}{{p_{i}^{\left( k\right) }{v^{\left( k\right) }}}}, \end{aligned}$$

where

$$\begin{aligned} {v^{\left( k\right) }}\equiv s-\sum \limits _{j=1}^{q}{\frac{{{\varDelta _{ij}}{b_{j}}}}{\sum _{h=1}^{d}\varDelta _{hj}p_{h}^{\left( k\right) }}}. \end{aligned}$$

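Given $p_{i}^{\left( k\right) }>0$, the last quantity is positive when $v^{\left( k\right) }>0$, and it vanishes only if $a_{i}=p_{i}^{\left( k\right) }v^{\left( k\right) }$, that is, when $p_{i}^{\left( k\right) }$ is already a fixed point of the update. Note that $v^{\left( k\right) }$ depends on i through $\varDelta _{ij}$, although the index is suppressed. Then, under the condition $v^{\left( k\right) }>0$, every step of the iteration increases $\ell (\varvec{p})$. Since $\ell (\varvec{p})$ is bounded from above, the iteration converges.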
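The combined update above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `weaver_step`, the list-of-lists layout of $\varvec{\varDelta }$, and the toy data (three complete cells with counts 3, 2, 1 and one merged cell covering the first two cells with count 4) are all hypothetical choices.

```python
def weaver_step(p, a, b, Delta, s):
    """One combined update p_i <- a_i / (s - sum_j Delta[i][j] * b[j] / (Delta^T p)_j)."""
    d, q = len(a), len(b)
    # Current probability of each variable cell j, i.e. (Delta^T p)_j.
    cell_prob = [sum(Delta[h][j] * p[h] for h in range(d)) for j in range(q)]
    new_p = []
    for i in range(d):
        v = s - sum(Delta[i][j] * b[j] / cell_prob[j] for j in range(q))
        if v <= 0:
            # The ascent argument requires v > 0.
            raise ValueError("v must stay positive for the update to be valid")
        new_p.append(a[i] / v)
    return new_p


# Toy data (assumed for illustration): likelihood p1^3 p2^2 p3^1 (p1 + p2)^4.
a = [3.0, 2.0, 1.0]
b = [4.0]
Delta = [[1], [1], [0]]   # d x q indicator matrix: cells 1 and 2 form the merged cell
s = sum(a) + sum(b)       # known Lagrange multiplier

p = [1 / 3, 1 / 3, 1 / 3]
for _ in range(300):
    p = weaver_step(p, a, b, Delta, s)
print([round(pi, 6) for pi in p])
```

For this toy likelihood the MLE is available in closed form by grouping: $p_{3}=1/10$, $p_{1}+p_{2}=9/10$, and $p_{1}:p_{2}=3:2$, so the iteration should settle at $(0.54, 0.36, 0.10)$. The iterates do not sum to one along the way; the constraint is met only at the fixed point.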

Next, we show the rate of convergence is linear. We denote the ith component of the solution as $p_{i}^{(*)}$ and use the simpler symbol g to denote the derivative function $ g\left( p_{i}\right) \equiv \frac{\partial \ell (\varvec{p})}{\partial p_{i}}, $ hence $g\left( p_{i}^{\left( *\right) }\right) =0$. We assume $\ell (\varvec{p})$ is locally concave at $\varvec{p}^{\left( *\right) }$ and assume g to be Lipschitz continuous, viz. there exists a positive constant L such that, for all pairs of $\left( p,q\right) $ in the domain, $\left| g\left( p\right) -g\left( q\right) \right| \le L\left| p-q\right| $. Then, we have

$$\begin{aligned} p_{i}^{\left( {k+1}\right) }-p_{i}^{\left( *\right) }= & {} \frac{{a_{i}}}{{\frac{{a_{i}}}{{p_{i}^{\left( k\right) }}}-g\left( {p_{i}^{\left( k\right) }}\right) }}-p_{i}^{\left( *\right) }\\= & {} \frac{{{a_{i}}p_{i}^{\left( k\right) }}}{{{a_{i}}-p_{i}^{\left( k\right) }g\left( {p_{i}^{\left( k\right) }}\right) }}-p_{i}^{\left( *\right) }\\= & {} \frac{{{a_{i}}\left( {p_{i}^{\left( k\right) }-p_{i}^{\left( *\right) }}\right) +p_{i}^{\left( *\right) }p_{i}^{\left( k\right) }g\left( {p_{i}^{\left( k\right) }}\right) }}{{{a_{i}}-p_{i}^{\left( k\right) }g\left( {p_{i}^{\left( k\right) }}\right) }}, \end{aligned}$$

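and further, dividing both sides by $p_{i}^{\left( k\right) }-p_{i}^{\left( *\right) }$,

$$\begin{aligned} \frac{{p_{i}^{\left( {k+1}\right) }-p_{i}^{\left( *\right) }}}{{p_{i}^{\left( k\right) }-p_{i}^{\left( *\right) }}}=\frac{{{a_{i}}+p_{i}^{\left( *\right) }p_{i}^{\left( k\right) }\frac{{g\left( {p_{i}^{\left( k\right) }}\right) }}{{p_{i}^{\left( k\right) }-p_{i}^{\left( *\right) }}}}}{{{a_{i}}-p_{i}^{\left( k\right) }g\left( {p_{i}^{\left( k\right) }}\right) }}. \end{aligned}$$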

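If $p_{i}^{\left( k\right) }<p_{i}^{\left( *\right) }$, then $g\left( {p_{i}^{\left( k\right) }}\right) >0$ and $\frac{{g\left( {p_{i}^{\left( k\right) }}\right) }}{{p_{i}^{\left( k\right) }-p_{i}^{\left( *\right) }}}<0$. Therefore,

$$\begin{aligned} {{a_{i}}+p_{i}^{\left( *\right) }p_{i}^{\left( k\right) }\frac{{g\left( {p_{i}^{\left( k\right) }}\right) }}{{p_{i}^{\left( k\right) }-p_{i}^{\left( *\right) }}}}<{{a_{i}}-p_{i}^{\left( k\right) }g\left( {p_{i}^{\left( k\right) }}\right) }, \end{aligned}$$

because the left side minus the right side equals ${\left( {p_{i}^{\left( k\right) }}\right) ^{2}}\frac{{g\left( {p_{i}^{\left( k\right) }}\right) }}{{p_{i}^{\left( k\right) }-p_{i}^{\left( *\right) }}}<0$.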

If $p_{i}^{\left( k\right) }>p_{i}^{\left( *\right) }$, then $g\left( {p_{i}^{\left( k\right) }}\right) <0$. Therefore, $\frac{{g\left( {p_{i}^{\left( k\right) }}\right) }}{{p_{i}^{\left( k\right) }-p_{i}^{\left( *\right) }}}<0$ and

$$\begin{aligned} {{a_{i}}+p_{i}^{\left( *\right) }p_{i}^{\left( k\right) }\frac{{g\left( {p_{i}^{\left( k\right) }}\right) }}{{p_{i}^{\left( k\right) }-p_{i}^{\left( *\right) }}}}<a_{i}<{{a_{i}}-p_{i}^{\left( k\right) }g\left( {p_{i}^{\left( k\right) }}\right) }. \end{aligned}$$

In both cases, the numerator is smaller than the denominator, hence $\left| \frac{{p_{i}^{\left( {k+1}\right) } -p_{i}^{\left( *\right) }}}{{p_{i}^{\left( k\right) } -p_{i}^{\left( *\right) }}}\right| <1$ and the rate of convergence is linear.
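The linear rate can be observed empirically on a small problem. The sketch below hard-codes the combined update for an assumed toy likelihood $p_{1}^{3}p_{2}^{2}p_{3}^{1}(p_{1}+p_{2})^{4}$ with $s=10$, whose MLE has first component 0.54, and records the ratio of successive absolute errors; the function name `step` and the toy data are hypothetical and not taken from the article.

```python
def step(p):
    """Combined update for the toy problem a = (3, 2, 1), b = (4), merged cell {1, 2}, s = 10."""
    a, b, s = (3.0, 2.0, 1.0), 4.0, 10.0
    merged = p[0] + p[1]      # current probability of the merged cell
    v = s - b / merged        # shared denominator for cells 1 and 2
    return [a[0] / v, a[1] / v, a[2] / s]


p_star = 0.54                 # first component of the closed-form MLE of the toy problem
p = [1 / 3, 1 / 3, 1 / 3]
errors = []
for _ in range(40):
    p = step(p)
    errors.append(abs(p[0] - p_star))

# Ratio of successive absolute errors; a roughly constant value below one
# in the tail is the signature of linear convergence.
ratios = [e1 / e0 for e0, e1 in zip(errors, errors[1:])]
print(ratios[-1])
```

For this toy problem the error of the merged-cell probability $u=p_{1}+p_{2}$ is multiplied by $-4/(10u-4)$ at each step, so the tail ratio approaches 0.8 in absolute value. The sketch inspects only the tail of the sequence, where the local assumptions of the argument apply; ratios far from the solution can behave differently.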

Appendix D: Ranking results of the car racing data

See Table 4.

Table 4 NASCAR2002 car racing data: complete ranking results using the Plackett–Luce and Bradley–Terry models

About this article

Cite this article

Dong, F., Yin, G. Maximum likelihood estimation for incomplete multinomial data via the weaver algorithm. Stat Comput 28, 1095–1117 (2018). https://doi.org/10.1007/s11222-017-9782-2

Received: 28 March 2017
Accepted: 14 October 2017
Published: 27 October 2017
Issue Date: September 2018
DOI: https://doi.org/10.1007/s11222-017-9782-2
