Nonparametric estimation of the mixing distribution in mixed models with random intercepts and slopes




Saab, Rabih

Journal Title

Journal ISSN

Volume Title



Generalized linear mixture models (GLMM) are widely used in statistical applications to model count and binary data. We consider the problem of nonparametric likelihood estimation of mixing distributions in GLMM's with multiple random effects. The log-likelihood to be maximized has the general form l(G)=Σi log∫f(yi,γ) dG(γ) where f(.,γ) is a parametric family of component densities, yi is the ith observed response dependent variable, and G is a mixing distribution function of the random effects vector γ defined on Ω. The literature presents many algorithms for maximum likelihood estimation (MLE) of G in the univariate random effect case such as the EM algorithm (Laird, 1978), the intra-simplex direction method, ISDM (Lesperance and Kalbfleish, 1992), and vertex exchange method, VEM (Bohning, 1985). In this dissertation, the constrained Newton method (CNM) in Wang (2007), which fits GLMM's with random intercepts only, is extended to fit clustered datasets with multiple random effects. Owing to the general equivalence theorem from the geometry of mixture likelihoods (see Lindsay, 1995), many NPMLE algorithms including CNM and ISDM maximize the directional derivative of the log-likelihood to add potential support points to the mixing distribution G. Our method, Direct Search Directional Derivative (DSDD), uses a directional search method to find local maxima of the multi-dimensional directional derivative function. The DSDD's performance is investigated in GLMM where f is a Bernoulli or Poisson distribution function. The algorithm is also extended to cover GLMM's with zero-inflated data. Goodness-of-fit (GOF) and selection methods for mixed models have been developed in the literature, however their application in models with nonparametric random effects distributions is vague and ad-hoc. Some popular measures such as the Deviance Information Criteria (DIC), conditional Akaike Information Criteria (cAIC) and R2 statistics are potentially useful in this context. Additionally, some cross-validation goodness-of-fit methods popular in Bayesian applications, such as the conditional predictive ordinate (CPO) and numerical posterior predictive checks, can be applied with some minor modifications to suit the non-Bayesian approach.



Nonparametric maximum likelihood estimation, geometry of mixture likelihood, Poisson mixture model, Bernoulli mixture model, Direct Search, directional derivative, Zero-inflated data, goodness-of-fit, model selection