TY - JOUR

T1 - Correcting for covariate measurement error in logistic regression using nonparametric maximum likelihood estimation

AU - Rabe-Hesketh, S

AU - Pickles, A

AU - Skrondal, A

PY - 2003

Y1 - 2003

AB - When covariates are measured with error, inference based on conventional generalized linear models can yield biased estimates of regression parameters. This problem can potentially be rectified by using generalized linear latent and mixed models (GLLAMM), including a measurement model for the relationship between observed and true covariates. However, the models are typically estimated under the assumption that both the true covariates and the measurement errors are normally distributed, although skewed covariate distributions are often observed in practice. In this article we relax the normality assumption for the true covariates by developing nonparametric maximum likelihood estimation (NPMLE) for GLLAMMs. The methodology is applied to estimating the effect of dietary fibre intake on coronary heart disease. We also assess the performance of estimation of regression parameters and empirical Bayes prediction of the true covariate. Normal as well as skewed covariate distributions are simulated and inference is performed based on both maximum likelihood assuming normality and NPMLE. Both estimators are unbiased and have similar root mean square errors when the true covariate is normal. With a skewed covariate, the conventional estimator is biased but has a smaller mean square error than the NPMLE. NPMLE produces substantially improved empirical Bayes predictions of the true covariate when its distribution is skewed.

U2 - 10.1191/1471082X03st056oa

DO - 10.1191/1471082X03st056oa

M3 - Article

VL - 3

SP - 215

EP - 232

JO - Statistical Modelling

JF - Statistical Modelling

IS - 3

ER -