King's College London

Research portal

Two-stage sampling in the estimation of growth parameters and percentile norms: sample weights versus auxiliary variable estimation

Research output: Contribution to journal › Article › peer-review

George Vamvakas, Courtenay Norbury, Andrew Pickles

Original language: English
Article number: 173
Journal: BMC Medical Research Methodology
Volume: 21
Issue number: 1
Published: Dec 2021

Bibliographical note

Funding Information: GV is supervised by AP and CN. AP is part funded by the NIHR Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and KCL and Senior Investigator award, NF-SI-0617-10120. SCALES was funded by Wellcome WT094836AIA; SCALES 2 by Economic and Social Research Council ES/R003041.

Funding Information: This study represents independent research funded by Wellcome (WT094836AIA) and ESRC (ES/R003041). Additional support was provided by the National Institute for Health Research (NIHR) through the Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King's College London and Senior Investigator award NF-SI-0617-10120. The views expressed in this paper are those of the authors and not necessarily those of Surrey County Council, the Wellcome Trust, ESRC, NIHR or the Department of Health. We are grateful to all of the children, schools, and teachers who have given their time to support this research.

Funding Information: This project has used data from SCALES and SCALES 2, which have ethical approval from the University College London Research Ethics Committee. SCALES represents independent research funded by Wellcome (WT094836AIA) and the National Institute for Health Research (NIHR) Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King's College London. SCALES 2 is funded by the ESRC (REF ES/R003041/1). Informed consent was collected from parents/guardians before in-depth assessments in Year 1 and Year 6. Informed assent was collected from children prior to each assessment.

Publisher Copyright: © 2021, The Author(s). Copyright: Copyright 2021 Elsevier B.V., All rights reserved.


Abstract

Background: The use of auxiliary variables with maximum likelihood parameter estimation for surveys that miss data by design is not a widespread approach, despite its documented improved efficiency over traditional approaches that deploy sampling weights. Although efficiency gains from the use of Normally distributed auxiliary variables in a model have been recorded in the literature, little is known about the effects of non-Normal auxiliary variables in the parameter estimation.

Methods: We simulate growth data to mimic SCALES, a two-stage survey of language development with a screening phase (stage one) for which data are observed for the whole sample and an intensive assessments phase (stage two), for which data are observed for a sub-sample, selected using stratified random sampling. In the simulation, we allow a fully observed Poisson distributed stratification criterion to be correlated with the partially observed model responses and develop five generalised structural equation growth models that host the auxiliary information from this criterion. We compare these models with each other and with a weighted growth model in terms of bias, efficiency, and coverage. We finally apply our best performing model to SCALES data and show how to obtain growth parameters and population norms.

Results: Parameter estimation from a model that incorporates a non-Normal auxiliary variable is unbiased and more efficient than its weighted counterpart. The auxiliary variable method is capable of producing efficient population percentile norms and velocities.

Conclusions: The deployment of a fully observed variable that dominates the selection of the sample and correlates strongly with the incomplete variable of interest appears beneficial for the estimation process.
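The two-stage design described in the Methods can be illustrated with a small simulation. The sketch below is not the authors' code or model. It assumes a single outcome instead of growth trajectories, invented parameter values (Poisson mean 5, three strata, sampling fractions 0.05/0.10/0.50), and hypothetical function names. A simple regression estimator on the screening score stands in for the paper's generalised structural equation models, purely to show how a fully observed auxiliary variable can be used in place of sampling weights.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def stratum(score):
    """Assign a stratum from the screening score (cut-points are assumptions)."""
    if score <= 3:
        return 0
    return 1 if score <= 6 else 2


def simulate(n=20000, seed=1):
    rng = random.Random(seed)

    # Stage one: a Poisson screening score is observed for everyone.
    screen = [poisson(rng, 5.0) for _ in range(n)]
    # The outcome is correlated with the screening score (assumed linear form).
    outcome = [100.0 - 3.0 * s + rng.gauss(0.0, 5.0) for s in screen]

    # Stage two: stratified random sub-sample, oversampling high scores.
    fractions = [0.05, 0.10, 0.50]
    members = [[], [], []]
    for i, s in enumerate(screen):
        members[stratum(s)].append(i)

    sample = []  # pairs of (index, inverse-probability weight)
    for h, idx in enumerate(members):
        n_h = max(1, round(fractions[h] * len(idx)))
        weight = len(idx) / n_h
        sample.extend((i, weight) for i in rng.sample(idx, n_h))

    ys = [outcome[i] for i, _ in sample]
    xs = [screen[i] for i, _ in sample]
    ws = [w for _, w in sample]

    # Unweighted mean of the sub-sample ignores the design.
    naive = sum(ys) / len(ys)
    # Weighted mean uses the inverse selection probabilities.
    weighted = sum(w * y for w, y in zip(ws, ys)) / sum(ws)

    # Auxiliary-variable stand-in: fit y on the screening score in the
    # sub-sample, then average predictions over the full stage-one sample.
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    auxiliary = intercept + slope * (sum(screen) / n)

    return {
        "truth": sum(outcome) / n,
        "naive": naive,
        "weighted": weighted,
        "auxiliary": auxiliary,
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name:>9}: {value:.2f}")
```

In this toy setting the unweighted sub-sample mean is pulled towards the oversampled stratum, while both the weighted mean and the regression-based estimate land close to the full-sample mean. Comparing efficiency, as the paper does, would require repeating the simulation over many seeds and comparing the spread of the two estimators.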

