TY - JOUR
T1 - The effect of genome-wide association scan quality control on imputation outcome for common variants
AU - Southam, Lorraine
AU - Panoutsopoulou, Kalliope
AU - Rayner, N. William
AU - Chapman, Kay
AU - Durrant, Caroline
AU - Ferreira, Teresa
AU - Arden, Nigel
AU - Carr, Andrew
AU - Deloukas, Panos
AU - Doherty, Michael
AU - Loughlin, John
AU - McCaskie, Andrew
AU - Ollier, William E. R.
AU - Ralston, Stuart
AU - Spector, Timothy D.
AU - Valdes, Ana M.
AU - Wallis, Gillian A.
AU - Wilkinson, J. Mark
AU - Marchini, Jonathan
AU - Zeggini, Eleftheria
PY - 2011/5
Y1 - 2011/5
N2 - Imputation is an extremely valuable tool in conducting and synthesising genome-wide association studies (GWASs). Directly typed SNP quality control (QC) is thought to affect imputation quality. It is, therefore, common practice to use quality-controlled (QCed) data as an input for imputing genotypes. This study aims to determine the effect of commonly applied QC steps on imputation outcomes. We performed several iterations of imputing SNPs across chromosome 22 in a dataset consisting of 3177 samples with Illumina 610k (Illumina, San Diego, CA, USA) GWAS data, applying different QC steps each time. The imputed genotypes were compared with the directly typed genotypes. In addition, we investigated the correlation between alternatively QCed data. We also applied a series of post-imputation QC steps balancing elimination of poorly imputed SNPs and information loss. We found that the difference between the unQCed data and the fully QCed data on imputation outcome was minimal. Our study shows that imputation of common variants is generally very accurate and robust to GWAS QC, which is not a major factor affecting imputation outcome. A minority of common-frequency SNPs with particular properties cannot be accurately imputed regardless of QC stringency. These findings may not generalise to the imputation of low-frequency and rare variants. European Journal of Human Genetics (2011) 19, 610-614; doi:10.1038/ejhg.2010.242; published online 26 January 2011
AB - Imputation is an extremely valuable tool in conducting and synthesising genome-wide association studies (GWASs). Directly typed SNP quality control (QC) is thought to affect imputation quality. It is, therefore, common practice to use quality-controlled (QCed) data as an input for imputing genotypes. This study aims to determine the effect of commonly applied QC steps on imputation outcomes. We performed several iterations of imputing SNPs across chromosome 22 in a dataset consisting of 3177 samples with Illumina 610k (Illumina, San Diego, CA, USA) GWAS data, applying different QC steps each time. The imputed genotypes were compared with the directly typed genotypes. In addition, we investigated the correlation between alternatively QCed data. We also applied a series of post-imputation QC steps balancing elimination of poorly imputed SNPs and information loss. We found that the difference between the unQCed data and the fully QCed data on imputation outcome was minimal. Our study shows that imputation of common variants is generally very accurate and robust to GWAS QC, which is not a major factor affecting imputation outcome. A minority of common-frequency SNPs with particular properties cannot be accurately imputed regardless of QC stringency. These findings may not generalise to the imputation of low-frequency and rare variants. European Journal of Human Genetics (2011) 19, 610-614; doi:10.1038/ejhg.2010.242; published online 26 January 2011
U2 - 10.1038/ejhg.2010.242
DO - 10.1038/ejhg.2010.242
M3 - Article
VL - 19
SP - 610
EP - 614
JO - European Journal of Human Genetics
JF - European Journal of Human Genetics
IS - 5
ER -