TY - JOUR
T1 - The meaning of significant mean group differences for biomarker discovery
AU - Loth, Eva
AU - Ahmad, Jumana
AU - Chatham, Chris
AU - Lopez, Beatriz
AU - Carter, Ben
AU - Crawley, Daisy
AU - Oakley, Beth
AU - Hayward, Hannah
AU - Cooke, Jennifer
AU - San José Cáceres, Antonia
AU - Bzdok, Danilo
AU - Jones, Emily
AU - Charman, Tony
AU - Beckmann, Christian F.
AU - Bourgeron, Thomas
AU - Toro, Roberto
AU - Buitelaar, Jan
AU - Murphy, Declan
AU - Dumas, Guillaume
N1 - Funding Information:
EL, JA, BL, BC, DC, BO, HH, JC, ASJC, EJ, TC, CB, TB, RT, JB, DM, and GD have received funding from the Innovative Medicines Initiative 2 Joint Undertaking under grant agreement No 777394 for the project AIMS-2-TRIALS. This Joint Undertaking is a joint support from the European Union's Horizon 2020 research and innovation programme, EFPIA, AUTISM SPEAKS, Autistica, and SFARI. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Publisher Copyright:
© 2021 Loth et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2021/11/18
Y1 - 2021/11/18
N2 - Over the past decade, biomarker discovery has become a key goal in psychiatry to aid in the more reliable diagnosis and prognosis of heterogeneous psychiatric conditions and the development of tailored therapies. Nevertheless, the prevailing statistical approach is still the mean group comparison between "cases" and "controls," which tends to ignore within-group variability. In this educational article, we used empirical data simulations to investigate how effect size, sample size, and the shape of distributions impact the interpretation of mean group differences for biomarker discovery. We then applied these statistical criteria to evaluate biomarker discovery in one area of psychiatric research: autism research. Across the most influential areas of autism research, effect size estimates ranged from small (d = 0.21, anatomical structure) to medium (d = 0.36, electrophysiology; d = 0.5, eye-tracking) to large (d = 1.1, theory of mind). We show that in normal distributions, this translates to approximately 45% to 63% of cases performing within 1 standard deviation (SD) of the typical range, i.e., they do not have a deficit/atypicality in a statistical sense. For a measure to have diagnostic utility as defined by 80% sensitivity and 80% specificity, a Cohen's d of 1.66 is required, with still 40% of cases falling within 1 SD. However, in both normal and nonnormal distributions, 1 (skewness) or 2 (platykurtic, bimodal) biologically plausible subgroups may exist despite small or even nonsignificant mean group differences. This conclusion contrasts drastically with the way mean group differences are frequently reported. Over 95% of studies omitted the "on average" when summarising their findings in their abstracts ("autistic people have deficits in X"), which can be misleading as it implies that the group-level difference applies to all individuals in that group. We outline practical approaches and steps for researchers to explore mean group comparisons for the discovery of stratification biomarkers.
AB - Over the past decade, biomarker discovery has become a key goal in psychiatry to aid in the more reliable diagnosis and prognosis of heterogeneous psychiatric conditions and the development of tailored therapies. Nevertheless, the prevailing statistical approach is still the mean group comparison between "cases" and "controls," which tends to ignore within-group variability. In this educational article, we used empirical data simulations to investigate how effect size, sample size, and the shape of distributions impact the interpretation of mean group differences for biomarker discovery. We then applied these statistical criteria to evaluate biomarker discovery in one area of psychiatric research: autism research. Across the most influential areas of autism research, effect size estimates ranged from small (d = 0.21, anatomical structure) to medium (d = 0.36, electrophysiology; d = 0.5, eye-tracking) to large (d = 1.1, theory of mind). We show that in normal distributions, this translates to approximately 45% to 63% of cases performing within 1 standard deviation (SD) of the typical range, i.e., they do not have a deficit/atypicality in a statistical sense. For a measure to have diagnostic utility as defined by 80% sensitivity and 80% specificity, a Cohen's d of 1.66 is required, with still 40% of cases falling within 1 SD. However, in both normal and nonnormal distributions, 1 (skewness) or 2 (platykurtic, bimodal) biologically plausible subgroups may exist despite small or even nonsignificant mean group differences. This conclusion contrasts drastically with the way mean group differences are frequently reported. Over 95% of studies omitted the "on average" when summarising their findings in their abstracts ("autistic people have deficits in X"), which can be misleading as it implies that the group-level difference applies to all individuals in that group. We outline practical approaches and steps for researchers to explore mean group comparisons for the discovery of stratification biomarkers.
UR - http://www.scopus.com/inward/record.url?scp=85119933490&partnerID=8YFLogxK
U2 - 10.1371/journal.pcbi.1009477
DO - 10.1371/journal.pcbi.1009477
M3 - Article
SN - 1553-734X
VL - 17
JO - PLoS Computational Biology
JF - PLoS Computational Biology
IS - 11
M1 - e1009477
ER -