Machine learning combined with large-scale neuroimaging databases has been proposed as a promising tool for improving our understanding of behavioural emergence and for the early prediction of neurodevelopmental outcomes. A recent example of this strategy is the study by Ouyang et al. (2020), which suggested that cortical microstructure, quantified with diffusion MRI through the fractional anisotropy (FA) metric in preterm and full-term neonates, can effectively predict language and cognitive outcomes at 2 years of corrected age, as assessed by composite scores of the Bayley Scales of Infant and Toddler Development, Third Edition (BSID-III). Given the pressing need for robust and generalisable tools that can reliably predict the neurodevelopmental outcome of preterm infants, we aimed to replicate the conclusions of that work using a larger independent dataset from the developing Human Connectome Project (dHCP, third release), with early MRI and BSID-III evaluation at 18 months of corrected age. We then aimed to extend the validation of the proposed predictive pipeline by studying different cohorts (the largest included 295 neonates, with gestational age between 29 and 42 weeks and post-menstrual age at MRI between 31 and 45 weeks). This allowed us to evaluate whether some limitations of the original study (mainly the small sample size and the limited variability in the input and output features used in the predictive models) would influence the prediction results. In contrast to the original study that inspired the current work, our predictions did not exceed chance level. Furthermore, these negative results persisted even when the study settings were expanded. Our findings suggest that cortical microstructure close to birth, as described by DTI-FA measures, might not be sufficient for reliable prediction of BSID-III scores during toddlerhood, at least in the current setting, i.e. generally older cohorts and a different processing pipeline.
Our inability to conceptually replicate the results of the original study is in line with previously reported replicability issues in the machine learning field, and it demonstrates the challenges of defining a good set of practices for implementing and validating reliable predictive tools in neurodevelopmental (and other) fields.
- Brain development
- DTI (Diffusion tensor imaging)
- ML (machine learning)