Development and Validation of Predictive Model for a Diagnosis of First Episode Psychosis Using the Multinational EU-GEI Case-control Study and Modern Statistical Learning Methods

Olesya Ajnakina*, Ihsan Fadilah, Diego Quattrone, Celso Arango, Domenico Berardi, Miguel Bernardo, Julio Bobes, Lieuwe De Haan, Cristina Marta Del-Ben, Charlotte Gayer-Anderson, Simona Stilo, Hannah E. Jongsma, Antonio Lasalvia, Sarah Tosato, Pierre Michel Llorca, Paulo Rossi Menezes, Bart P. Rutten, Jose Luis Santos, Julio Sanjuán, Jean Paul Selten, Andrei Szöke, Ilaria Tarricone, Giuseppe D'Andrea, Andrea Tortelli, Eva Velthorst, Peter B. Jones, Manuel Arrojo Romero, Caterina La Cascia, James B. Kirkbride, Jim Van Os, Michael O'Donovan, Craig Morgan, Marta Di Forti, Robin M. Murray, Kathryn Hubbard, Daniel Stahl

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Background and Hypothesis: It is argued that the availability of diagnostic models will facilitate more rapid identification of individuals who are at higher risk of first episode psychosis (FEP). Therefore, we developed, evaluated, and validated a diagnostic risk estimation model to classify individuals with FEP and controls across six countries.

Study Design: We used data from a large multi-center study encompassing 2627 phenotypically well-defined participants (aged 18-64 years) recruited from six countries spanning 17 research sites, as part of the European Network of National Schizophrenia Networks Studying Gene-Environment Interactions (EU-GEI) study. To build the diagnostic model and identify which factors are important for estimating an individual's risk of FEP, we applied a binary logistic model with regularization by the least absolute shrinkage and selection operator (LASSO). The model was validated using the internal-external cross-validation approach. Model performance was assessed with the area under the receiver operating characteristic curve (AUROC), calibration, sensitivity, and specificity.

Study Results: With 22 preselected predictor variables included, the model discriminated adults with FEP from controls with high accuracy across all six countries (AUROC range = 0.84-0.86). Specificity (range = 73.9-78.0%) and sensitivity (range = 75.6-79.3%) were equally good, cumulatively indicating excellent model accuracy; however, the calibration slope for the diagnostic model showed some overfitting when applied specifically to participants from France, the UK, and The Netherlands.

Conclusions: The new FEP model achieved good discrimination and good calibration across six countries with different ethnic contributions, supporting its robustness and good generalizability.
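The validation scheme named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: it assumes a plain proximal-gradient fit for the L1-penalized (LASSO) logistic model, a leave-one-group-out loop standing in for internal-external cross-validation by country, and a fixed 0.5 probability threshold for sensitivity and specificity. The function names, the hyperparameters (`lam`, `lr`, `n_iter`), and the synthetic two-feature data with three placeholder groups are all hypothetical; the study itself used 22 predictors and six countries, and calibration is not covered here.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(w, t):
    # Proximal operator of the L1 penalty: shrink w toward zero by t.
    return math.copysign(max(abs(w) - t, 0.0), w)


def fit_lasso_logistic(X, y, lam=0.01, lr=0.1, n_iter=500):
    """L1-penalized logistic regression fitted by proximal gradient descent."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b  # the intercept is not penalized
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b


def predict_proba(model, X):
    w, b = model
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]


def auroc(y_true, scores):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def sens_spec(y_true, scores, threshold=0.5):
    """Sensitivity and specificity at a fixed probability threshold."""
    tp = sum(1 for s, t in zip(scores, y_true) if t == 1 and s >= threshold)
    tn = sum(1 for s, t in zip(scores, y_true) if t == 0 and s < threshold)
    n_pos = sum(y_true)
    return tp / n_pos, tn / (len(y_true) - n_pos)


def internal_external_cv(X, y, groups, **fit_kwargs):
    """Hold out each group (e.g. a country) in turn and train on the rest."""
    results = {}
    for held_out in sorted(set(groups)):
        train = [i for i, g in enumerate(groups) if g != held_out]
        test = [i for i, g in enumerate(groups) if g == held_out]
        model = fit_lasso_logistic(
            [X[i] for i in train], [y[i] for i in train], **fit_kwargs
        )
        scores = predict_proba(model, [X[i] for i in test])
        y_test = [y[i] for i in test]
        results[held_out] = (auroc(y_test, scores), *sens_spec(y_test, scores))
    return results


# Synthetic data (an assumption for illustration only): three placeholder
# groups, one informative feature and one pure-noise feature.
rng = random.Random(0)
X, y, groups = [], [], []
for country in ["A", "B", "C"]:
    for _ in range(60):
        label = rng.randint(0, 1)
        X.append([rng.gauss(1.5 * label, 1.0), rng.gauss(0.0, 1.0)])
        y.append(label)
        groups.append(country)

results = internal_external_cv(X, y, groups, lam=0.01, lr=0.5, n_iter=200)
for country, (auc, sens, spec) in results.items():
    print(f"{country}: AUROC={auc:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Each held-out group gets its own AUROC, sensitivity, and specificity, which mirrors how the abstract reports a range of values across countries rather than a single pooled figure. A calibration slope could be added by regressing the held-out outcomes on the logit of the predicted probabilities, with a slope below 1 suggesting overfitting.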

Original language: English
Article number: sgad008
Journal: Schizophrenia Bulletin Open
Issue number: 1
Publication status: Published - 1 Jan 2023


Keywords

  • cannabis use
  • diagnostic prediction modeling/risk prediction
  • psychosis/diagnostic factors
