King's College London

Research portal

Equivalency of the diagnostic accuracy of the PHQ-8 and PHQ-9: A systematic review and individual participant data meta-analysis

Research output: Contribution to journal › Article

Yin Wu, Brooke Levis, Kira Riehm, Nazanin Saadat, Alexander Levis, Marleine Azar, Danielle Rice, Jill Boruff, Pim Cuijpers, Simon Gilbody, John Ioannidis, Lorie Kloda, Dean McMillan, Scott Patten, Ian Shrier, Ray Ziegelstein, Dickens Akena, Bruce Arroll, Liat Ayalon, Hamid Baradaran, Murray Baron, Charles Bombadier, Peter Butterworth, Gregory Carter, Marcos Chagas, Juliana Chan, Rushina Cholera, Yeates Conwell, Janneke de Man-van Ginkel, Jesse Fann, Felix Fischer, Daniel Fung, Bizu Gelaye, Felicity Goodyear-Smith, Catherine Greeno, Brian Hall, Patricia Harrison, Martin Harter, Ulrich Hegerl, Leanne Hides, Stevan Hobfall, Marie Hudson, Thomas Hyphantis, MD Inagaki, Nathalie Jette, Mohammed Khamseh, Kim Kiely, Yunxin Kwan, Femke Lamers, Shen-Ing Liu, Manote Lotrakul, Sonia Loureiro, Bernd Lowe, Anthony McGuire, Sherina Mohd-Sidik, Tiago Munhoz, Kumiko Muramatsu, Flavia Osorio, Vikram Patel, Brian Pence, Philippe Persoons, Angelo Picardi, Katrin Reuter, Alasdair Rooney, Ina Santos, Juwita Shaaban, Abbey Sidebottom, Adam Simning, MD Stafford, Sharon Sung, Pei Tan, Alyna Turner, Henk van Weert, Jennifer White, Mary Whooley, Kirsty Winkley-Bryant, Mitsuhiko Yamada, Andrea Benedetti, Brett Thombs

Original language: English
Pages (from-to): 1-13
Journal: Psychological Medicine
Early online date: 12 Jul 2019
Publication status: E-pub ahead of print - 12 Jul 2019

Item 9 of the Patient Health Questionnaire-9 (PHQ-9) asks about thoughts of death and self-harm, but not suicidality. Although it is sometimes used to assess suicide risk, most positive responses are not associated with suicidality. The PHQ-8, which omits Item 9, is thus increasingly used in research. We assessed the equivalency of the PHQ-8 and PHQ-9 in terms of total score correlation and diagnostic accuracy for detecting major depression.
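
The relationship between the two totals can be illustrated with a minimal scoring sketch. It assumes the standard PHQ scoring in which each of the nine items is rated 0 to 3; the function name `phq_totals` and the example responses are hypothetical and are not taken from the study.

```python
def phq_totals(items):
    """Return (PHQ-8 total, PHQ-9 total) from nine item scores.

    Assumes each item is an integer from 0 to 3, listed in order
    (Item 1 first, Item 9 last).
    """
    if len(items) != 9 or any(score not in (0, 1, 2, 3) for score in items):
        raise ValueError("expected nine item scores, each an integer from 0 to 3")
    phq9 = sum(items)
    # The PHQ-8 omits Item 9 (thoughts of death and self-harm).
    phq8 = phq9 - items[8]
    return phq8, phq9


# Hypothetical responses, for illustration only.
responses = [2, 1, 3, 2, 1, 0, 2, 1, 1]
phq8, phq9 = phq_totals(responses)
print(phq8, phq9)
```

Because the two totals differ only by the Item 9 score, they can differ by at most 3 points for any respondent.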

We conducted an individual participant data meta-analysis. We fit bivariate random-effects models to assess diagnostic accuracy.

A total of 16 742 participants (2097 major depression cases) from 54 studies were included. The correlation between PHQ-8 and PHQ-9 scores was 0.996 (95% confidence interval 0.996 to 0.996). The standard PHQ-9 cutoff score of 10 also maximized the sum of sensitivity and specificity for the PHQ-8 among studies that used a semi-structured diagnostic interview reference standard (N = 27). At cutoff 10, the PHQ-8 was less sensitive by 0.02 (−0.06 to 0.00) and more specific by 0.01 (0.00 to 0.01) among those studies, with similar results for studies that used other types of interviews (N = 27). For all 54 primary studies combined, across all cutoffs, the PHQ-8 was less sensitive than the PHQ-9 by 0.00 to 0.05 (0.03 at cutoff 10), and specificity was within 0.01 for all cutoffs (0.00 to 0.01).
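
For readers unfamiliar with the quantities being compared, the sketch below computes sensitivity and specificity at a cutoff for a single sample and picks the cutoff that maximizes their sum. It is a simplified illustration only: the study pooled results with bivariate random-effects models, which this code does not implement, and the function names and the eight-person data set are hypothetical.

```python
def sensitivity_specificity(scores, is_case, cutoff):
    """Sensitivity and specificity when 'score >= cutoff' counts as a positive screen.

    Assumes the sample contains at least one case and one non-case.
    """
    pairs = list(zip(scores, is_case))
    tp = sum(1 for s, c in pairs if c and s >= cutoff)
    fn = sum(1 for s, c in pairs if c and s < cutoff)
    tn = sum(1 for s, c in pairs if not c and s < cutoff)
    fp = sum(1 for s, c in pairs if not c and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(scores, is_case, cutoffs):
    """Return the first cutoff that maximizes sensitivity + specificity."""
    return max(cutoffs, key=lambda k: sum(sensitivity_specificity(scores, is_case, k)))


# Hypothetical data: totals for eight people and their reference-standard diagnosis.
phq9_scores = [4, 12, 10, 7, 15, 9, 11, 3]
phq8_scores = [4, 11, 9, 7, 14, 9, 10, 3]
is_case = [False, True, True, False, True, False, False, False]

sens9, spec9 = sensitivity_specificity(phq9_scores, is_case, 10)
sens8, spec8 = sensitivity_specificity(phq8_scores, is_case, 10)
print(f"PHQ-9 at cutoff 10: sensitivity={sens9:.2f}, specificity={spec9:.2f}")
print(f"PHQ-8 at cutoff 10: sensitivity={sens8:.2f}, specificity={spec8:.2f}")
print("Cutoff maximizing the sum for PHQ-9:", best_cutoff(phq9_scores, is_case, range(5, 16)))
```

In this toy sample, one case scores 10 on the PHQ-9 but 9 on the PHQ-8, so the PHQ-8 misses that case at cutoff 10. This is the mechanism by which dropping Item 9 can lower sensitivity slightly.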

PHQ-8 and PHQ-9 total scores were similar. Sensitivity may be minimally reduced with the PHQ-8, but specificity is similar.
