Deep forecasting of translational impact in medical research

Amy P.K. Nelson*, Robert J. Gray, James K. Ruffle, Henry C. Watkins, Daniel Herron, Nick Sorros, Danil Mikhailov, M. Jorge Cardoso, Sebastien Ourselin, Nick McNally, Bryan Williams, Geraint E. Rees, Parashkev Nachev

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

7 Citations (Scopus)


The value of biomedical research, a $1.7 trillion annual investment, is ultimately determined by its downstream, real-world impact, whose predictability from simple citation metrics remains unquantified. Here we sought to determine the comparative predictability of future real-world translation (as indexed by inclusion in patents, guidelines, or policy documents) from complex models of title/abstract-level content versus citations and metadata alone. We quantify predictive performance out of sample, ahead of time, across major domains, using the entire corpus of biomedical research captured by Microsoft Academic Graph from 1990–2019, encompassing 43.3 million papers. We show that citations are only moderately predictive of translational impact. In contrast, high-dimensional models of titles, abstracts, and metadata exhibit high fidelity (area under the receiver operating characteristic curve [AUROC] > 0.9), generalize across time and domain, and transfer to recognizing papers of Nobel laureates. We argue that content-based impact models are superior to conventional, citation-based measures and sustain a stronger evidence-based claim to the objective measurement of translational potential.
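For readers unfamiliar with the AUROC figure quoted in the abstract, the following is a minimal sketch of how the metric can be computed for any binary prediction task. It is not the authors' evaluation code. The function name `auroc` and the toy labels and scores are illustrative assumptions, not data from the study.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative example,
    with ties counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = paper later cited in a patent, guideline,
# or policy document; 0 = not. Scores are made-up model outputs.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.2, 0.4, 0.1, 0.6, 0.7]

print(f"AUROC on toy data: {auroc(toy_labels, toy_scores):.3f}")
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of positives from negatives. The pairwise form above is quadratic in the number of examples, so a rank-based implementation would be preferable at the scale of a multi-million-paper corpus.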

Original language: English
Article number: 100483
Issue number: 5
Publication status: Published - 13 May 2022


  • deep learning
  • DSML3: Development/Pre-production: Data science output has been rolled out/validated across multiple domains/problems
  • natural language processing
  • representation learning
  • research impact
  • translational research

