OBJECTIVES: To clarify real-world linguistic nuances around dying in hospital, as well as inaccuracy in individual-level prognostication, to support advance care planning and personalised discussions on limitation of life-sustaining treatment (LST).

DESIGN: Retrospective cross-sectional study of real-world clinical data.

SETTING: Secondary care, urban and suburban teaching hospitals.

PARTICIPANTS: All inpatients in the 12-month period from 1 October 2018 to 30 September 2019.

METHODS: Unsupervised natural language processing was used: word embeddings in latent space generated clusters of the phrases semantically most similar to 'Ceiling of Treatment', and the prognostic value of these phrases was examined.
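As an illustration of the similarity step, the following is a minimal sketch of ranking phrases by cosine similarity to a query phrase. It assumes cosine similarity as the measure, which the abstract does not state; the function names and the three-dimensional toy vectors are invented for illustration and are not the study's model or data.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar(query, embeddings, top_n=3):
    """Return the top_n phrases closest to `query`, excluding the query itself."""
    q = embeddings[query]
    scored = [
        (phrase, cosine_similarity(q, vec))
        for phrase, vec in embeddings.items()
        if phrase != query
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]

# Toy vectors, invented for illustration only (real embeddings have
# hundreds of dimensions and are learned from clinical text).
toy_embeddings = {
    "ceiling_of_treatment": [0.9, 0.1, 0.0],
    "ceiling_of_care": [0.8, 0.2, 0.1],
    "end_of_life_care": [0.7, 0.3, 0.2],
    "physiotherapy_review": [0.0, 0.1, 0.9],
}

neighbours = most_similar("ceiling_of_treatment", toy_embeddings, top_n=2)
for phrase, score in neighbours:
    print(f"{phrase}: {score:.3f}")
```

In this toy example the two care-limitation phrases rank above the unrelated phrase because their vectors point in nearly the same direction as the query.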

RESULTS: Word embeddings with most similarity to 'Ceiling of Treatment' clustered around phrases describing end-of-life care, ceiling of care and LST discussions. The phrases have differing prognostic profiles, with the highest 7-day mortality in the phrases most explicitly referring to end of life: 'Withdrawal of care' (56.7%), 'terminal care/end of life care' (57.5%) and 'un-survivable' (57.6%).
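A per-phrase 7-day mortality figure of this kind can be sketched as a simple proportion. The example below assumes each record pairs a phrase with the number of days from its documentation to death (or None if the patient did not die); the abstract does not specify how the 7-day window was anchored, and the records are invented for illustration, not study data.

```python
from collections import defaultdict

def seven_day_mortality(records):
    """Proportion of records per phrase where death occurred within 7 days.

    `records` is an iterable of (phrase, days_to_death) pairs, where
    days_to_death is None when no death was recorded (assumed layout).
    """
    counts = defaultdict(lambda: [0, 0])  # phrase -> [deaths within 7 days, total]
    for phrase, days_to_death in records:
        counts[phrase][1] += 1
        if days_to_death is not None and days_to_death <= 7:
            counts[phrase][0] += 1
    return {phrase: deaths / total for phrase, (deaths, total) in counts.items()}

# Invented toy records, for illustration only.
toy_records = [
    ("withdrawal of care", 2),
    ("withdrawal of care", None),
    ("ceiling of care", 30),
    ("ceiling of care", 5),
    ("ceiling of care", None),
    ("ceiling of care", None),
]

rates = seven_day_mortality(toy_records)
for phrase, rate in rates.items():
    print(f"{phrase}: {rate:.1%}")
```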

CONCLUSION: Vocabulary used in end-of-life discussions is diverse and has a range of associations with 7-day mortality. This highlights the importance of correct application of terminology during LST and end-of-life discussions.

Original language: English
Journal: BMJ Health & Care Informatics
Issue number: 1
Publication status: Published - Oct 2021


  • Cross-Sectional Studies
  • Death
  • Delivery of Health Care/statistics & numerical data
  • Humans
  • Natural Language Processing
  • Retrospective Studies


Dive into the research topics of 'Natural language word embeddings as a glimpse into healthcare language and associated mortality surrounding end of life'. Together they form a unique fingerprint.
