King's College London

Avoiding big data pitfalls

Research output: Contribution to journal, Article, peer-reviewed

Original language: English
Pages (from-to): 33-35
Number of pages: 3
Journal: Heart and Metabolism
Issue number: 82


  • Lamata_Avoiding Big Data Pitfalls

    Lamata_Avoiding_Big_Data_Pitfalls.pdf, 50.6 KB, application/pdf

    Uploaded date: 15 Apr 2021

    Version: Final published version

    Licence: CC BY

    Journal closed down; right to publish granted

Clinical decisions are based on a combination of inductive inference built on experience (ie, statistical models) and deductions drawn from our understanding of the workings of the cardiovascular system (ie, mechanistic models). In a similar way, computers can be used to discover hidden patterns in (big) data and to make predictions based on our knowledge of physiology or physics. Surprisingly, unlike humans throughout history, computers seldom combine inductive and deductive processes. An explosion of expectations surrounds the computer's inductive method, fueled by "big data" and other popular trends. This article reviews the risks and potential pitfalls of this inductive approach, in which lack of generality, selection or confounding bias, overfitting, and spurious correlations are among the common flaws. Recommendations to reduce these risks include examining data through the lens of causality, choosing and describing statistical techniques carefully, and fostering an open research culture built on transparency. Finally, the synergy between mechanistic and statistical models (ie, the digital twin) is discussed as a promising pathway toward precision cardiology that mimics the human experience.
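
The spurious-correlation pitfall named in the abstract can be illustrated with a small simulation. The sketch below is not taken from the article: the data are synthetic Gaussian noise, and the function names (`pearson`, `strongest_chance_correlation`) and the cohort and feature sizes are assumptions chosen for illustration. It suggests how, when many unrelated variables are screened against an outcome in a small cohort, some of them are likely to look strongly associated by chance alone.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def strongest_chance_correlation(n_patients, n_features, seed=0):
    """Largest |r| between a random outcome and purely random features.

    Hypothetical helper: every feature is independent noise, so any
    correlation it reports is spurious by construction.
    """
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_patients)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_patients)]
        best = max(best, abs(pearson(feature, outcome)))
    return best


# Same synthetic cohort (same seed), screened against few vs. many features.
few = strongest_chance_correlation(n_patients=30, n_features=5)
many = strongest_chance_correlation(n_patients=30, n_features=2000)
print(f"5 random features:    max |r| = {few:.2f}")
print(f"2000 random features: max |r| = {many:.2f}")
```

Because both runs share a seed, the five features of the first run are also the first five of the second, so the strongest correlation can only grow as more noise variables are screened. Examining candidate associations for a plausible causal mechanism, as the abstract recommends, is one way to guard against reading meaning into such results.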
