Best practices in the real-world data life cycle

Joe Zhang, Joshua Symons, Paul Agapow, James T. Teo, Claire A Paxton, Jordan Abdi, Heather Mattie, Charlie Davie, Aracelis Z. Torres, Amos Folarin, Harpreet Sood, Leo A. Celi, John Halamka, Sara Eapen, Sanjay Budhdeo

Research output: Contribution to journal (Article, peer-review)


With increasing digitization of healthcare, real-world data (RWD) are available in greater quantity and scope than ever before. Since the 2016 United States 21st Century Cures Act, innovations in the RWD life cycle have taken tremendous strides forward, largely driven by demand for regulatory-grade real-world evidence from the biopharmaceutical sector. However, use cases for RWD continue to grow in number, moving beyond drug development, to population health and direct clinical applications pertinent to payors, providers, and health systems. Effective RWD utilization requires disparate data sources to be turned into high-quality datasets. To harness the potential of RWD for emerging use cases, providers and organizations must accelerate life cycle improvements that support this process. We build on examples obtained from the academic literature and author experience of data curation practices across a diverse range of sectors to describe a standardized RWD life cycle containing key steps in production of useful data for analysis and insights. We delineate best practices that will add value to current data pipelines. Seven themes are highlighted that ensure sustainability and scalability for RWD life cycles: data standards adherence, tailored quality assurance, data entry incentivization, deploying natural language processing, data platform solutions, RWD governance, and ensuring equity and representation in data.
Original language: English
Pages (from-to): 1-14
Number of pages: 14
Journal: PLOS Digital Health
Issue number: 1
Publication status: Published - 18 Jan 2022

