TY - JOUR
T1 - Generating Synthetic Labeled Data from Existing Anatomical Models
T2 - An Example with Echocardiography Segmentation
AU - Gilbert, Andrew
AU - Marciniak, Maciej
AU - Rodero, Cristobal
AU - Lamata, Pablo
AU - Samset, Eigil
AU - McLeod, Kristin
N1 - Funding Information:
This work was supported by the European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie Grant 764738. The work of Pablo Lamata was supported by the Wellcome Trust Senior Research Fellowship under Grant 209450/Z/17/Z.
N1 - Manuscript received November 24, 2020; revised January 3, 2021; accepted January 11, 2021. Date of publication January 14, 2021; date of current version September 30, 2021. (Corresponding author: Andrew Gilbert.) Andrew Gilbert and Eigil Samset are with GE Vingmed Ultrasound, GE Healthcare, 3183 Horten, Norway, and also with the Department of Informatics, University of Oslo, 0315 Oslo, Norway (e-mail: [email protected]; [email protected]).
Publisher Copyright:
© 1982-2012 IEEE.
PY - 2021/10/1
Y1 - 2021/10/1
AB - Deep learning can bring time savings and increased reproducibility to medical image analysis. However, acquiring training data is challenging due to the time-intensive nature of labeling and the high inter-observer variability of annotations. Rather than labeling images, in this work we propose an alternative pipeline in which images are generated from existing high-quality annotations using generative adversarial networks (GANs). Annotations are derived automatically from previously built anatomical models and are transformed into realistic synthetic ultrasound images with paired labels using a CycleGAN. We demonstrate the pipeline by generating synthetic 2D echocardiography images to compare with existing deep learning ultrasound segmentation datasets. A convolutional neural network is trained to segment the left ventricle and left atrium using only synthetic images. Networks trained with synthetic images were extensively tested on four different unseen datasets of real images, with median Dice scores of 91, 90, 88, and 87 for left ventricle segmentation. These results match or exceed inter-observer results measured on real ultrasound datasets and are comparable to those of a network trained on a separate set of real images. The results demonstrate that the generated images can effectively be used in place of real data for training. The proposed pipeline opens the door to the automatic generation of training data for many tasks in medical imaging, as the same process can be applied to other segmentation or landmark detection tasks in any modality. The source code and anatomical models are available to other researchers at https://adgilbert.github.io/data-generation/.
KW - Annotations
KW - Data Generation
KW - Echocardiography
KW - Generative Adversarial Networks
KW - Image segmentation
KW - Labeling
KW - Pipelines
KW - Segmentation
KW - Shape
KW - Synthesis
KW - Task analysis
KW - Ultrasonic imaging
UR - http://www.scopus.com/inward/record.url?scp=85099733592&partnerID=8YFLogxK
DO - 10.1109/TMI.2021.3051806
M3 - Article
AN - SCOPUS:85099733592
SN - 0278-0062
VL - 40
SP - 2783
EP - 2794
JO - IEEE Transactions on Medical Imaging
JF - IEEE Transactions on Medical Imaging
IS - 10
ER -