TY - CHAP
T1 - Can Segmentation Models Be Trained with Fully Synthetically Generated Data?
AU - Fernandez, Virginia
AU - Pinaya, Walter Hugo Lopez
AU - Borges, Pedro
AU - Tudosiu, Petru Daniel
AU - Graham, Mark S.
AU - Vercauteren, Tom
AU - Cardoso, M. Jorge
N1 - Publisher Copyright: © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2022
Y1 - 2022
AB - In order to achieve good performance and generalisability, medical image segmentation models should be trained on sizeable datasets with sufficient variability. Due to ethics and governance restrictions, and the costs associated with labelling data, scientific development is often stifled, with models trained and tested on limited data. Data augmentation is often used to artificially increase the variability in the data distribution and improve model generalisability. Recent works have explored deep generative models for image synthesis, as such an approach would enable the generation of an effectively infinite amount of varied data, addressing the generalisability and data access problems. However, many proposed solutions limit the user’s control over what is generated. In this work, we propose brainSPADE, a model which combines a synthetic diffusion-based label generator with a semantic image generator. Our model can produce fully synthetic brain labels on-demand, with or without pathology of interest, and then generate a corresponding MRI image of an arbitrary guided style. Experiments show that brainSPADE synthetic data can be used to train segmentation models with performance comparable to that of models trained on real data.
UR - http://www.scopus.com/inward/record.url?scp=85140443355&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-16980-9_8
DO - 10.1007/978-3-031-16980-9_8
M3 - Conference paper
AN - SCOPUS:85140443355
SN - 9783031169793
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 79
EP - 90
BT - Simulation and Synthesis in Medical Imaging - 7th International Workshop, SASHIMI 2022, Held in Conjunction with MICCAI 2022, Proceedings
A2 - Zhao, Can
A2 - Svoboda, David
A2 - Wolterink, Jelmer M.
A2 - Escobar, Maria
PB - Springer Science and Business Media Deutschland GmbH
T2 - 7th International Workshop on Simulation and Synthesis in Medical Imaging, SASHIMI 2022, held in conjunction with 25th International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2022
Y2 - 18 September 2022
ER -