Since the rise of deep learning (DL) in the mid-2010s, DL-based methods have achieved state-of-the-art performance in cardiac magnetic resonance (CMR) image segmentation. Although these methods reach the level of inter-observer variability on several accuracy measures, visual inspection still reveals errors in most segmentation results, indicating a lack of reliability and robustness that can be critical if a model were to be deployed in clinical practice. In this work, we aim to bring attention to reliability and robustness, two unmet needs of cardiac image segmentation methods that are hampering their translation into practice. To this end, we first study the evolution of CMR segmentation accuracy, illustrate the improvements brought by DL algorithms, and highlight the symptoms of performance stagnation. Afterwards, we provide formal definitions of reliability and robustness. Based on these two definitions, we identify the factors that limit the reliability and robustness of state-of-the-art DL-based CMR segmentation techniques. Finally, we give an overview of current works that focus on improving the reliability and robustness of CMR segmentation, and we categorize them into two families of methods: quality control methods and model improvement techniques. The first category comprises simpler strategies that only aim to flag situations in which a model may lack reliability or robustness. The second category, instead, tackles the problem directly by improving different aspects of the CMR segmentation model development process. We hope to draw the attention of more researchers to these emerging trends in the development of reliable and robust CMR segmentation frameworks, which can guarantee the safe use of DL in clinical routines and studies.
- cardiac image segmentation
- cardiac magnetic resonance imaging
- deep learning
- reliability and robustness