Abstract

Deep Learning technologies are creating a revolution in the analysis of medical images. We are getting closer to the vision of a fully automated and reliable characterization of disease, one that both reproduces the expertise that the radiologist is trained for and proposes new metrics, revealing patterns in the data that lie beyond the perceptual limitations of human beings. In this context, this Ph.D. thesis contributes towards that vision with specific solutions for both the spatial and temporal analysis of two of the most prevalent medical imaging modalities in cardiology: Ultrasound (US) and Magnetic Resonance Imaging (MRI).
The dissertation opens with a formal mathematical definition of the basic concepts of the most successful deep learning technology for the analysis of medical images, the Convolutional Neural Network (CNN), and in particular of the back-propagation algorithm used for its training. The contributions of the Ph.D. thesis are essentially based on the design and application of novel CNN architectures. The first research chapter and scientific publication propose a new deep learning model to extract the fetal aortic signal from a US video sequence. The architecture consists of three fundamental blocks: a convolutional layer for the extraction of imaging features, a Convolutional Gated Recurrent Unit (C-GRU) for exploiting the temporal redundancy of the signal, and a novel regularized loss function, called CyclicLoss (CL). The proposed method achieves an accuracy far superior to the state of the art, reducing the average Mean Square Error (MSE) from 0.31 mm² to 0.09 mm², at an execution speed of 289 frames per second.
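To make the idea of a regularized loss that exploits temporal redundancy concrete, here is a minimal sketch. It assumes the cyclic term penalizes discrepancies between predictions one cardiac cycle apart, combined with a standard MSE data term; the exact formulation, the `period` input, and the weight `lam` are illustrative assumptions, not the thesis's definition.

```python
import numpy as np

def cyclic_loss(preds: np.ndarray, period: int) -> float:
    """Illustrative cyclic regularizer: penalize differences between
    per-frame predictions separated by one cardiac cycle (`period`
    frames), exploiting the quasi-periodicity of the cardiac signal.
    NOTE: an assumed formulation, not the thesis's exact CyclicLoss."""
    return float(np.mean((preds[:-period] - preds[period:]) ** 2))

def total_loss(preds: np.ndarray, targets: np.ndarray,
               period: int, lam: float = 0.1) -> float:
    # MSE data term plus the cyclic regularizer, weighted by lam
    mse = float(np.mean((preds - targets) ** 2))
    return mse + lam * cyclic_loss(preds, period)
```

On a perfectly periodic sequence of diameter estimates the cyclic term vanishes, so the regularizer only activates when consecutive cycles disagree.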
The rest of the Ph.D. work focuses on the analysis of cardiac MRI. The second research chapter, and second scientific contribution, proposes a solution for the segmentation of the left ventricle (LV) from CINE MRI images. The main idea is to learn from images acquired throughout the entire cardiac cycle, instead of from keyframes alone. The workflow consists of three components: first, an automated localization and subsequent cropping of the bounding box containing the cardiac silhouette; second, the identification of the LV contours using a Temporal Fully Convolutional Neural Network (T-FCNN), which extends Fully Convolutional Neural Networks (FCNN) with a recurrent mechanism enforcing temporal coherence across consecutive frames; and finally, a further refinement of the boundaries using one of two components, fully-connected Conditional Random Fields (CRFs) with Gaussian edge potentials or Semantic Flow. Our initial experiments suggest that a significant improvement in performance (a 30% reduction in error metrics) can potentially be achieved by using a recurrent neural network component that explicitly learns cardiac motion patterns whilst performing LV segmentation.
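The recurrent mechanism enforcing temporal coherence can be sketched as a convolutional GRU update applied to per-frame feature maps. For brevity the spatial convolutions are replaced here by 1×1 (per-pixel) linear maps; a real C-GRU uses spatial kernels, and the weight shapes below are assumptions for illustration only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv_gru_step(h_prev, x, Wz, Uz, Wr, Ur, Wh, Uh):
    """One step of a (heavily simplified) convolutional GRU.
    h_prev, x: (C, H, W) feature maps; W*, U*: (C, C) weights
    acting as 1x1 convolutions -- a stand-in for spatial kernels."""
    lin = lambda W, t: np.einsum('oc,chw->ohw', W, t)
    z = sigmoid(lin(Wz, x) + lin(Uz, h_prev))        # update gate
    r = sigmoid(lin(Wr, x) + lin(Ur, h_prev))        # reset gate
    h_tilde = np.tanh(lin(Wh, x) + lin(Uh, r * h_prev))
    return (1 - z) * h_prev + z * h_tilde            # new hidden state
```

Running this step over the frames of a cardiac cycle carries the hidden state forward, which is what lets the segmentation of one frame inform the next.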
The next chapter and scientific publication propose an architecture called the Volumetric Fully Convolutional Neural Network (V-FCNN), with the aim of capturing the entire spatial anatomy of the atria in high-resolution MRI. The V-FCNN is able to process eighty-eight slices in one shot on available GPUs, thereby integrating the spatial redundancy through 3D kernels. Learning outcomes are maximized with a loss function combining MSE and Dice Loss (DL), in order to both capture the bulk shape and reduce over-segmentation, while training speed and convergence are also improved by the removal of the skip-paths. The method achieves a Dice Index of 92.5 in the atrial segmentation task. Finally, the last contribution and publication propose a new network called the Region Of Interest Generative Adversarial Network (ROI-GAN), tested on the problem of Right Ventricle (RV) segmentation from MRI. In this context, the work first investigates the optimal combination of three concepts (the C-GRU, Generative Adversarial Networks (GAN), and the L1 loss function), achieving an improvement of 0.05 in DL and 3.49 mm in Hausdorff Distance compared to the baseline FCNN. This improvement is then doubled by the ROI-GAN, which sets two GANs to cooperate at two fields of view of the image: its full resolution and the region of interest (ROI). The rationale is to better guide the FCNN learning by combining global (full-resolution) and local (ROI) features. The study is conducted on a large in-house dataset of 23,000 segmented MRI slices, and its generality is verified on a publicly available dataset.
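The combination of an MSE term (for bulk shape) with a soft Dice term (against over-segmentation) can be sketched as follows. The weighting `alpha` and the soft-Dice formulation are illustrative assumptions; the thesis does not state its exact weighting here.

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss on probability maps in [0, 1]: 1 minus the
    overlap ratio, so it penalizes spurious (over-segmented) voxels."""
    inter = np.sum(pred * target)
    return 1.0 - (2.0 * inter + eps) / (np.sum(pred) + np.sum(target) + eps)

def combined_loss(pred, target, alpha=0.5):
    # Weighted sum of MSE (bulk shape) and Dice (overlap) terms;
    # alpha = 0.5 is an assumed weighting, not taken from the thesis.
    mse = np.mean((pred - target) ** 2)
    return alpha * mse + (1.0 - alpha) * dice_loss(pred, target)
```

A perfect prediction drives both terms to zero, while a prediction that labels extra background voxels is penalized by the Dice term even when the MSE contribution is small.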
Date of Award: 1 Jul 2019
Supervisors: Pablo Lamata de la Orden, Giovanni Montana & Kawal Rhode