Deep-Learning-Based Methods for Automatic Articulator and Levator Veli Palatini Segmentation and Motion Quantification in Magnetic Resonance Images of the Vocal Tract

Student thesis: Doctoral Thesis (Doctor of Philosophy)

Abstract

Articulators such as the soft palate play an essential role in speech production. Together with the levator veli palatini (LVP) muscle, the soft palate achieves velopharyngeal closure, a key requirement for the production of most speech sounds.

Velopharyngeal insufficiency (VPI) is an anatomical or structural defect that prevents velopharyngeal closure and consequently impairs speech. While several well-established surgical techniques for treating VPI exist, there is currently no consensus on which is most effective, and consequently a variety of techniques are used. In addition, treatment is not always successful, and further surgery is sometimes required.

In clinical assessments of speech, imaging is typically used to identify the defects preventing velopharyngeal closure and to inform the choice of treatment. While the most commonly used imaging techniques are currently videofluoroscopy and nasendoscopy, the use of magnetic resonance imaging (MRI) is increasing due to its unique ability to image the articulators dynamically during speech and to acquire detailed three-dimensional (3D) images of the LVP. In addition, there is increasing interest in extracting quantitative information about the vocal tract, articulators and LVP from the images. The work presented in this thesis makes several contributions towards addressing the unmet need for this quantitative information.

Segmentation of medical images is a common first step towards automatic measurement of anatomical features. In the work presented in this thesis, two deep-learning-based segmentation methods were developed and evaluated. One method segments the vocal tract, soft palate and four other relevant anatomical features in two-dimensional (2D) magnetic resonance (MR) images of speech. At the time of its publication, the method overcame the limitations of existing segmentation methods, which either segmented only the air-tissue boundaries between the vocal tract and adjacent tissues or fully segmented only the vocal tract. The other method segments the LVP and pharynx in 3D MR images of the vocal tract.
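The abstract does not specify the network architecture, so as a rough illustration only, a 2D multi-class segmentation network of the kind commonly used for this task might resemble the minimal U-Net-style sketch below. The depth, channel widths and class count (six anatomical regions plus background) are all assumptions, not the thesis design.

```python
# Minimal 2D U-Net-style segmentation sketch in PyTorch. Every design choice
# (depth, channel widths, 7 classes = 6 regions + background) is an
# illustrative assumption, not the architecture used in the thesis.
import torch
import torch.nn as nn


def conv_block(in_ch, out_ch):
    """Two 3x3 convolutions, each followed by batch norm and ReLU."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class SmallUNet(nn.Module):
    """Encoder-decoder with skip connections and a per-pixel class head."""

    def __init__(self, n_classes=7):
        super().__init__()
        self.enc1 = conv_block(1, 32)
        self.enc2 = conv_block(32, 64)
        self.bottleneck = conv_block(64, 128)
        self.pool = nn.MaxPool2d(2)
        self.up2 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec2 = conv_block(128, 64)
        self.up1 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)
        self.head = nn.Conv2d(32, n_classes, 1)

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)  # per-pixel class logits


# Example: one 256x256 single-channel MR frame -> per-class logits.
logits = SmallUNet()(torch.zeros(1, 1, 256, 256))
print(logits.shape)  # torch.Size([1, 7, 256, 256])
```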

In addition, a framework for quantifying articulator motion in 2D MR images of speech was developed and evaluated. This deep learning framework for nonlinear registration builds on the 2D image segmentation method by using knowledge of region boundaries as well as the images themselves to estimate displacement fields between 2D MR images of speech. The framework was compared with several state-of-the-art traditional registration methods and deep learning frameworks for nonlinear registration, and was found to estimate displacement fields that more accurately captured velopharyngeal closures.
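One common way to realise this kind of framework, sketched below as an assumption rather than the thesis method, is in the VoxelMorph style: a CNN takes the fixed and moving frames (here with their segmentation label maps stacked as extra input channels, reflecting the "region boundaries as well as images" idea) and predicts a dense displacement field, which a spatial transformer then uses to warp the moving frame.

```python
# Hedged sketch of deep-learning nonlinear 2D registration in the VoxelMorph
# style. The network and the use of segmentations as extra input channels are
# assumptions, not the framework described in the thesis.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DisplacementNet(nn.Module):
    """Predicts a dense 2D displacement field from a stacked image/label pair."""

    def __init__(self, in_ch=4):  # fixed + moving images and their label maps
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 2, 3, padding=1),  # (dx, dy) per pixel
        )

    def forward(self, fixed, moving, fixed_seg, moving_seg):
        return self.net(torch.cat([fixed, moving, fixed_seg, moving_seg], dim=1))


def warp(moving, disp):
    """Warp `moving` by the displacement field using a spatial transformer."""
    b, _, h, w = moving.shape
    # Identity sampling grid in [-1, 1] normalised coordinates.
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij"
    )
    grid = torch.stack([xs, ys], dim=-1).unsqueeze(0).expand(b, -1, -1, -1)
    # Convert pixel displacements to normalised offsets and resample.
    offset = torch.stack([disp[:, 0] * 2 / (w - 1), disp[:, 1] * 2 / (h - 1)], dim=-1)
    return F.grid_sample(moving, grid + offset, align_corners=True)


# Example: register one frame pair (images and one-channel label maps).
fixed = moving = torch.zeros(1, 1, 256, 256)
disp = DisplacementNet()(fixed, moving, fixed, moving)
warped = warp(moving, disp)
print(disp.shape, warped.shape)  # [1, 2, 256, 256] [1, 1, 256, 256]
```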

To enable the development and evaluation of the segmentation methods and the motion quantification framework, a new dataset of fifteen 3D MR images of the vocal tract was acquired, and ground-truth (GT) segmentations were created for it and for an existing dataset of 392 2D MR images of speech. Prior to acquiring the new dataset, an investigation was performed to identify the acquisition parameters that produced the optimal image contrast for LVP visualisation.

To be suitable for use in clinical speech assessment, a key requirement of segmentation and motion quantification methods is that they capture any velopharyngeal closures that occur. Since standard evaluation metrics do not provide such information, a novel metric based on velopharyngeal closure was developed to enable a more clinically relevant evaluation. In the comparison of motion quantification frameworks in particular, the metric revealed differences between the frameworks that standard metrics did not.
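The abstract does not give the metric's exact definition, so the following is only one plausible reading, with a hypothetical airway label and port region of interest: call a frame "closed" when no vocal-tract (airway) pixels remain inside the velopharyngeal port region, and report how often a method agrees with the ground truth about that binary state.

```python
# Illustrative closure-based evaluation metric. The closure test, airway label
# and port ROI are assumptions for demonstration, not the thesis definition.
import numpy as np


def is_closed(seg, airway_label, port_roi):
    """True if the port ROI contains no airway pixels in this segmentation."""
    rows, cols = port_roi
    return not np.any(seg[rows, cols] == airway_label)


def closure_agreement(pred_segs, gt_segs, airway_label=1,
                      port_roi=(slice(100, 140), slice(80, 120))):
    """Fraction of frames where predicted and GT closure states match."""
    matches = [
        is_closed(p, airway_label, port_roi) == is_closed(g, airway_label, port_roi)
        for p, g in zip(pred_segs, gt_segs)
    ]
    return float(np.mean(matches))


# Example: two 256x256 label maps per sequence; identical here, so agreement = 1.0.
frames = [np.zeros((256, 256), dtype=np.uint8) for _ in range(2)]
print(closure_agreement(frames, frames))
```

Unlike overlap scores such as Dice, which can be high even when a thin residual airway gap is mislabelled, a frame-level closure state directly reflects the clinically decisive event, which is presumably why such a metric can separate methods that standard metrics rank similarly.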

To conclude, while further work is required to fully address the unmet need for quantitative information about the vocal tract, soft palate and LVP in MR images, the work presented in this thesis has contributed towards addressing this need and has created several new opportunities to advance the ultimate goal of improving treatment outcomes for patients with VPI.
Date of Award: 1 Oct 2023
Original language: English
Awarding Institution: King's College London
Supervisors: Andrew King & Marc Miquel
