Abstract
Objective
Dynamic magnetic resonance (MR) imaging enables visualisation of articulators during speech. There is growing interest in quantifying articulator motion in two-dimensional MR images of the vocal tract, to better understand speech production and potentially inform patient management decisions. Image registration is an established way to achieve this quantification. Recently, segmentation-informed deformable registration frameworks have been developed and have achieved state-of-the-art accuracy. This work aims to adapt such a framework and optimise it for estimating displacement fields between dynamic two-dimensional MR images of the vocal tract during speech.
Methods
A deep-learning-based registration framework was developed and compared with current state-of-the-art registration methods and frameworks (two traditional methods and three deep-learning-based frameworks, two of which are segmentation informed). The accuracy of the methods and frameworks was evaluated using the Dice coefficient (DSC), average surface distance (ASD) and a metric based on velopharyngeal closure. The metric evaluated if the fields captured a clinically relevant and quantifiable aspect of articulator motion.
Results
The segmentation-informed frameworks achieved higher DSCs and lower ASDs and captured more velopharyngeal closures than the traditional methods and the framework that was not segmentation informed. All segmentation-informed frameworks achieved similar DSCs and ASDs. However, the proposed framework captured the most velopharyngeal closures.
Conclusions
A framework was successfully developed and found to more accurately estimate articulator motion than five current state-of-the-art methods and frameworks.
Significance
The first deep-learning-based framework specifically for registering dynamic two-dimensional MR images of the vocal tract during speech has been developed and evaluated.
Dynamic magnetic resonance (MR) imaging enables visualisation of articulators during speech. There is growing interest in quantifying articulator motion in two-dimensional MR images of the vocal tract, to better understand speech production and potentially inform patient management decisions. Image registration is an established way to achieve this quantification. Recently, segmentation-informed deformable registration frameworks have been developed and have achieved state-of-the-art accuracy. This work aims to adapt such a framework and optimise it for estimating displacement fields between dynamic two-dimensional MR images of the vocal tract during speech.
Methods
A deep-learning-based registration framework was developed and compared with current state-of-the-art registration methods and frameworks (two traditional methods and three deep-learning-based frameworks, two of which are segmentation informed). The accuracy of the methods and frameworks was evaluated using the Dice coefficient (DSC), average surface distance (ASD) and a metric based on velopharyngeal closure. The metric evaluated if the fields captured a clinically relevant and quantifiable aspect of articulator motion.
Results
The segmentation-informed frameworks achieved higher DSCs and lower ASDs and captured more velopharyngeal closures than the traditional methods and the framework that was not segmentation informed. All segmentation-informed frameworks achieved similar DSCs and ASDs. However, the proposed framework captured the most velopharyngeal closures.
Conclusions
A framework was successfully developed and found to more accurately estimate articulator motion than five current state-of-the-art methods and frameworks.
Significance
The first deep-learning-based framework specifically for registering dynamic two-dimensional MR images of the vocal tract during speech has been developed and evaluated.
Original language | English |
---|---|
Article number | 104290 |
Journal | Biomedical Signal Processing and Control |
Volume | 80 |
Issue number | 1 |
DOIs | |
Publication status | Published - Feb 2023 |