Left ventricular (LV) mass and volume measurements from cine cardiovascular magnetic resonance (CMR) have been shown to be highly reproducible within single centers. However, the extent to which contours vary from center to center because of different training protocols is unknown. We aimed to quantify sources of variation between centers and to provide a multi-center consensus ground truth dataset for benchmarking automated processing tools and for training new readers in CMR analysis.
Seven independent expert readers, representing seven experienced CMR core laboratories, analyzed fifteen cine CMR data sets in accordance with their standard operating protocols and Society for Cardiovascular Magnetic Resonance (SCMR) guidelines. Consensus contours were generated for each image using a statistical optimization scheme that maximized agreement in contour placement between readers.
Reader-consensus agreement was better than inter-reader agreement (end-diastolic volume 14.7 ml vs 15.2–28.4 ml; end-systolic volume 13.2 ml vs 14.0–21.5 ml; LV mass 17.5 g vs 20.2–34.5 g; ejection fraction 4.2 % vs 4.6–7.5 %). Compared with consensus contours, readers were very consistent (small variability across cases within each reader), but bias varied between readers due to differences in contouring protocols at each center. Although larger contour differences were found at the apex and base, the main effect on volume was due to small but consistent differences in the position of the contours in all regions of the LV.
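The distinction between a reader's bias and a reader's consistency can be illustrated with a minimal sketch. It assumes that bias is summarized as the mean of the paired reader-minus-consensus differences and consistency as their sample standard deviation; the abstract does not state the exact statistics used, so this is an illustration rather than the study's method. The function name `bias_and_spread` and all values are hypothetical.

```python
from statistics import mean, stdev


def bias_and_spread(reader, consensus):
    """Return (bias, sd) of paired reader-minus-consensus differences.

    bias: mean difference (systematic offset of this reader).
    sd:   sample standard deviation of the differences (case-to-case
          variability of this reader around that offset).
    """
    if len(reader) != len(consensus) or len(reader) < 2:
        raise ValueError("need two equal-length sequences with at least 2 cases")
    diffs = [r - c for r, c in zip(reader, consensus)]
    return mean(diffs), stdev(diffs)


# Hypothetical end-diastolic volumes in ml (illustrative only, not study data).
consensus_edv = [150.0, 180.0, 120.0, 200.0]
reader_a_edv = [160.0, 191.0, 129.0, 210.0]  # differences: 10, 11, 9, 10

bias, sd = bias_and_spread(reader_a_edv, consensus_edv)
print(f"bias = {bias:.1f} ml, sd = {sd:.2f} ml")
```

In this made-up case the reader is consistent (small spread) but carries a systematic offset of about 10 ml, which is the pattern described above.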
A multi-center consensus dataset was established for the purposes of benchmarking and training. Collating multi-center results requires either consensus on the contour-drawing protocol between centers before analysis or bias correction after analysis.
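One simple form that post-analysis bias correction could take is sketched below. It assumes a constant additive offset per center, estimated on cases that have consensus values and then subtracted from that center's other measurements. The abstract does not specify a correction method, so the additive model, the function names `estimate_bias` and `apply_correction`, and all values are hypothetical.

```python
from statistics import mean


def estimate_bias(center_values, consensus_values):
    """Estimate a center's additive bias from paired calibration cases."""
    if len(center_values) != len(consensus_values) or not center_values:
        raise ValueError("need two non-empty sequences of equal length")
    return mean(c - g for c, g in zip(center_values, consensus_values))


def apply_correction(values, bias):
    """Subtract the estimated bias from each measurement."""
    return [v - bias for v in values]


# Hypothetical LV mass values in g (illustrative only, not study data).
calibration_center = [160.0, 191.0, 129.0, 210.0]
calibration_consensus = [150.0, 180.0, 120.0, 200.0]

center_bias = estimate_bias(calibration_center, calibration_consensus)

# New measurements from the same center, corrected before pooling.
new_measurements = [175.0, 140.0]
corrected = apply_correction(new_measurements, center_bias)
print(center_bias, corrected)
```

A constant offset is the simplest choice; a bias that scales with ventricular size would need a different model, such as a regression on the consensus values.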