King's College London

Research portal

Deep learning-based monocular placental pose estimation: towards collaborative robotics in fetoscopy

Research output: Contribution to journal › Article

Mirza Awais Ahmad, Mouloud Ourak, Caspar Gruijthuijsen, Jan Deprest, Tom Vercauteren, Emmanuel Vander Poorten

Original language: English
Journal: International Journal of Computer Assisted Radiology and Surgery
Publication status: E-pub ahead of print - 30 Apr 2020



PURPOSE: Twin-to-twin transfusion syndrome (TTTS) is a placental defect occurring in monochorionic twin pregnancies. It is associated with high risks of fetal loss and perinatal death. Fetoscopic elective laser ablation (ELA) of placental anastomoses has been established as the most effective therapy for TTTS. Current tools and techniques face limitations in more complex ELA cases. Visualization of the entire placental surface and vascular equator, and maintenance of an adequate distance and a near-perpendicular angle between the laser fiber and the placental surface, are central to the effectiveness of laser ablation and to procedural success. Robot-assisted technology could address these challenges, offer enhanced dexterity, and ultimately improve the safety and effectiveness of the therapeutic procedure.

METHODS: This work proposes a 'minimal' robotic TTTS approach whereby, rather than deploying a large and expensive robotic system, a compact instrument is 'robotised' and endowed with 'robotic' skills so that operators can quickly and efficiently use it. The work reports on automatic placental pose estimation in fetoscopic images. This estimator forms a key building block of a proposed shared-control approach for semi-autonomous fetoscopy. A convolutional neural network (CNN) is trained to predict the relative orientation of the placental surface from a single monocular fetoscope camera image. To overcome the absence of real-life ground-truth placental pose data, and similar to other works in the literature (Handa et al. in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2016; Gaidon et al. in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2016; Vercauteren et al. in: Proceedings of the IEEE, 2019), the network is trained with data generated in a simulated environment and an in-silico phantom model. A limited set of coarsely, manually labeled samples from real interventions is added to the training dataset to improve domain adaptation.
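The paper does not disclose its network architecture, but the core idea of regressing the relative orientation of the placental surface from a single monocular frame can be sketched as a small CNN with a two-value regression head. Everything below (the class name, layer sizes, and the pitch/yaw output parameterization) is a hypothetical illustration, not the authors' model:

```python
import torch
import torch.nn as nn

class PlacentaPoseNet(nn.Module):
    """Hypothetical minimal CNN regressing the relative orientation
    (pitch, yaw in radians) of the placental surface from one RGB frame."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # global average pooling
        )
        self.head = nn.Linear(64, 2)  # regress (pitch, yaw)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

model = PlacentaPoseNet()
frame = torch.rand(1, 3, 128, 128)  # stand-in for a fetoscope frame
angles = model(frame)               # shape (1, 2): predicted (pitch, yaw)
```

Training such a regressor on rendered simulator frames (where ground-truth pose is known) with an L2 loss on the angle pair, then fine-tuning on the coarsely labeled in vivo samples, matches the domain-adaptation strategy described above.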

RESULTS: The trained network shows promising results on unseen samples from synthetic, phantom and in vivo patient data. The performance of the network for collaborative control purposes was evaluated in a virtual reality simulator in which the virtual flexible distal tip was autonomously controlled by the neural network.
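Closing the loop from the pose estimate to autonomous control of the flexible distal tip can be illustrated by a simple proportional controller that drives the predicted surface tilt toward zero (i.e., toward a perpendicular laser-fiber angle). The function below is a hypothetical sketch, not the controller used in the paper's virtual reality simulator:

```python
def align_tip(pitch, yaw, gain=0.5, tol=1e-3, max_steps=100):
    """Hypothetical proportional shared-control step: at each iteration the
    CNN's predicted tilt (pitch, yaw) is fed back as a bending command for
    the flexible distal tip, shrinking the misalignment geometrically."""
    for step in range(max_steps):
        if abs(pitch) < tol and abs(yaw) < tol:
            return step, pitch, yaw  # aligned within tolerance
        # command a tip bend proportional to the residual tilt
        pitch -= gain * pitch
        yaw -= gain * yaw
    return max_steps, pitch, yaw

steps, pitch, yaw = align_tip(0.3, -0.2)  # initial tilt in radians
```

With `gain=0.5` the residual tilt halves each step, so an initial 0.3 rad misalignment falls below the 1e-3 rad tolerance in a handful of iterations; in the real system, each iteration would use a fresh CNN prediction rather than this idealized error model.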

CONCLUSION: Improved alignment was established compared to manual operation in this setting, demonstrating the feasibility of incorporating a CNN-based estimator in a real-time shared-control scheme for fetoscopic applications.

