TY - JOUR
T1 - Sim2Real Transfer of Reinforcement Learning for Concentric Tube Robots
AU - Iyengar, Keshav
AU - Sadati, Hadi
AU - Bergeles, Christos
AU - Spurgeon, Sarah
AU - Stoyanov, Danail
N1 - Funding Information:
This work was supported in part by the Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS) at UCL under Grant 203145Z/16/Z, in part by EPSRC under Grants EP/P027938/1 and EP/R004080/1, in part by the Wellcome/EPSRC Centre for Medical Engineering at KCL under Grant WT 203148/Z/16/Z, in part by an ERC Starting Grant under Grant 714562, and in part by an NIHR Cardiovascular MIC Grant. The work of Danail Stoyanov was supported by the Royal Academy of Engineering Chair in Emerging Technologies and an EPSRC Early Career Research Fellowship under Grant EP/P012841/1.
Publisher Copyright:
© 2016 IEEE.
PY - 2023/10/1
Y1 - 2023/10/1
N2 - Concentric Tube Robots (CTRs) are promising for minimally invasive interventions due to their miniature diameter, high dexterity, and compliance with soft tissue. CTRs comprise individual pre-curved tubes, usually made of NiTi, that are arranged concentrically. As the tubes are rotated and translated relative to one another, the backbone elongates, twists, and bends with a dexterity that is advantageous in confined spaces. Tube interactions, unmodelled phenomena, and inaccurate tube parameter estimation make physical modeling of CTRs challenging, which in turn complicates kinematics and control. Deep reinforcement learning (RL) has been investigated as a solution. However, hardware validation has remained a challenge due to differences between the simulation and hardware domains. In this work, domain randomization is proposed as a strategy for transferring a policy trained on simulation-only data to hardware, with no additionally acquired physical training data. The differences between simulation and hardware forward kinematics accuracy and precision are characterized by errors of 14.74 ± 8.87 mm or 26.61 ± 17.00% of robot length. We show that the proposed domain randomization approach reduces mean errors by 56% compared to no domain randomization. Furthermore, we demonstrate path-following capability in hardware along a line path, with resulting errors of 4.37 ± 2.39 mm or 5.61 ± 3.11% of robot length.
AB - Concentric Tube Robots (CTRs) are promising for minimally invasive interventions due to their miniature diameter, high dexterity, and compliance with soft tissue. CTRs comprise individual pre-curved tubes, usually made of NiTi, that are arranged concentrically. As the tubes are rotated and translated relative to one another, the backbone elongates, twists, and bends with a dexterity that is advantageous in confined spaces. Tube interactions, unmodelled phenomena, and inaccurate tube parameter estimation make physical modeling of CTRs challenging, which in turn complicates kinematics and control. Deep reinforcement learning (RL) has been investigated as a solution. However, hardware validation has remained a challenge due to differences between the simulation and hardware domains. In this work, domain randomization is proposed as a strategy for transferring a policy trained on simulation-only data to hardware, with no additionally acquired physical training data. The differences between simulation and hardware forward kinematics accuracy and precision are characterized by errors of 14.74 ± 8.87 mm or 26.61 ± 17.00% of robot length. We show that the proposed domain randomization approach reduces mean errors by 56% compared to no domain randomization. Furthermore, we demonstrate path-following capability in hardware along a line path, with resulting errors of 4.37 ± 2.39 mm or 5.61 ± 3.11% of robot length.
UR - http://www.scopus.com/inward/record.url?scp=85167818888&partnerID=8YFLogxK
U2 - 10.1109/LRA.2023.3303714
DO - 10.1109/LRA.2023.3303714
M3 - Article
SN - 2377-3766
VL - 8
SP - 6147
EP - 6154
JO - IEEE Robotics and Automation Letters
JF - IEEE Robotics and Automation Letters
IS - 10
ER -