King's College London

Audio-driven Robot Upper-body Motion Synthesis

Research output: Contribution to journal (Article)

Jan Ondras, Oya Celiktutan, Paul Bremner, Hatice Gunes

Original language: English
Journal: IEEE Transactions on Cybernetics
Early online date: 10 Feb 2020
Publication status: E-pub ahead of print - 10 Feb 2020


Body language is an important aspect of human communication, which an effective human-robot interaction interface should mimic well. Currently available robotic platforms are limited in their ability to automatically generate behaviours that align with their speech. In this paper, we developed a neural-network-based system that takes audio from a user as input and reproduces the user's upper-body gestures, including head, hand and hip movements, on a humanoid robot, namely Softbank Robotics’ Pepper. The system was evaluated both quantitatively and qualitatively, using web surveys, when driven by natural speech and by synthetic speech. In particular, we compared the impact of generic and person-specific neural network models on the quality of the synthesised movements. We further investigated the relationships between the quantitative and qualitative evaluations and examined how the speaker’s personality traits affect the synthesised movements.
