Knock-Knock: Acoustic Object Recognition by using Stacked Denoising Autoencoders

Research output: Contribution to journal › Article › peer-review



This paper presents a successful application of deep learning to object recognition based on acoustic data. Handcrafted features can restrict the capability of a representation to serve different applications and may capture only characteristics that are insignificant for a given task. In contrast, multilayer (deep) architectures require no predefined feature representation format: features can be learned from raw sensor data without defining discriminative characteristics a priori. In this paper, stacked denoising autoencoders are applied to train a deep learning model. Thirty different objects were classified in our experiment, and each object was knocked 120 times with a marker pen to obtain the auditory data. With the proposed deep learning framework, an accuracy of 91.50% was achieved. The traditional method, using handcrafted features with a shallow classifier, was taken as a benchmark and attained a recognition rate of only 58.22%. Interestingly, a recognition rate of 82.00% was achieved when using a shallow classifier with raw acoustic data as input. Nevertheless, classifying one object with the deep learning model took far less time (6.77 times faster) than with the shallow classifier on raw data. We also explored how different model parameters of the deep architecture affect recognition performance.
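As a rough illustration of the technique named in the abstract, the following is a minimal NumPy sketch of greedy layer-wise training of stacked denoising autoencoders. It is not the authors' implementation: the abstract does not state the network sizes, the noise type, or the training settings, so the masking noise, tied weights, squared-error loss, and all hyperparameters below are assumptions, and the function names `train_dae_layer` and `train_stack` are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_dae_layer(X, n_hidden, corruption=0.3, lr=0.1, epochs=50, seed=0):
    """Train one denoising autoencoder layer (tied weights, squared error).

    All hyperparameters are illustrative assumptions, not values from the paper.
    Returns the encoder weights, the hidden bias, and the per-epoch loss.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_visible = X.shape
    W = rng.normal(0.0, 0.1, size=(n_visible, n_hidden))
    b_h = np.zeros(n_hidden)
    b_v = np.zeros(n_visible)
    losses = []
    for _ in range(epochs):
        # Masking noise: zero out a random fraction of the inputs.
        mask = rng.random(X.shape) >= corruption
        X_noisy = X * mask
        H = sigmoid(X_noisy @ W + b_h)        # encode the corrupted input
        X_rec = sigmoid(H @ W.T + b_v)        # decode with the transposed weights
        err = X_rec - X                       # compare against the CLEAN input
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the squared error through decoder and encoder.
        d_rec = err * X_rec * (1.0 - X_rec)
        d_hid = (d_rec @ W) * H * (1.0 - H)
        grad_W = (X_noisy.T @ d_hid + d_rec.T @ H) / n_samples
        W -= lr * grad_W
        b_h -= lr * d_hid.mean(axis=0)
        b_v -= lr * d_rec.mean(axis=0)
    return W, b_h, losses

def train_stack(X, layer_sizes, **kwargs):
    """Greedy layer-wise pretraining: each layer learns from the previous layer's codes."""
    layers, features = [], X
    for n_hidden in layer_sizes:
        W, b_h, _ = train_dae_layer(features, n_hidden, **kwargs)
        layers.append((W, b_h))
        features = sigmoid(features @ W + b_h)  # clean encoding feeds the next layer
    return layers, features

if __name__ == "__main__":
    # Synthetic stand-in for raw acoustic frames scaled to [0, 0.2];
    # the real data set would hold 30 objects x 120 knocks = 3600 recordings.
    X = 0.2 * np.random.default_rng(1).random((200, 64))
    layers, codes = train_stack(X, layer_sizes=[32, 16])
    print(codes.shape)
```

In a full pipeline, the learned codes would then feed a supervised classifier over the thirty object classes, usually followed by fine-tuning of the whole network; that stage is omitted here.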
Original language: English
Early online date: 11 Mar 2017
Publication status: E-pub ahead of print, 11 Mar 2017


  • Object recognition
  • Deep networks
  • Acoustic data analysis

