Abstract
This paper presents a successful application of deep learning for object recognition based on acoustic data. Handcrafted features can restrict the representation's ability to serve different applications and may capture only characteristics that are insignificant for the task at hand. In contrast, multilayer/deep architecture methods do not require the feature representation format to be defined in advance; features can be learned from raw sensor data without specifying discriminative characteristics a priori. In this paper, stacked denoising autoencoders are applied to train a deep learning model. Thirty different objects were classified in our experiment, and each object was knocked 120 times with a marker pen to obtain the auditory data. By employing the proposed deep learning framework, a high accuracy of 91.50% was achieved. The traditional method using handcrafted features with a shallow classifier was taken as a benchmark, and the attained recognition rate was only 58.22%. Interestingly, a recognition rate of 82.00% was achieved when using a shallow classifier with raw acoustic data as input. Nevertheless, the time taken to classify one object with deep learning was far less (6.77 times faster) than with this method. We also explored how different model parameters of the deep architecture affect recognition performance.
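The abstract describes stacked denoising autoencoders learning representations directly from raw acoustic input. Below is a minimal NumPy sketch of that idea, assuming single-sample SGD, masking-noise corruption, tied weights, and synthetic placeholder data; the layer sizes, corruption level, and learning rate are illustrative choices, not the configuration reported in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class DenoisingAutoencoder:
    """One layer of a stacked denoising autoencoder with tied weights.

    Layer sizes, corruption level and learning rate are placeholder values
    for illustration, not the settings used in the paper.
    """

    def __init__(self, n_visible, n_hidden, corruption=0.3, lr=0.1):
        self.W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))
        self.b_hid = np.zeros(n_hidden)
        self.b_vis = np.zeros(n_visible)
        self.corruption = corruption
        self.lr = lr

    def encode(self, x):
        return sigmoid(x @ self.W + self.b_hid)

    def decode(self, h):
        return sigmoid(h @ self.W.T + self.b_vis)

    def train_step(self, x):
        # Corrupt the input with masking noise, then reconstruct the clean x.
        x_tilde = x * (rng.random(x.shape) > self.corruption)
        h = self.encode(x_tilde)
        x_hat = self.decode(h)

        # Gradients of the squared reconstruction error through the sigmoids.
        d_vis = (x_hat - x) * x_hat * (1.0 - x_hat)
        d_hid = (d_vis @ self.W) * h * (1.0 - h)
        grad_W = np.outer(x_tilde, d_hid) + np.outer(d_vis, h)

        # Plain SGD update.
        self.W -= self.lr * grad_W
        self.b_hid -= self.lr * d_hid
        self.b_vis -= self.lr * d_vis
        return 0.5 * np.sum((x_hat - x) ** 2)


# Toy usage on synthetic "raw acoustic" frames (random placeholder data,
# not the knock recordings from the paper): 200 frames of 64 samples.
X = rng.random((200, 64))
dae1 = DenoisingAutoencoder(n_visible=64, n_hidden=32)
for _ in range(20):
    for x in X:
        dae1.train_step(x)

# Stacking: the second layer is trained on the first layer's codes; a
# supervised classifier (e.g. softmax) would then sit on top of the stack
# and the whole network would be fine-tuned with the object labels.
H = np.array([dae1.encode(x) for x in X])
dae2 = DenoisingAutoencoder(n_visible=32, n_hidden=16)
for _ in range(20):
    for h in H:
        dae2.train_step(h)
```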
| Original language | English |
|---|---|
| Journal | NEUROCOMPUTING |
| Early online date | 11 Mar 2017 |
| DOIs | |
| Publication status | E-pub ahead of print - 11 Mar 2017 |
Keywords
- Object recognition
- Deep networks
- Acoustic data analysis