Applied Mathematics & Information Sciences
Abstract
This paper investigates robust spectral features for speech emotion recognition using a Deep Neural Network (DNN) architecture with six fully-connected layers. We used a three-class subset (angry, neutral, sad) of a German corpus (the Berlin Database of Emotional Speech) containing 271 labeled recordings with a total length of 783 seconds. The data were divided into training (80%), validation (10%), and testing (10%) sets. The DNN was optimized with Stochastic Gradient Descent, and batch normalization was applied. As input, fourteen features supported by the librosa library were used and compared against each other. The experiments show that MFCC, which reached 100 percent accuracy, is a reliable feature for the emotion recognition task.
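The sketch below is a minimal, hypothetical illustration (not the authors' code) of the pipeline the abstract describes: MFCC features extracted with librosa feeding a six-layer fully-connected network with batch normalization, trained with Stochastic Gradient Descent. File paths, layer widths, and hyperparameters are illustrative assumptions.

```python
import librosa
import torch
import torch.nn as nn

def mfcc_features(path, n_mfcc=13):
    """Load a recording and return its mean MFCC vector over time."""
    y, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)  # one fixed-length vector per utterance

class EmotionDNN(nn.Module):
    """Six fully-connected layers with batch normalization, as in the abstract."""
    def __init__(self, n_features=13, n_classes=3, width=64):
        super().__init__()
        layers, dim = [], n_features
        for _ in range(5):  # five hidden layers ...
            layers += [nn.Linear(dim, width), nn.BatchNorm1d(width), nn.ReLU()]
            dim = width
        layers.append(nn.Linear(dim, n_classes))  # ... plus the output layer = six
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

# Training loop with SGD; learning rate and batch are assumptions, not from the paper.
model = EmotionDNN()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

X = torch.randn(32, 13)            # placeholder batch of MFCC vectors
y = torch.randint(0, 3, (32,))     # placeholder angry/neutral/sad labels
for epoch in range(10):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
```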
Digital Object Identifier (DOI)
http://dx.doi.org/10.18576/amis/130521
Recommended Citation
Shoiynbek, Aisultan; Kozhakhmet, Kanat; Sultanova, Nazerke; and Zhumaliyeva, Rakhima (2019) "The Robust Spectral Audio Features for Speech Emotion Recognition," Applied Mathematics & Information Sciences: Vol. 13: Iss. 5, Article 21.
DOI: http://dx.doi.org/10.18576/amis/130521
Available at: https://digitalcommons.aaru.edu.jo/amis/vol13/iss5/21