Tracking Student Behavior in Speech Recognition
International Journal of Computer Techniques – Volume 12 Issue 3, May – June 2025
Abstract
Speech Emotion Recognition (SER) technology enables machines to detect and classify human emotions from voice data. This study evaluates several deep learning models, including CNN, LSTM, ANN, and a hybrid CNN-LSTM model, for analyzing speech-based emotional cues. The CNN-LSTM hybrid achieved 98% classification accuracy, capturing both short- and long-term dependencies in voice data using MFCC features extracted with the Librosa toolkit.
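The paper extracts MFCC features with Librosa (e.g. `librosa.feature.mfcc`). As a rough illustration of what that pipeline computes, the following is a minimal from-scratch NumPy sketch of the standard MFCC steps (framing and windowing, power spectrum, mel filterbank, log, DCT-II); all parameter values here are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale.
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fb[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fb[i - 1, k] = (right - k) / max(right - center, 1)
    return fb

def mfcc(signal, sr=16000, n_fft=512, hop=256, n_mels=26, n_coeffs=13):
    # Frame the signal, apply a Hann window, take the power spectrum.
    frames = [signal[s:s + n_fft] * np.hanning(n_fft)
              for s in range(0, len(signal) - n_fft + 1, hop)]
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2
    # Mel filterbank energies -> log -> DCT-II yields cepstral coefficients.
    energies = np.log(power @ mel_filterbank(n_mels, n_fft, sr).T + 1e-10)
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_coeffs), (2 * n + 1) / (2 * n_mels)))
    return energies @ dct.T  # shape: (n_frames, n_coeffs)

# Example: MFCC matrix of a 1-second 440 Hz tone at 16 kHz
tone = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
feats = mfcc(tone)
print(feats.shape)  # (61, 13): 61 frames, 13 coefficients per frame
```

In practice, a library call such as `librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=13)` performs the equivalent computation with additional refinements; the resulting per-frame coefficient sequence is what the CNN-LSTM model consumes.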
Keywords
Speech Emotion Recognition, SER, Deep Learning, CNN-LSTM, Machine Learning, MFCC, Acoustic Features, Emotional Intelligence, NLP.
Conclusion
The CNN-LSTM-based Speech Emotion Recognition model significantly improves real-time speech emotion classification. Its high accuracy of 98% highlights its potential for applications such as online education, telemedicine, human-computer interaction, and customer service automation. Future enhancements may include multilingual speech emotion recognition and real-time deployment optimization.