Addressing Limitations in CNN-Based Acoustic Profiling: Enhancing Real-Time Depression Detection with RNN and LSTM Architectures


Date Published: 11 January 2026

Contributors

Prof. (Dr.) Dhananjay S. Deshpande

MBAESG - School of Management, Ajeenkya D Y Patil University, Pune
Author

Shashi Kant Gupta

Chitkara University, Punjab
Author

Sai Kiran Oruganti

Lincoln University College
Author

Keywords

Depression Detection, Voice Pattern Analysis, Deep Learning, Mental Health Screening, Real-Time Acoustic Profiling, Temporal Modeling, CNN Limitations, Speech-Based Diagnostics

Track

Engineering, Sciences, Mathematics & Computations

License

Copyright (c) 2026 Sustainable Global Societies Initiative

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Abstract

The human voice carries subtle cues to emotional well-being, which makes speech analysis an increasingly valuable tool for detecting early signs of depression. Earlier studies have applied Convolutional Neural Networks (CNNs) to classify acoustic features, but these models struggle to capture the flow and timing of speech, which are essential to identifying mood variations. In this work, we examine the performance constraints of CNN-based systems and explore Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) architectures as enhanced alternatives for real-time depression assessment. Because they model speech as a continuous sequence rather than as isolated segments, RNNs and LSTMs can capture temporal patterns linked to depressive behaviour more effectively. Our comparative evaluation shows improved detection accuracy and lower response latency, demonstrating that temporal modelling plays a critical role in voice-driven mental health screening. These findings support the integration of sequential deep learning models into future clinical and mobile applications for scalable mental health monitoring.
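
To illustrate the idea of treating speech as a continuous sequence, the following is a minimal sketch of a single LSTM cell stepping through a few acoustic feature frames. It is not the authors' model, which this abstract does not specify. The frame values, the hidden size, the random weights, and the helper names (`lstm_step`, `run_sequence`, `mean_pool`) are all assumptions made for illustration. The order-insensitive `mean_pool` baseline is included only as a contrast and is not meant to represent a CNN.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_params(n_in, n_hidden, seed=0):
    """Random, untrained weights for the four LSTM gates (illustrative only)."""
    rng = random.Random(seed)

    def layer():
        W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + n_hidden)]
             for _ in range(n_hidden)]
        b = [0.0] * n_hidden
        return W, b

    return {name: layer() for name in ("i", "f", "o", "g")}


def lstm_step(x, h, c, params):
    """One time step: combine the current frame x with the carried state (h, c)."""
    z = x + h  # concatenation [x; h] of the frame and the previous hidden state

    def gate(name, act):
        W, b = params[name]
        return [act(sum(w * v for w, v in zip(row, z)) + bias)
                for row, bias in zip(W, b)]

    i = gate("i", sigmoid)    # input gate
    f = gate("f", sigmoid)    # forget gate
    o = gate("o", sigmoid)    # output gate
    g = gate("g", math.tanh)  # candidate cell update
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def run_sequence(frames, params, n_hidden):
    """Feed frames in order; the final hidden state summarises the whole sequence."""
    h = [0.0] * n_hidden
    c = [0.0] * n_hidden
    for x in frames:
        h, c = lstm_step(x, h, c, params)
    return h


def mean_pool(frames):
    """Order-insensitive baseline: average each feature over all frames."""
    return [math.fsum(col) / len(frames) for col in zip(*frames)]


# Hypothetical two-dimensional acoustic features for three consecutive frames.
frames = [[0.1, 0.9], [0.5, 0.5], [0.9, 0.1]]
params = init_params(n_in=2, n_hidden=3)

h_fwd = run_sequence(frames, params, n_hidden=3)
h_rev = run_sequence(list(reversed(frames)), params, n_hidden=3)

print("LSTM state, original order:", h_fwd)
print("LSTM state, reversed order:", h_rev)
print("Mean-pooled features:", mean_pool(frames))
```

Reversing the frame order changes the final LSTM state, while the mean-pooled summary stays the same. This is the sense in which a recurrent model is sensitive to the timing of the input. A trained system would learn the weights from labelled speech and would typically rely on an established deep learning library instead of hand-written loops.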

How to Cite

Deshpande, D., Gupta, S. K., & Oruganti, S. K. (2026). Addressing Limitations in CNN-Based Acoustic Profiling: Enhancing Real-Time Depression Detection with RNN and LSTM Architectures. Sustainable Global Societies Initiative, 1(1). https://vectmag.com/sgsi/paper/view/54