JCM 2025 Vol.20(1): 48-53
DOI: 10.12720/jcm.20.1.48-53

Deep Speak Net: Advancing Speech Separation with Deep Learning Techniques

Jaipreet Kour Wazir and Javaid A. Sheikh*
Department of Electronics and IT, University of Kashmir, India
Email: jaipreet.elscholar@kashmiruniversity.net (J.K.W.); sheikhjavaid@uok.edu.in (J.A.S.)
*Corresponding author

Manuscript received September 14, 2024; revised October 28, 2024; accepted November 19, 2024; published February 8, 2025.

Abstract—Most prior research in single-channel speech separation has used a time-frequency domain representation of the mixed signal. However, this approach has several drawbacks, such as the decoupling of magnitude and phase and suboptimal separation quality. Our work proposes an improved model based on the Conv-TasNet architecture, which uses a fully convolutional time-domain auto-encoder for speech separation. The proposed model addresses these drawbacks by improving the auto-encoder’s structure to enhance the separation process and its quality. The main contribution of the proposed work is to preserve magnitude and phase integrity through a time-domain representation; for better gradient flow and stability, we employ residual and skip connections. We experiment on the Texas Instruments/Massachusetts Institute of Technology (TIMIT) data set, to which Gaussian noise is added at Signal-to-Noise Ratio (SNR) levels drawn randomly from −5 dB to 5 dB. The results demonstrate improved Perceptual Evaluation of Speech Quality (PESQ), Scale-Invariant Signal-to-Noise Ratio Improvement (SI-SNRI), Scale-Invariant Signal-to-Distortion Ratio Improvement (SDRI), and Short-Time Objective Intelligibility (STOI). The proposed model, Deep Speak Net, outperforms state-of-the-art approaches, increases computational efficiency, and is suitable for real-time applications.
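
As a rough illustration of the evaluation setup described in the abstract, the sketch below mixes white Gaussian noise into a signal at an SNR drawn uniformly from −5 dB to 5 dB and then scores the result with scale-invariant SNR. It is not the authors' code: the function names `add_gaussian_noise` and `si_snr` are hypothetical, a synthetic 440 Hz tone stands in for a TIMIT utterance, and the SI-SNR formula is the commonly used definition, which the abstract does not spell out.

```python
import math
import random


def add_gaussian_noise(clean, snr_db, rng):
    """Return clean + white Gaussian noise scaled to the requested SNR (dB)."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise = [rng.gauss(0.0, 1.0) for _ in clean]
    noise_power = sum(n * n for n in noise) / len(noise)
    # Pick the gain so that signal_power / (gain^2 * noise_power) = 10^(snr_db / 10).
    gain = math.sqrt(signal_power / (noise_power * 10 ** (snr_db / 10)))
    return [x + gain * n for x, n in zip(clean, noise)]


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB of an estimate against a reference signal."""
    # Remove the mean of each signal so DC offsets do not affect the score.
    mean_e = sum(estimate) / len(estimate)
    mean_t = sum(target) / len(target)
    e = [x - mean_e for x in estimate]
    t = [x - mean_t for x in target]
    # Project the estimate onto the target; the remainder counts as noise.
    scale = sum(a * b for a, b in zip(e, t)) / (sum(b * b for b in t) + eps)
    projection = [scale * b for b in t]
    residual = [a - p for a, p in zip(e, projection)]
    num = sum(p * p for p in projection)
    den = sum(r * r for r in residual) + eps
    return 10 * math.log10(num / den + eps)


# Assumption: a one-second 440 Hz tone at 16 kHz stands in for real speech.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
snr_db = rng.uniform(-5.0, 5.0)  # random SNR in the range the abstract states
noisy = add_gaussian_noise(clean, snr_db, rng)

# Check the achieved SNR directly from the injected noise.
injected = [y - x for x, y in zip(clean, noisy)]
measured_snr_db = 10 * math.log10(
    sum(x * x for x in clean) / sum(n * n for n in injected)
)
baseline_si_snr = si_snr(noisy, clean)
```

Under these assumptions, an SI-SNR improvement figure for a separation model would be `si_snr(model_output, clean) - baseline_si_snr`, where `model_output` is a placeholder for whatever the model produces from `noisy`.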

Keywords—single channel, speech separation, deep neural network

Cite: Jaipreet Kour Wazir and Javaid A. Sheikh, “Deep Speak Net: Advancing Speech Separation with Deep Learning Techniques,” Journal of Communications, vol. 20, no. 1, pp. 48-53, 2025.

Copyright © 2025 by the authors. This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).