Using an Adjustment Training and a Smoothing Mask for Speech Segregation

Yi JIANG, Run-Sheng LIU, Yuan-Yuan ZU


This paper focuses on improving speech intelligibility and natural auditory perception. A dual-microphone speech segregation system based on computational auditory scene analysis (CASA) is proposed. A deep neural network (DNN) estimates the parameter mask, which is then used to train a smoothing mask that segregates the target speech from the mixture. A mask smoothing method is proposed to reduce the musical noise caused by estimation errors. The performance of the proposed method is systematically evaluated with simulated and recorded data. The tests show that the proposed method improves the signal-to-noise ratio (SNR), suppresses musical noise, and also performs well at untrained locations and under reverberant test conditions.
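
As an illustration of why smoothing a time-frequency mask can reduce musical noise, the sketch below averages each mask value over its local neighbourhood, so that an isolated estimation error is spread out and attenuated instead of surviving as a lone tone. This is not the paper's smoothing method, which the abstract does not specify. The function name `smooth_mask`, the 3 x 3 mean filter, and the toy mask are all assumptions made for this example.

```python
import numpy as np


def smooth_mask(mask, size=3):
    """Mean-filter a time-frequency mask over a size x size neighbourhood.

    Hypothetical helper for illustration only; the paper's actual
    smoothing rule may differ.
    """
    if size % 2 != 1:
        raise ValueError("size must be odd")
    pad = size // 2
    # Repeat edge values so border cells are averaged over a full window.
    padded = np.pad(np.asarray(mask, dtype=float), pad, mode="edge")
    rows, cols = np.shape(mask)
    out = np.zeros((rows, cols), dtype=float)
    # Sum the shifted copies of the mask, then divide by the window area.
    for df in range(size):
        for dt in range(size):
            out += padded[df:df + rows, dt:dt + cols]
    return out / (size * size)


# Toy estimated mask (frequency x time) with one isolated false detection,
# the kind of error that is heard as musical noise after resynthesis.
estimated = np.zeros((5, 5))
estimated[2, 2] = 1.0

smoothed = smooth_mask(estimated)
```

In this toy case the isolated unit is reduced to one ninth of its original value and shared with its neighbours, while a mask that is constant over a region is left unchanged. A larger `size` smooths more aggressively, at the cost of blurring genuine speech onsets.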

