Thesis Colloquium of Anurenjan P. R.
September 9, 2022 @ 2:30 pm - 3:30 pm UTC+0
Teams link – https://tinyurl.com/3bpmjxkx
Title: Dereverberation of speech using frequency domain linear prediction
Faculty Advisor: Dr. Sriram Ganapathy
Abstract: Speech-based technologies are radically changing the way we interact with systems and access information. In many of these applications, users prefer to interact with the system through a far-field microphone, without the nuisance of a handheld or body-worn device. Examples of such applications include automated meeting analysis, speech-based dictation systems, hands-free interfaces for controlling consumer products, IoT, virtual assistants in mobile phones, and smart speakers. The major challenge in capturing speech from the far field is the degradation of signal quality due to reverberation. Reverberation refers to the summation of the direct component of the speech signal with delayed and weighted reflected versions of it. This talk focuses on developing methods for speech dereverberation, i.e., restoring the functional quality of reverberated speech, using the signal analysis technique of frequency domain linear prediction (FDLP).
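As a rough illustration of the description of reverberation above, the following minimal sketch adds delayed and weighted copies of a short signal to its direct component. The function name `reverberate`, the tap delays, and the weights are hypothetical choices made for this example; they are not taken from the thesis, and a real room response has many more reflections.

```python
def reverberate(clean, taps):
    """Sum delayed, weighted copies of `clean`.

    `taps` is a list of (delay_in_samples, weight) pairs; the pair (0, 1.0)
    stands for the direct component, the others for reflections.
    All values here are illustrative assumptions.
    """
    longest_delay = max(delay for delay, _ in taps)
    out = [0.0] * (len(clean) + longest_delay)
    for delay, weight in taps:
        for n, sample in enumerate(clean):
            out[n + delay] += weight * sample
    return out


clean = [1.0, 0.5, -0.25]
# Direct path plus two weaker, later reflections (hypothetical values).
taps = [(0, 1.0), (2, 0.5), (5, 0.25)]
reverberant = reverberate(clean, taps)
print(reverberant)
```

The output is longer than the input because the reflections arrive after the direct sound has ended; this tail is what smears the signal in time.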
FDLP is the frequency domain dual of conventional time domain linear prediction (TDLP). Just as TDLP estimates the spectral envelope of a signal, FDLP estimates the temporal envelope of the signal using an autoregressive model. We apply the FDLP approach to sub-bands of the speech signal that are distributed on the mel scale.
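The duality described above can be sketched in a few lines: linear prediction is applied to the DCT coefficients of the signal rather than to its time samples, so the resulting autoregressive "spectrum" traces the temporal envelope instead of the spectral envelope. The sketch below is a simplified full-band illustration in plain Python, not the implementation used in the thesis. The function names, the model order of 12, the small diagonal loading, and the synthetic test signal are all assumptions made for this example, and the mel-scale sub-band decomposition is omitted.

```python
import math
import random


def dct_ii(x):
    """Orthonormal DCT-II computed directly (O(N^2), fine for a short demo)."""
    n = len(x)
    coeffs = []
    for k in range(n):
        acc = sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        coeffs.append(math.sqrt((1.0 if k == 0 else 2.0) / n) * acc)
    return coeffs


def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations; returns (a, err), a[0] == 1."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


def fdlp_envelope(x, order=12):
    """Estimate the temporal (power) envelope of x with an AR model on its DCT."""
    n = len(x)
    c = dct_ii(x)
    # Autocorrelation of the DCT sequence up to the model order.
    r = [sum(c[i] * c[i + lag] for i in range(n - lag)) for lag in range(order + 1)]
    r[0] *= 1.0 + 1e-6  # light diagonal loading for stability (assumption)
    a, err = levinson_durbin(r, order)
    envelope = []
    for t in range(n):
        # Time sample t corresponds to "frequency" pi * (t + 0.5) / n of the DCT sequence.
        w = math.pi * (t + 0.5) / n
        re = sum(a[k] * math.cos(w * k) for k in range(order + 1))
        im = sum(a[k] * math.sin(w * k) for k in range(order + 1))
        envelope.append(err / (re * re + im * im))
    return envelope


# Synthetic test signal: a tone burst centred at sample 80 plus weak noise.
rng = random.Random(0)
n = 256
signal = [
    math.exp(-0.5 * ((t - 80) / 10.0) ** 2) * math.cos(2 * math.pi * 0.2 * t)
    + rng.gauss(0.0, 0.05)
    for t in range(n)
]
envelope = fdlp_envelope(signal, order=12)
peak = max(range(n), key=lambda t: envelope[t])
print("envelope peaks near sample", peak)
```

A low model order gives a smooth envelope that follows the burst rather than the individual oscillations of the carrier; raising the order lets the model follow finer temporal detail.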
This talk will describe two broad directions for addressing issues in far-field speech using the FDLP approach. In the first part, we explore a front-end design for automatic speech recognition (ASR) applications that suppresses the reverberation artifacts in the FDLP envelope. In the second part, we develop a speech enhancement model using the envelope and carrier decomposition given by the FDLP technique.
In the design of the ASR front end, I will discuss a novel 3-D acoustic modeling framework in which spatio-spectral features are extracted from all the sub-band channels. The features that are input to the 3-D CNN are extracted by modeling the signal peaks in the spatio-spectral domain using a multivariate autoregressive modeling approach. Next, I will describe a neural model for speech dereverberation using the long-term sub-band envelopes of speech. The neural dereverberation model estimates an envelope gain which, when applied to the reverberant signal, suppresses the late reflection components. The dereverberated envelopes are then used for feature extraction in speech recognition. The key novelty in this model is the joint learning of the dereverberation model and the ASR system. In ASR experiments using the proposed framework, we illustrate significant performance gains over previously proposed front ends.
The second part of the thesis deals with FDLP-based speech dereverberation for enhancement applications, where the goal is to restore the audible quality of the speech signal. For this task, we decompose each sub-band speech signal into its constituent envelope and carrier components. A neural dereverberation model is designed that attempts to enhance the envelope and carrier signals jointly. Further, joint learning of the speech enhancement model with the end-to-end ASR model is proposed within a single neural framework. The proposed model can therefore both generate audio of improved quality and provide robust representations for far-field ASR. Finally, I will illustrate the subjective quality improvement of the audio signal as well as the improvement in ASR performance obtained by the proposed envelope-carrier model.
This work was partly supported by project grants from Samsung Research India, Bangalore and the College of Engineering, Trivandrum, Kerala.
Bio: Mr. Anurenjan is a PhD student at the LEAP lab, Electrical Engineering, IISc. He is also currently working as an Assistant Professor at the College of Engineering, Trivandrum. Mr. Anurenjan completed his Bachelor's in Technology at Government Engineering College, Barton Hill, Trivandrum, Kerala in 2006 and his Master's in Technology at the College of Engineering, Trivandrum, Kerala in 2008. He joined the LEAP lab as a PhD candidate under the AICTE-QIP program in 2017. He hails from the Trivandrum district of Kerala. He is interested in signal processing, machine learning, and speech processing. Mr. Anurenjan is a student member of the IEEE SPS and the ISCA. In his free time, Mr. Anurenjan likes to play badminton and swim.
All are invited. Coffee/Tea will be served before the talk.