BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//EE - ECPv5.10.0//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:EE
X-ORIGINAL-URL:https://ee.iisc.ac.in
X-WR-CALDESC:Events for EE
BEGIN:VTIMEZONE
TZID:Asia/Kolkata
BEGIN:STANDARD
TZOFFSETFROM:+0530
TZOFFSETTO:+0530
TZNAME:IST
DTSTART:20220101T000000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=Asia/Kolkata:20220909T200000
DTEND;TZID=Asia/Kolkata:20220909T210000
DTSTAMP:20260630T213150
CREATED:20220905T052514Z
LAST-MODIFIED:20220905T052514Z
UID:239884-1662753600-1662757200@ee.iisc.ac.in
SUMMARY:Thesis Colloquium of Anurenjan P. R.
DESCRIPTION:Venue\, Date and Time: MMCR\, EE\, IISc. 9-9-2022\, 2.30-3.30pm.\n\n\nTeams link – https://tinyurl.com/3bpmjxkx . \nTitle: Dereverberation of speech using frequency domain linear prediction \nFaculty Advisor: Dr. Sriram Ganapathy \nAbstract: Speech-based technologies are radically changing the way we interact with systems and how we access information. In many of these applications\, the users prefer to interact with the system through a far-field microphone without the nuisance of a handheld or body-worn device. Examples of such applications are automated meeting analysis\, speech-based dictation systems\, hands-free interfaces for controlling consumer products\, IoT\, virtual assistants in mobile phones and smart speakers. The major challenge in capturing speech from the far-field is the degradation of the signal quality due to reverberation. Reverberation refers to the delayed and weighted summation of the direct component of the speech signal with the reflected versions. This talk is focused on developing methods for speech dereverberation\, i.e.\, restoring the functional quality of reverberated speech\, using the signal analysis technique of frequency domain linear prediction (FDLP). \nThe FDLP is the frequency domain dual of the conventional Time Domain Linear Prediction (TDLP). Just as the TDLP estimates the spectrum of a signal\, the FDLP estimates the temporal envelopes of the signal using an autoregressive model. We apply the FDLP approach to the sub-bands of the speech signal that are distributed in the mel scale. \nThis talk will describe two broad directions for addressing issues in far-field speech using the FDLP approach. In the first part of the talk\, we explore a front-end design for automatic speech recognition (ASR) applications that suppresses the reverberation artifacts in the FDLP envelope. 
 In the second part of the thesis\, we develop a speech enhancement model using the envelope and carrier decomposition given by the FDLP technique. \nIn the design of the ASR front end\, I will discuss a novel 3-D acoustic modeling framework\, where the spatio-spectral features from all the sub-band channels are extracted. The features that are input to the 3-D CNN are extracted by modeling the signal peaks in the spatio-spectral domain using a multivariate autoregressive modeling approach. In the subsequent part of this section\, I will describe a neural model for speech dereverberation using the long-term sub-band envelopes of speech. The neural dereverberation model estimates the envelope gain\, which\, when applied to reverberant signals\, allows the suppression of the late reflection components. The dereverberated envelopes are used for feature extraction in speech recognition. The key novelty in this model is the joint learning of the dereverberation model and the ASR system. In these ASR experiments using the proposed framework\, we illustrate significant performance gains over previously proposed front ends. \nThe second part of the thesis deals with FDLP-based speech dereverberation for enhancement applications\, where the goal is to restore the audible quality of the speech signal. For this task\, we decompose the sub-band speech signal into the constituent envelope and carrier parts. A dereverberation neural model is designed that attempts to enhance the envelope and carrier signals jointly. Further\, joint learning of the speech enhancement model with the end-to-end ASR model is proposed within a single neural framework. The proposed model can therefore generate improved audio quality and provide robust representations for far-field ASR. Finally\, I will illustrate the subjective quality improvement of the audio signal as well as the improvement in ASR performance obtained by the proposed envelope-carrier model. 
 \nAcknowledgement \nThis work was partly supported by project grants from Samsung Research India\, Bangalore and the College of Engineering\, Trivandrum\, Kerala. \nBio: Mr. Anurenjan is a PhD student at the LEAP lab\, Electrical Engineering\, IISc. He is also currently working as an Assistant Professor at the College of Engineering\, Trivandrum. Mr. Anurenjan completed his Bachelor's in Technology from Government Engineering College\, Barton Hill\, Trivandrum\, Kerala in 2006 and his Master's in Technology from College of Engineering\, Trivandrum\, Kerala in 2008. He joined the LEAP lab as a PhD candidate under the AICTE-QIP program in 2017. He hails from the Trivandrum district of Kerala. He is interested in signal processing\, machine learning and speech processing. Mr. Anurenjan is a student member of IEEE SPS and the ISCA. During his free hours\, Mr. Anurenjan likes to play badminton and swim. \n--- \nAll are invited. Coffee/Tea will be served before the talk. 
URL:https://ee.iisc.ac.in/event/thesis-colloquium-of-anurenjan-p-r/
LOCATION:EE\, MMCR
END:VEVENT
END:VCALENDAR