MTech Research Thesis Defense of Mr. Jaswanth Reddy Katthi @ 11am
April 18, 2022 @ 4:30 PM - 5:30 PM IST
Location : Electrical Engineering, MMCR (C241), Online via Teams (if network connection allows) https://tinyurl.com/2p8exxys
Title : Deep Learning Methods for Audio-EEG Analysis
Abstract : The perception of speech and audio is one of the defining features of humans. Much of the brain's underlying processing as we perceive acoustic signals is unknown, and significant research efforts are needed to unravel it. Non-invasive recordings of brain activity, such as the electroencephalogram (EEG) and magnetoencephalogram (MEG), are commonly deployed to capture brain responses to auditory stimuli. However, these non-invasive techniques also capture artifacts and noise unrelated to the stimuli, which distort the downstream stimulus-response analysis. The current state-of-the-art models for normalization and pre-processing of EEG data rely on linear canonical correlation analysis (CCA) or temporal response function (TRF) based approaches. However, these methods assume a simplistic linear relationship between the audio features and the EEG responses and therefore may not alleviate the recording artifacts and interfering signals in the EEG. This talk proposes novel methods that use deep learning advances to improve audio-EEG analysis.
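To make the linear baseline concrete, here is a minimal sketch of CCA applied to paired audio/EEG frames using scikit-learn. The feature dimensions, the random data, and the number of components are illustrative placeholders, not the setup used in the thesis.

```python
# Toy illustration of the linear CCA baseline (placeholder data and dims).
import numpy as np
from sklearn.cross_decomposition import CCA

T = 1000
audio = np.random.randn(T, 40)   # e.g. per-frame audio features (assumed dim)
eeg = np.random.randn(T, 128)    # e.g. per-frame EEG channels (assumed dim)

# Fit linear projections that maximize correlation between the two views.
cca = CCA(n_components=4)
a_proj, e_proj = cca.fit_transform(audio, eeg)

# Correlation of each canonical pair (chance level here, since data is random).
for k in range(4):
    r = np.corrcoef(a_proj[:, k], e_proj[:, k])[0, 1]
    print(f"component {k}: r = {r:.3f}")
```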
We propose a deep learning framework for audio-EEG analysis in intra-subject and inter-subject settings. The deep learning based intra-subject methods are trained with a Pearson correlation-based cost function between the stimuli and the EEG responses. The model transforms the audio and EEG features into a common subspace in which they are maximally correlated. The correlation-based cost function is optimized over the learnable parameters of the model using standard gradient-descent methods. This model is referred to as the deep CCA (DCCA) model. Several experiments, performed on EEG data recorded from subjects listening to naturalistic speech and music stimuli, show that the deep methods obtain better representations than the linear methods, resulting in statistically significant improvements in correlation values.
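The following is a minimal PyTorch sketch of this kind of correlation-trained two-view model: two encoders project audio and EEG into a shared subspace, and a negative Pearson correlation loss is minimized by gradient descent. The network sizes, feature dimensions, and random data are assumptions for illustration, not the thesis implementation.

```python
# Sketch of a correlation-based two-view objective (assumed dims and data).
import torch
import torch.nn as nn

def pearson_loss(x, y, eps=1e-8):
    """Negative Pearson correlation between two (time, dim) projections,
    averaged over output dimensions; minimizing it maximizes correlation."""
    xc = x - x.mean(dim=0, keepdim=True)
    yc = y - y.mean(dim=0, keepdim=True)
    num = (xc * yc).sum(dim=0)
    den = xc.norm(dim=0) * yc.norm(dim=0) + eps
    return -(num / den).mean()

# Separate encoders map audio features and EEG channels into a shared
# low-dimensional subspace (all sizes are illustrative).
audio_enc = nn.Sequential(nn.Linear(40, 64), nn.ReLU(), nn.Linear(64, 8))
eeg_enc = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 8))
opt = torch.optim.Adam(
    list(audio_enc.parameters()) + list(eeg_enc.parameters()), lr=1e-3)

# Dummy stimulus/response pair: T time frames of audio features and EEG.
T = 1000
audio = torch.randn(T, 40)
eeg = torch.randn(T, 128)

for step in range(100):
    opt.zero_grad()
    loss = pearson_loss(audio_enc(audio), eeg_enc(eeg))
    loss.backward()
    opt.step()
```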
Further, we propose a neural network model with shared encoders that aligns the EEG responses of multiple subjects listening to the same audio stimuli. This inter-subject model boosts the stimulus-related signal components common across subjects and suppresses the subject-specific artifacts. This model is referred to as deep multi-way canonical correlation analysis (DMCCA). The combination of inter-subject analysis using DMCCA and intra-subject analysis using DCCA is shown to provide the best stimulus-response correlations in audio-EEG experiments.
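A minimal sketch of the multi-subject alignment idea follows: a shared encoder maps each subject's EEG into a common space, and training maximizes the average pairwise correlation across subjects, so components shared across listeners are boosted while subject-specific noise is suppressed. The encoder architecture, subject count, and random data are illustrative assumptions, not the thesis code.

```python
# Sketch of cross-subject alignment via pairwise correlation (assumed setup).
import itertools
import torch
import torch.nn as nn

def avg_pairwise_corr(projections, eps=1e-8):
    """Mean Pearson correlation over all subject pairs and output dims."""
    corrs = []
    for a, b in itertools.combinations(projections, 2):
        ac = a - a.mean(dim=0, keepdim=True)
        bc = b - b.mean(dim=0, keepdim=True)
        r = (ac * bc).sum(0) / (ac.norm(dim=0) * bc.norm(dim=0) + eps)
        corrs.append(r.mean())
    return torch.stack(corrs).mean()

# One encoder shared across subjects (sizes are illustrative).
shared_enc = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 8))
opt = torch.optim.Adam(shared_enc.parameters(), lr=1e-3)

# Dummy EEG from 5 subjects listening to the same stimulus (T frames, 128 ch).
T, n_subjects = 1000, 5
eeg = [torch.randn(T, 128) for _ in range(n_subjects)]

for step in range(100):
    opt.zero_grad()
    # Negate so that gradient descent maximizes cross-subject agreement.
    loss = -avg_pairwise_corr([shared_enc(x) for x in eeg])
    loss.backward()
    opt.step()
```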
Finally, the talk will discuss an ambitious experiment in which we attempted to recreate the acoustic signal directly from the EEG responses. While the audio is not fully recoverable, the parts of the signal that can be recovered from the non-invasive EEG recordings shed light on the characteristics of the audio captured in the EEG data.