E9 261 (JAN) 3:1 Speech Information Processing



Speech Information Processing
January-April, 2024

Announcements:
January 1, 2024: The first lecture will be held on January 3, 2024 (Wednesday) at 4:30pm in EE B 308.
January 1, 2024: If you are attending the course (for credit or audit), please fill out this form on or before January 19, 2024, to join the class Team.


Instructor:
Prasanta Kumar Ghosh
Office: EE C 330
Phone: +91 (80) 2293 2694
prasantg AT iisc.ac.in


Teaching Assistant(s):
    TBD


Class meetings:
4:30pm to 5:30pm on every Monday, Wednesday and Friday (in EE B 308)


Course Content:
  • Speech communication and overview
  • Time-varying signals and systems
  • Spectrograms and applications
  • Speech parameterization/representation
  • AM-FM, sinusoidal models for speech
  • Linear prediction, AR and ARMA modeling of speech
  • Sequence modeling of speech - Dynamic Time Warping, introduction to Hidden Markov Models
  • Deep learning for sequence modeling - Recurrent neural networks, attention-based models
  • Speech applications - Automatic speech recognition


Prerequisites:
Digital Signal Processing, Probability and Random Processes


Textbooks:
    • Fundamentals of Speech Recognition, Lawrence R. Rabiner and Biing-Hwang Juang, Prentice Hall, 1993.
    • Automatic Speech Recognition: A Deep Learning Approach, Dong Yu and Li Deng, Springer, 2014.
    • Discrete-Time Speech Signal Processing: Principles and Practice, Thomas F. Quatieri, Prentice Hall, 2001.
    • Digital Processing of Speech Signals, Lawrence R. Rabiner and Ronald W. Schafer, Pearson Education, 2008.


Web Links:
The Edinburgh Speech Tools Library
Speech Signal Processing Toolkit (SPTK)
Hidden Markov Model Toolkit (HTK)
ICSI Speech Group Tools
VOICEBOX: Speech Processing Toolbox for MATLAB
Praat: doing phonetics by computer
Audacity
SoX - Sound eXchange
HMM-based Speech Synthesis System (HTS)
International Phonetic Association (IPA)
Type IPA phonetic symbols
CMU dictionary
Co-articulation and phonology by Ohala
Assisted Listening Using a Headset
Headphone-Based Spatial Sound
Pitch Perception
Head-Related Transfer Functions and Virtual Auditory Display
Signal reconstruction from STFT magnitude: a state of the art
On the usefulness of STFT phase spectrum in human listening tests
Experimental comparison between stationary and nonstationary formulations of linear prediction applied to voiced speech analysis
A modified autocorrelation method of linear prediction for pitch-synchronous analysis of voiced speech
Linear prediction: A tutorial review
Energy separation in signal modulations with application to speech analysis
Nonlinear Speech Modeling and Applications


Grading:
  • Assignments, including recordings (10 points) - The average over all assignments is counted. Cheating or violating academic integrity (see below) will result in a failing grade for the course. Turning in identical homework sets counts as cheating.
  • Midterm exams (20 points) - Two midterm exams. The average of the two midterm scores is counted. A missed exam earns 0 points. No make-up exams.
  • Surprise exams (20 points) - Eight surprise exams, closed book, 10 minutes each. Each surprise exam is worth 4 points. The surprise exam total is the sum of your five best scores (the three worst scores are dropped). A missed exam earns 0 points. No make-up exams. Class attendance is mandatory: an unexcused absence earns an automatic score of zero for that session's surprise exam.
  • Final exam. (35 points)
  • Project (15 points) - Quality/Quantity of work (5 points), Report (5 points), Presentation (5 points).
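
The weighting above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not an official grade calculator. The function name `course_total` and the input scales are assumptions: each assignment is taken to be scored out of 10 and each midterm out of 20, which the course page does not state explicitly.

```python
def course_total(assignments, midterms, surprises, final_exam, project):
    """Combine component scores into a total out of 100.

    Assumed input scales (hypothetical, not stated on the course page):
      assignments: list of scores, each out of 10
      midterms:    list of 2 scores, each out of 20
      surprises:   list of 8 scores, each out of 4 (0 for a missed exam)
      final_exam:  single score out of 35
      project:     single score out of 15
    """
    # Assignments and midterms: the average of the scores is counted.
    assignment_score = sum(assignments) / len(assignments)
    midterm_score = sum(midterms) / len(midterms)
    # Surprise exams: sum of the five best scores, three worst dropped.
    surprise_score = sum(sorted(surprises, reverse=True)[:5])
    return assignment_score + midterm_score + surprise_score + final_exam + project


example = course_total(
    assignments=[8, 9, 10],
    midterms=[15, 17],
    surprises=[4, 3, 0, 4, 2, 4, 1, 3],  # one missed exam scored as 0
    final_exam=28,
    project=12,
)
print(example)
```

In this made-up example the assignment average is 9, the midterm average is 16, and the five best surprise scores (4, 4, 4, 3, 3) sum to 18, so the missed surprise exam is among the three dropped scores.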


Topics covered:
Date | Topics | Remarks
Jan 3 | Course logistics, Information in speech, speech chain, speech research - science and technology | -










Your Voice/files to upload:
Click to upload (max 10 MB)


Transcripts for recording:
Click here


Your feedback on this course (any time):
Click here


Academic Honesty:
As students of IISc, you are expected to adhere to the highest standards of academic honesty and integrity.
Please read the IISc academic integrity policy.