Automatic Speech Recognition (ASR): 2016/17

Review and Tutorial Articles

Syllabus 2016/17

| Lecture No. | Date | Week | Lecturer | Topic and slides | Reading |
|---|---|---|---|---|---|
| 1 | Mon 16 January | Wk1 | Renals | Introduction to Speech Recognition (slides; lecture recording) | J&M: chapter 7, chapter 9 (9.1 - 9.3); R&H review chapter |
| 2 | Thu 19 January | Wk1 | Shimodaira | Speech Signal Analysis 1 (slides; additional notes; lecture recording) | J&M: Sec 9.3 |
| 3 | Mon 23 January | Wk2 | Shimodaira | Speech Signal Analysis 2 (lecture recording) | Hermansky (1990), PLP analysis of speech |
| 4 | Thu 26 January | Wk2 | Shimodaira | Acoustic modelling: HMMs and GMMs 1 (slides; additional notes; lecture recording) | J&M: Secs 6.1-6.5, 9.2, 9.4; G&Y review; R&H review chapter; Rabiner & Juang (1986) Tutorial |
| 5 | Mon 30 January | Wk3 | Shimodaira | Acoustic modelling: HMMs and GMMs 2 (lecture recording) | |
| 6 | Thu 2 February | Wk3 | Shimodaira | Acoustic modelling: Context-dependent phone modelling (slides; lecture recording) | Young (2008) |
| - | Mon 6 February | Wk4 | | NO LECTURE | |
| 7 | Thu 9 February | Wk4 | Renals | Introduction to neural networks (slides; lecture recording) | M Nielsen (2014), Neural networks and deep learning |
| 8 | Mon 13 February | Wk5 | Renals | Neural network acoustic models 1 (slides; lecture recording) | Morgan and Bourlard (1995), Continuous speech recognition: Introduction to the hybrid HMM/connectionist approach |
| 9 | Thu 16 February | Wk5 | Renals | Neural network acoustic models 2; Sequence discriminative training (slides) | Hinton et al (2012), Deep neural networks for acoustic modeling in speech recognition; Vesely et al (2013), Sequence-discriminative training of deep neural networks |
| - | Mon 20 February | | | No Lectures or Labs - Flexible Learning Week | |
| - | Thu 23 February | | | No Lectures or Labs - Flexible Learning Week | |
| 10 | Mon 27 February | Wk6 | Renals | Speaker adaptation (slides) | G&Y review, sec. 5; Woodland (2001), Speaker adaptation for continuous density HMMs: A review; Swietojanski et al (2016), Learning Hidden Unit Contributions for Unsupervised Acoustic Model Adaptation |
| 11 | Thu 2 March | Wk6 | Renals | Language models 1 (slides) | J&M, Ch 4; Manning & Schutze, Ch 6; Bengio et al (2006), Neural probabilistic language models (Secs 6.1, 6.2, 6.3, 6.7, 6.8) |
| 12 | Mon 6 March | Wk7 | Renals | Language models 2 | Mikolov et al (2011), Extensions of recurrent neural network language model; Chen et al (2015), Recurrent neural network language model training with noise contrastive estimation for speech recognition, ICASSP-2015; Jozefowicz et al (2016), Exploring the Limits of Language Modeling, arXiv:1602.02410 |
| 13 | Thu 9 March | Wk7 | Renals | WFSTs | |
| 14 | Mon 13 March | Wk8 | Renals | Multilingual systems | |
| 15 | Thu 16 March | Wk8 | Renals | Robust speech recognition | |
| 16 | Mon 20 March | Wk9 | | Guest lecture: recognition of multi-genre broadcasts | |
| 17 | Thu 23 March | Wk9 | | Current progress in acoustic modelling | |
| 18 | Mon 27 March | Wk10 | | End-to-end systems | |
| 19 | Thu 30 March | Wk10 | | NN generative models and WaveNet | |
The coursework will concern continuous speech recognition and will use Kaldi. It will build on the labs.
Released: Monday 13 February 2017. [Assignment]
Deadline: Wednesday 8 March 2017, 16:00
Feedback: Wednesday 22 March 2017

Software tools


