Automatic Speech Recognition (ASR): 2013/14

[course descriptor]

Lecturers


News

Reading

Textbook

Review and Tutorial Articles

Syllabus 2013/14

Lecture No. | Date | Week | Lecturer | Topic and slides | Reading
1 | Mon 13 January | 1 | Renals | Introduction to Speech Recognition (slides) | J&M: chapter 7, chapter 9 (9.1 - 9.3); R&H review chapter
2 | Thu 16 January | 1 | Shimodaira | Speech Signal Analysis 1 (slides) | J&M: Sec 9.3; Taylor, chapters 10, 12
3 | Mon 20 January | 2 | Shimodaira | Speech Signal Analysis 2 | Hermansky (1990), PLP analysis of speech
4 | Thu 23 January | 2 | Shimodaira | Acoustic modelling basics: HMMs and GMMs 1 (slides) | J&M: Secs 6.1-6.5, 9.2, 9.4; G&Y review; R&H review chapter; Rabiner & Juang (1986) Tutorial
5 | Mon 27 January | 3 | Shimodaira | Acoustic modelling basics: HMMs and GMMs 2 |
6 | Thu 30 January | 3 | Renals | Context-dependent phone modelling with HMMs 1 (slides) | Young (2008); Lee (1990), Context-dependent phonetic hidden Markov models for speaker-independent continuous speech recognition
7 | Mon 3 February | 4 | Renals | Context-dependent phone modelling with HMMs 2 | Young & Woodland (1994), State clustering in hidden Markov model-based continuous speech recognition; Young et al (1994), Tree-based state tying for high accuracy acoustic modelling
 | Thu 6 February | 4 | Shimodaira | Introduction to Assignment 1 | Assignment 1: continuous speech recognition
 | Thu 6 February | 4 | | Lab session (17:00) |
8 | Mon 10 February | 5 | Renals | Lexicon and language model (slides) | J&M, Ch 4; Manning & Schutze, Ch 6
 | Mon 10 February | 5 | | Lab session (17:00) |
9 | Thu 13 February | 5 | Shimodaira | Search and decoding (slides) | Aubert (2002), An overview of decoding techniques for large vocabulary continuous speech recognition
 | Thu 13 February | 5 | | Lab session (17:00) |
 | Mon 17 February | 6 | | No Lecture - Innovative Learning Week |
 | Thu 20 February | 6 | | No Lecture - Innovative Learning Week |
10 | Mon 24 February | 7 | Renals | Intro to neural networks (slides) | Multi-layer neural networks; Morgan & Bourlard (1995), Continuous speech recognition: An introduction to the hybrid HMM/connectionist approach
 | Mon 24 February | 7 | | Lab session (17:30) |
 | Wed 26 February | 7 | | Assignment 1 Deadline (16:00) |
11 | Thu 27 February | 7 | Renals | (Deep) neural network acoustic models (slides) | Hinton et al (2012), Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups
 | Mon 3 March | 8 | Renals | Introduction to Assignment 2 | Assignment 2: literature review
12 | Thu 6 March | 8 | Renals | Neural network language models (slides) | Bengio et al (2006), Neural probabilistic language models (Secs 6.1, 6.2, 6.3, 6.7, 6.8); Mikolov et al (2011), Extensions of recurrent neural network language model
13 | Mon 10 March | 9 | Renals | Speaker adaptation 1 (slides) | G&Y review, sec. 5; Woodland (2001), Speaker adaptation for continuous density HMMs: A review
14 | Thu 13 March | 9 | Renals | Speaker adaptation 2 |
15 | Mon 17 March | 10 | Renals | Discriminative training of GMM-based systems (slides) | Young (2008), sec 27.3.1
 | Wed 19 March | 10 | | Assignment 2 Deadline (16:00) |
16 | Thu 20 March | 10 | | Case study: transcribing TED talks (slides) |

Schedule

Closer to the exam, we will be happy to arrange a revision lecture at a time convenient to everyone. Its purpose will be to answer and discuss any questions about the course.

Coursework

There are two pieces of coursework.

  1. Assignment 1: continuous speech recognition - monophone and triphone models. The coursework will involve training and testing a continuous speech recognition system using the HTK software. We'll use the WSJCAM0 database (British English recordings of speakers reading the Wall Street Journal sentences).
    Released: Monday 3 February 2014
    Deadline: Wednesday 26 February 2014, 16:00
    Feedback: Wednesday 12 March 2014
    Report templates: Q and A
  2. Assignment 2: literature review. The key papers are:
    Released: Monday 3 March 2014
    Deadline: Wednesday 19 March 2014, 16:00
    Feedback: Wednesday 2 April 2014
    Report templates:

 
 
 



Informatics Forum, 10 Crichton Street, Edinburgh, EH8 9AB, Scotland, UK
Tel: +44 131 651 5661, Fax: +44 131 651 1426, E-mail: school-office@inf.ed.ac.uk
Unless explicitly stated otherwise, all material is copyright © The University of Edinburgh