ASR 2018-19

Lecture 15 - End-to-end systems 1: CTC

End-to-end systems learn to map directly from an input sequence X to an output sequence Y. Sequence-trained HMMs (using either a maximum likelihood or a discriminative objective function) are a kind of end-to-end system, involving a sequence-trained acoustic model and a language model. However, the term usually refers to neural network approaches which map input sequences directly to output sequences, in the purest case without a separate language model or lexicon. There are two main approaches: systems based on connectionist temporal classification (CTC) and encoder-decoder systems.

In this lecture we looked at CTC-based systems; we look at encoder-decoder systems in the next lecture. Probably the best reading on this is the paper about the Deep Speech system by Hannun et al. For an in-depth tutorial about CTC, look at the Distill article, also by Hannun.
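
As a rough illustration of how a CTC-based system turns frame-level outputs into an output sequence, the sketch below shows the standard CTC collapsing step (merge repeated labels, then remove blanks) together with a simple best-path decoder. It is a minimal sketch rather than the lecture's own code: the blank symbol "_", the function names ctc_collapse and greedy_decode, and the toy scores are all assumptions made for this example.

```python
from itertools import groupby

BLANK = "_"  # assumed blank symbol for this sketch; real systems usually reserve an index


def ctc_collapse(path, blank=BLANK):
    """Collapse a frame-level label path: merge repeated labels, then drop blanks."""
    merged = [label for label, _ in groupby(path)]  # "hh_e" -> h, _, e
    return [label for label in merged if label != blank]


def greedy_decode(frame_scores, labels, blank=BLANK):
    """Best-path decoding: take the top-scoring label in each frame, then collapse.

    frame_scores: one list of per-label scores for each frame (hypothetical network output).
    labels: the label for each score position, including the blank.
    """
    best_path = []
    for scores in frame_scores:
        best_index = max(range(len(scores)), key=scores.__getitem__)
        best_path.append(labels[best_index])
    return ctc_collapse(best_path, blank)


if __name__ == "__main__":
    # A blank between the two "l" runs keeps the double letter after collapsing.
    print("".join(ctc_collapse(list("hh_e_ll_ll_oo"))))

    toy_labels = [BLANK, "a", "b"]
    toy_scores = [
        [0.1, 0.8, 0.1],
        [0.2, 0.7, 0.1],
        [0.9, 0.05, 0.05],
        [0.1, 0.8, 0.1],
        [0.1, 0.2, 0.7],
    ]
    print(greedy_decode(toy_scores, toy_labels))
```

Best-path decoding is only an approximation, since CTC defines the probability of an output sequence as a sum over all frame-level paths that collapse to it; it is used here because it is the simplest way to show the role of the blank symbol.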

CTC

Deep Speech


Copyright (c) University of Edinburgh 2015-2019
The ASR course material is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License
This page maintained by Steve Renals.
Last updated: 2019/04/24 17:11:26UTC

