Lecture 13 - Decoding and WFSTs

In this lecture we discussed search and decoding, with a particular focus on the weighted finite state transducer (WFST) framework.

Search and decoding

The search problem in ASR: finding the most likely word sequence, given the observed acoustics
Viterbi decoding applied to connected word recognition, using a bigram language model
Computational issues in search and approaches used to address them (e.g. beam search)
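
As a rough illustration of the points above, here is a minimal sketch of Viterbi decoding with beam pruning over a toy HMM. The model (states x and y, symbols a, b, c, and all probabilities) is invented for illustration and is not taken from the lecture; a real connected-word decoder would run the same recursion over word-internal HMM states, with bigram language model probabilities on the transitions between words.

```python
import math

def to_log(table):
    """Convert a dict of probabilities to log probabilities."""
    return {k: math.log(v) for k, v in table.items()}

def prune(hyps, beam):
    """Keep only hypotheses scoring within `beam` of the current best."""
    if not hyps:
        return hyps
    best = max(score for score, _ in hyps.values())
    return {s: v for s, v in hyps.items() if v[0] >= best - beam}

def viterbi_beam(obs, states, log_init, log_trans, log_emit, beam=math.inf):
    """Return (best log score, best state path); beam=inf is exact Viterbi."""
    # active maps each surviving state to (log score, best path ending there)
    active = {}
    for s in states:
        score = log_init.get(s, -math.inf) + log_emit[s].get(obs[0], -math.inf)
        if score > -math.inf:
            active[s] = (score, [s])
    active = prune(active, beam)
    for o in obs[1:]:
        new = {}
        for prev, (score, path) in active.items():
            for nxt, lt in log_trans.get(prev, {}).items():
                cand = score + lt + log_emit[nxt].get(o, -math.inf)
                # Viterbi recursion: keep only the best path into each state
                if cand > -math.inf and (nxt not in new or cand > new[nxt][0]):
                    new[nxt] = (cand, path + [nxt])
        active = prune(new, beam)
    if not active:
        return (-math.inf, [])
    return max(active.values(), key=lambda v: v[0])

# Toy model (hypothetical numbers, for illustration only)
STATES = ["x", "y"]
LOG_INIT = to_log({"x": 0.6, "y": 0.4})
LOG_TRANS = {"x": to_log({"x": 0.7, "y": 0.3}),
             "y": to_log({"x": 0.4, "y": 0.6})}
LOG_EMIT = {"x": to_log({"a": 0.5, "b": 0.4, "c": 0.1}),
            "y": to_log({"a": 0.1, "b": 0.3, "c": 0.6})}

score, path = viterbi_beam(["a", "b", "c"], STATES, LOG_INIT, LOG_TRANS, LOG_EMIT)
print(path, math.exp(score))
```

On this toy model the best path is x, x, y with probability 0.01512. A narrower beam reduces the number of active hypotheses per frame, at the risk of pruning the path that would have turned out best overall.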

WFSTs

Defining WFSTs: states connected by transitions, each with an input label, an output label, and a weight - examples for a language model and a pronunciation lexicon
Algorithms on WFSTs: Composition, Determinisation, and Minimisation
Applying WFSTs to speech recognition - HCLG, the composition of the grammar (G), lexicon (L), context-dependence (C), and HMM (H) transducers
The combined HCLG transducer gives a complete search graph for an ASR system - naive composition can blow up in size, so determinisation and minimisation need to be applied multiple times during the composition, in a careful order
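
To make the composition step concrete, the following is a minimal sketch of WFST composition in the tropical semiring (weights are costs and add along a path). It assumes both transducers are epsilon-free, which a real pronunciation lexicon is not, so it is a simplification and not how a toolkit builds HCLG. The dictionary representation, the toy "lexicon" L (one sub-word unit per word) and the toy grammar G are all hypothetical.

```python
from collections import deque

def compose(a, b):
    """Compose epsilon-free WFSTs a and b (tropical semiring: weights add).

    Each WFST is a dict with "start", "finals" (state -> final weight) and
    "arcs" (list of (src, in_label, out_label, weight, dst) tuples).
    """
    start = (a["start"], b["start"])
    arcs, finals = [], {}
    queue, seen = deque([start]), {start}
    while queue:
        state = queue.popleft()
        qa, qb = state
        # A pair state is final only if both component states are final
        if qa in a["finals"] and qb in b["finals"]:
            finals[state] = a["finals"][qa] + b["finals"][qb]
        for (sa, i, mid, wa, da) in a["arcs"]:
            if sa != qa:
                continue
            for (sb, mid2, o, wb, db) in b["arcs"]:
                # Match the output label of a with the input label of b
                if sb != qb or mid2 != mid:
                    continue
                dst = (da, db)
                arcs.append((state, i, o, wa + wb, dst))
                if dst not in seen:
                    seen.add(dst)
                    queue.append(dst)
    return {"start": start, "finals": finals, "arcs": arcs}

# Toy lexicon: maps sub-word units to words; "ya" is a costlier variant of "yes"
L = {"start": 0, "finals": {0: 0.0},
     "arcs": [(0, "y", "yes", 0.0, 0),
              (0, "ya", "yes", 0.5, 0),
              (0, "n", "no", 0.0, 0)]}

# Toy grammar: a weighted acceptor over words, written as a transducer
G = {"start": 0, "finals": {1: 0.0},
     "arcs": [(0, "yes", "yes", 1.0, 1),
              (0, "no", "no", 2.0, 1),
              (1, "yes", "yes", 0.5, 1)]}

LG = compose(L, G)
for arc in LG["arcs"]:
    print(arc)
```

Only state pairs reachable from the start pair are created, which is why composed graphs are usually built lazily from the start state. Even so, the result can contain many redundant paths, which is where determinisation and minimisation come in.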

This page maintained by Steve Renals.
Last updated: 2017/03/19 12:41:55UTC

