Lecture 13 - Decoding and WFSTs
In this lecture we discussed search and decoding and, in particular, the weighted finite state transducer (WFST) framework.
Search and decoding
- The search problem in ASR: finding the most likely word sequence, given the observed acoustics
- Viterbi decoding applied to connected word recognition, using a bigram language model
- Computational issues in search and approaches used to address them (e.g. beam search)
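As a rough illustration of the points above, the following is a minimal sketch of time-synchronous Viterbi search with beam pruning, working in the log domain. It is not the lecture's own code: the function name `viterbi_beam`, the dictionary-based model layout, and the two-state toy model are all assumptions made for this example. In connected word recognition the states would be the HMM states of the word models, and the transitions from word-end to word-start states would carry the bigram language model probabilities; here everything is folded into one generic transition table.

```python
import math

NEG_INF = float("-inf")


def viterbi_beam(log_init, log_trans, log_emit, beam=math.inf):
    """Viterbi search with beam pruning (illustrative sketch).

    log_init:  dict state -> log P(state at the first frame)
    log_trans: dict state -> dict next_state -> log P(next_state | state)
    log_emit:  list over frames of dict state -> log p(frame | state)
    beam:      hypotheses more than `beam` below the best score are dropped
    Returns (best log score, best state sequence).
    """
    # Each active hypothesis keeps its score and its full path so far
    # (a real decoder would store backpointers instead of whole paths).
    active = {}
    for state, lp in log_init.items():
        score = lp + log_emit[0].get(state, NEG_INF)
        if score > NEG_INF:
            active[state] = (score, (state,))

    for frame in log_emit[1:]:
        if not active:
            break
        best = max(score for score, _ in active.values())
        new_active = {}
        for state, (score, path) in active.items():
            if score < best - beam:
                continue  # beam pruning: do not extend this hypothesis
            for nxt, lt in log_trans.get(state, {}).items():
                cand = score + lt + frame.get(nxt, NEG_INF)
                # Viterbi recursion: keep only the best path into each state
                if cand > new_active.get(nxt, (NEG_INF, ()))[0]:
                    new_active[nxt] = (cand, path + (nxt,))
        active = new_active

    if not active:
        return NEG_INF, []
    score, path = max(active.values(), key=lambda hyp: hyp[0])
    return score, list(path)


# Hypothetical two-state toy model with three frames of observations.
log_init = {"a": math.log(0.6), "b": math.log(0.4)}
log_trans = {
    "a": {"a": math.log(0.7), "b": math.log(0.3)},
    "b": {"a": math.log(0.4), "b": math.log(0.6)},
}
log_emit = [
    {"a": math.log(0.9), "b": math.log(0.2)},
    {"a": math.log(0.1), "b": math.log(0.8)},
    {"a": math.log(0.1), "b": math.log(0.8)},
]

full_score, full_path = viterbi_beam(log_init, log_trans, log_emit)
pruned_score, pruned_path = viterbi_beam(log_init, log_trans, log_emit, beam=1.0)
print(full_path, pruned_path)
```

With an infinite beam this is exact Viterbi decoding. A finite beam reduces the number of hypotheses extended at each frame, at the risk of pruning the path that would eventually have been the best one.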
WFSTs
- Defining WFSTs: states connected by transitions with input label, output label, and weight - examples for language model and pronunciation lexicon
- Algorithms on WFSTs: Composition, Determinisation, and Minimisation
- Applying WFSTs to speech recognition - HCLG, the composition of grammar (G), lexicon (L), context-dependence (C), and HMM (H) transducers
- The combined HCLG transducer gives a complete search graph for an ASR system. Naive composition can blow up in size, so determinisation and minimisation need to be applied multiple times during the composition, in a careful order
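To make the composition operation more concrete, here is a minimal sketch of composing two WFSTs, with weights treated as negative log probabilities that add along a path (the tropical semiring). It rests on several assumptions that are not from the lecture: the dictionary representation, the function names `compose` and `best_output`, and the toy transducers are invented for illustration, and both transducers are assumed to be epsilon-free. A real pronunciation lexicon maps several phones to one word and so needs epsilon labels, which this sketch does not handle, and it performs no determinisation or minimisation.

```python
from collections import deque


def compose(a, b):
    """Compose two epsilon-free WFSTs (illustrative sketch).

    Each WFST is a dict with keys:
      "start":  the start state
      "finals": dict state -> final weight
      "arcs":   dict state -> list of (ilabel, olabel, weight, next_state)
    """
    start = (a["start"], b["start"])
    result = {"start": start, "finals": {}, "arcs": {}}
    queue = deque([start])
    seen = {start}
    while queue:
        state = queue.popleft()
        qa, qb = state
        # A pair state is final only if both component states are final.
        if qa in a["finals"] and qb in b["finals"]:
            result["finals"][state] = a["finals"][qa] + b["finals"][qb]
        arcs = result["arcs"].setdefault(state, [])
        for ia, oa, wa, na in a["arcs"].get(qa, []):
            for ib, ob, wb, nb in b["arcs"].get(qb, []):
                if oa == ib:  # output of a must match input of b
                    nxt = (na, nb)
                    arcs.append((ia, ob, wa + wb, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append(nxt)
    return result


def best_output(fst, inputs):
    """Return the lowest-cost (weight, outputs) for an input sequence, or None."""
    hyps = {fst["start"]: (0.0, ())}
    for label in inputs:
        new_hyps = {}
        for state, (w, out) in hyps.items():
            for il, ol, aw, nxt in fst["arcs"].get(state, []):
                if il == label:
                    cand = (w + aw, out + (ol,))
                    if nxt not in new_hyps or cand[0] < new_hyps[nxt][0]:
                        new_hyps[nxt] = cand
        hyps = new_hyps
    finals = [(w + fst["finals"][s], out)
              for s, (w, out) in hyps.items() if s in fst["finals"]]
    return min(finals) if finals else None


# Toy stand-in for a lexicon: one made-up input unit per word, with the
# homophones "red" and "read" sharing the same input label.
lexicon = {
    "start": 0,
    "finals": {0: 0.0},
    "arcs": {0: [("k-ae-t", "cat", 0.0, 0), ("d-ao-g", "dog", 0.0, 0),
                 ("r-eh-d", "red", 0.5, 0), ("r-eh-d", "read", 1.0, 0)]},
}
# Toy grammar: an acceptor over words (input label equals output label).
grammar = {
    "start": 0,
    "finals": {3: 0.0},
    "arcs": {0: [("red", "red", 1.0, 1), ("read", "read", 2.0, 2)],
             1: [("dog", "dog", 0.5, 3), ("cat", "cat", 0.5, 3)],
             2: [("dog", "dog", 3.0, 3)]},
}

lg = compose(lexicon, grammar)
print(best_output(lg, ["r-eh-d", "d-ao-g"]))
```

In the composed transducer the grammar weights resolve the ambiguity between the two homophones, since each path carries the sum of the lexicon and grammar weights. Only pair states reachable from the start state are created, which keeps the result smaller than the full product of the two state sets.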
This page maintained by Steve Renals.
Last updated: 2017/03/19 12:41:55UTC