ASR 2018-19
Lecture 12 - Lattice-free MMI training
This was a guest lecture by Peter Bell, explaining the lattice-free MMI discriminative training approach, which is currently state-of-the-art. There are no good tutorial papers on this; probably the paper to read is Hadian et al (2018), Flat-start single-stage discriminatively trained HMM-based models for ASR.
MMI Training
- Motivation:
Maximum likelihood (ML) training is theoretically optimal (even for classification), but only when the model is correct. If the model is incorrect, then an explicitly discriminative training criterion may be better.
- Example of the deficiencies of maximum likelihood:
Consider the classification problem of slides 9-12. This can be well modelled using full-covariance Gaussians for each class, trained by ML. However, diagonal-covariance Gaussians trained by ML give decision boundaries that result in classification errors; training the same diagonal-covariance Gaussians using MMI results in better decision boundaries with fewer classification errors.
- MMI training:
As explained in the last lecture, MMI involves computing the clamped numerator term and the free denominator term, using the forward-backward algorithm to estimate state occupation probabilities. In practice these terms are computed using lattices. Lattice generation is expensive, since it involves a recognition run.
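Written out (a standard formulation, with u indexing training utterances, w_u the reference transcription, and M_w the composite HMM for word sequence w), the MMI criterion is the log posterior of the correct transcription:

```latex
\mathcal{F}_{\mathrm{MMI}} = \sum_{u} \log
  \frac{p(\mathbf{O}_{u} \mid \mathcal{M}_{w_{u}})\, P(w_{u})}
       {\sum_{w'} p(\mathbf{O}_{u} \mid \mathcal{M}_{w'})\, P(w')}
```

The numerator is the clamped term (acoustics aligned to the correct transcription); the denominator is the free term, a sum over all word sequences, which conventional MMI approximates with a lattice.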
Lattice-free MMI
- Overview:
LF-MMI enables sequence-level HMM state posteriors to be estimated using a DNN acoustic model, without first generating lattices.
- Key aspects of LF-MMI:
- Represent state sequences for numerator and denominator as HCLG WFSTs
- Parallelise computation on GPU
- Use a 4-gram phone LM (rather than a word LM) in the denominator
- Reduced frame rate, simpler context-dependent phone HMM topology (single state)
- Regularize using multi-task training (simultaneously optimise the sequence-level MMI objective and a frame-wise cross-entropy objective)
- LF-MMI in practice:
LF-MMI offers both increased accuracy and faster training for HMM/TDNN systems, compared with both cross-entropy (frame-wise) training and lattice-based sequence training.
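To make the numerator/denominator computation concrete, here is a toy numpy sketch (not Kaldi's implementation) of an MMI-style objective: the numerator graph is a left-to-right chain over the reference state sequence with self-loops, and the denominator graph is a fully connected graph over all states with uniform transitions, standing in for the 4-gram phone LM used in real LF-MMI. All function names and the graph weights are illustrative assumptions.

```python
import numpy as np

def logsumexp(a, axis=None):
    """Numerically stable log-sum-exp; returns -inf for all-(-inf) input."""
    if axis is None:
        m = float(np.max(a))
        if not np.isfinite(m):
            m = 0.0
        with np.errstate(divide="ignore"):
            return m + float(np.log(np.sum(np.exp(a - m))))
    m = np.max(a, axis=axis, keepdims=True)
    m_safe = np.where(np.isfinite(m), m, 0.0)
    with np.errstate(divide="ignore"):
        out = m_safe + np.log(np.sum(np.exp(a - m_safe), axis=axis, keepdims=True))
    return np.squeeze(out, axis=axis)

def forward_score(frame_ll, log_trans, log_init, log_final):
    """Forward algorithm: total log-probability of all paths through a graph.

    frame_ll: (T, S) per-frame log-likelihoods for the graph's S states.
    """
    alpha = log_init + frame_ll[0]
    for t in range(1, frame_ll.shape[0]):
        alpha = logsumexp(alpha[:, None] + log_trans, axis=0) + frame_ll[t]
    return logsumexp(alpha + log_final)

def lfmmi_objective(loglikes, transcript):
    """MMI objective: clamped (numerator) minus free (denominator) log-score."""
    T, S = loglikes.shape
    N = len(transcript)
    # Numerator graph: chain over the transcript states, self-loop or
    # advance with probability 0.5 each; must end in the final position.
    num_ll = loglikes[:, transcript]                      # (T, N)
    num_trans = np.full((N, N), -np.inf)
    for i in range(N):
        num_trans[i, i] = np.log(0.5)
        if i + 1 < N:
            num_trans[i, i + 1] = np.log(0.5)
    num_init = np.full(N, -np.inf); num_init[0] = 0.0
    num_final = np.full(N, -np.inf); num_final[-1] = 0.0
    num = forward_score(num_ll, num_trans, num_init, num_final)
    # Denominator graph: ergodic over all S states with uniform
    # transitions (a stand-in for the phone LM used in practice).
    den_trans = np.full((S, S), -np.log(S))
    den_init = np.full(S, -np.log(S))
    den_final = np.zeros(S)
    den = forward_score(loglikes, den_trans, den_init, den_final)
    return num - den
```

Making the acoustic log-likelihoods favour the reference alignment increases the objective relative to uninformative (flat) log-likelihoods, which is the discriminative behaviour MMI training exploits; real LF-MMI additionally batches this forward computation over both graphs on the GPU.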
Copyright (c) University of Edinburgh 2015-2019
The ASR course material is licensed under the
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License
licence.txt
This page maintained by Steve Renals.
Last updated: 2019/04/26 17:27:18UTC