Lecture 13 - Multilingual speech recognition
This lecture covered multilingual and low-resource speech recognition - how to develop speech recognition systems for a diversity of languages, most of which have very limited resources.
- Introduction to the problem of multilingual speech technology: the scale of the problem (the many languages in the world) and the low level of linguistic resources available for most of them.
- Multilingual and cross-lingual acoustic models: using shared hidden layers to learn multilingual representations.
  - Hat swap: sequentially train a network on different languages, swapping in a language-specific output layer for each language; the hidden layers are thus trained multilingually, while the output layer remains language-specific.
  - Block softmax: train a network with multiple output layers (one per language) but shared hidden layers (a "parallel hat swap"); for each training example, only the output layer of that example's language is used and updated.
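As a sketch of the block softmax idea (layer sizes, weights, and the two-language setup here are illustrative, not from the lecture), the shared hidden layer is applied to every example, and only the softmax "block" for the current example's language is consulted:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Shared hidden layer (sizes are illustrative: e.g. a 39-dim acoustic frame).
n_in, n_hid = 39, 64
W_shared = rng.normal(scale=0.1, size=(n_hid, n_in))

# One output layer ("block") per language, each with its own phone-set size.
heads = {
    "english": rng.normal(scale=0.1, size=(45, n_hid)),
    "german":  rng.normal(scale=0.1, size=(50, n_hid)),
}

def forward(x, language):
    """Shared hidden layer, then only this language's softmax block."""
    h = np.tanh(W_shared @ x)
    return softmax(heads[language] @ h)

x = rng.normal(size=n_in)
p_en = forward(x, "english")   # distribution over the 45 English outputs
p_de = forward(x, "german")    # distribution over the 50 German outputs
```

At training time, the gradient from an English example would update only `heads["english"]` together with `W_shared`; hat swap is the sequential analogue, in which one head at a time is attached to the shared layers.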
  - Multilingual bottleneck features: train a neural network with a shared, narrow (bottleneck) hidden layer and language-specific output layers; the bottleneck activations are then used as additional features in GMM-based or neural network systems.
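A minimal sketch of bottleneck feature extraction (hypothetical layer sizes and weights): after multilingual training, the language-specific output layers are discarded and the narrow layer's activations are appended to the original acoustic features:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative sizes: a wide hidden layer feeding a narrow bottleneck.
n_in, n_hid, n_bn = 39, 128, 40
W1 = rng.normal(scale=0.1, size=(n_hid, n_in))
W_bn = rng.normal(scale=0.1, size=(n_bn, n_hid))
# Language-specific output layers would sit above the bottleneck during
# multilingual training; they are not needed at feature-extraction time.

def bottleneck_features(x):
    """Forward only to the bottleneck layer; its activations are the features."""
    h = np.tanh(W1 @ x)
    return np.tanh(W_bn @ h)

frame = rng.normal(size=n_in)            # one acoustic frame
bn = bottleneck_features(frame)          # 40-dim multilingual feature vector
augmented = np.concatenate([frame, bn])  # appended to the original features
```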
- Grapheme-based lexicons: using graphemes (letters) rather than phones as subword units, which avoids the need for a hand-crafted pronunciation lexicon.
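The appeal of a grapheme-based lexicon is that it can be generated mechanically from the word list alone. A toy sketch (the word list is an arbitrary illustration):

```python
# Toy grapheme-based lexicon: each word's "pronunciation" is simply its
# letter sequence, so no hand-crafted phonetic dictionary is needed.
def grapheme_lexicon(words):
    return {w: list(w.lower()) for w in words}

lex = grapheme_lexicon(["speech", "recognition"])
# lex["speech"] -> ['s', 'p', 'e', 'e', 'c', 'h']
```

The trade-off, of course, is that grapheme units model languages with irregular spelling-to-sound mappings less directly than phones do.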
- Morphs and words: using morphs (subword units) rather than whole words as the units of recognition.
- Morphologically rich languages and vocabulary size: in highly inflected or agglutinative languages the number of distinct word forms is very large, so word-based vocabularies grow rapidly and out-of-vocabulary rates are high.
- Morfessor: an unsupervised, data-driven method for the segmentation of words into morpheme-like units.
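Morfessor itself learns its morph inventory by minimising an MDL-style cost; the sketch below is not that algorithm, only a toy illustration of the downstream step: given morph probabilities (the counts here are hypothetical), find the lowest-cost segmentation of a word by dynamic programming over split points.

```python
import math
from functools import lru_cache

# Hypothetical morph counts, as might be induced from a corpus; the real
# Morfessor learns this inventory itself rather than taking it as given.
morph_counts = {"talo": 10, "ssa": 8, "issa": 5, "kin": 4, "i": 6}
total = sum(morph_counts.values())

def cost(morph):
    # -log probability of a known morph; unknown morphs are disallowed here.
    return -math.log(morph_counts[morph] / total) if morph in morph_counts else math.inf

def segment(word):
    """Lowest-cost segmentation of `word` into known morphs."""
    @lru_cache(maxsize=None)
    def best(i):
        if i == len(word):
            return (0.0, [])
        options = []
        for j in range(i + 1, len(word) + 1):
            c = cost(word[i:j])
            if c < math.inf:
                sub_cost, sub_seg = best(j)
                options.append((c + sub_cost, [word[i:j]] + sub_seg))
        return min(options, default=(math.inf, []))
    return best(0)[1]

print(segment("talossa"))   # -> ['talo', 'ssa']
```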
- Morph-based language models: language models built over morphs rather than words, keeping the vocabulary manageable for morphologically rich languages.
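A toy sketch of why morph-based language models help (the corpus and its segmentations are invented for illustration): many inflected word forms collapse onto a few morphs, so the vocabulary shrinks and n-gram statistics are shared across related forms.

```python
from collections import Counter
from itertools import chain

# Toy corpus: the same sentences as word forms and as morph sequences
# (segmentations are illustrative, e.g. as Morfessor might produce).
word_corpus = [
    ["talo", "talossa", "taloissa"],
    ["talossakin", "taloissakin"],
]
morph_corpus = [
    ["talo", "talo", "ssa", "talo", "issa"],
    ["talo", "ssa", "kin", "talo", "issa", "kin"],
]

word_vocab = set(chain.from_iterable(word_corpus))    # 5 distinct word forms
morph_vocab = set(chain.from_iterable(morph_corpus))  # only 4 distinct morphs

# Bigram counts over morphs; a real morph LM would also apply smoothing.
bigrams = Counter()
for sent in morph_corpus:
    for a, b in zip(sent, sent[1:]):
        bigrams[(a, b)] += 1

# Statistics are shared across word forms: ('talo', 'issa') occurs twice,
# once inside "taloissa" and once inside "taloissakin".
print(len(word_vocab), len(morph_vocab), bigrams[("talo", "issa")])
```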
Copyright (c) University of Edinburgh 2015-2018
The ASR course material is licensed under the
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License
This page maintained by Steve Renals.
Last updated: 2018/04/30 20:59:23UTC