Lecture 10 - Lexicon and Language Model
The ASR course focused on acoustic modelling rather than language modelling, largely because most modern language modelling techniques are covered in the Natural Language Understanding and Machine Translation courses. However, in this lecture we give an overview of issues relating to pronunciation modelling and the ASR lexicon, as well as the basics of language modelling.
- Pronunciation dictionaries - which connect words to the phone sequences that correspond to their pronunciation - are often written by human experts.
- An important issue is selecting which words go in the dictionary - the coverage of a dictionary is measured by the out-of-vocabulary (OOV) rate, the percentage of word tokens in a test set which do not have pronunciations in the pronunciation dictionary.
- Each OOV word in a test set results in 1.5-2 errors, since an unrecognisable word also tends to disrupt the recognition of its neighbours.
- The lexicon can be more complex in the many languages that are morphologically richer than English - compounding and inflection can greatly increase the vocabulary size.
- Although many words can be pronounced in multiple ways, state-of-the-art systems tend not to have a high rate of multiple pronunciations, averaging about 1.1 pronunciations per word - there is a tradeoff between flexibility and confusion.
- Consistency vs fidelity: it is most important to have a systematic set of pronunciations, even if some alternate pronunciations are not included. It then becomes the job of the acoustic model to learn the acoustic variability arising from different pronunciations.
- A quick overview of n-gram language modelling: using n-grams (word probabilities given the preceding n-1 words) to model word sequences; the zero probability problem and the need to smooth or discount raw relative frequencies obtained from counting, in order to reliably estimate n-gram probabilities.
- Basic introduction to neural network language models.
- Note that n-grams are used in deployed ASR systems, not least for efficiency reasons.
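Two of the quantities summarised above, the OOV rate and a smoothed n-gram probability, can be illustrated with a minimal Python sketch. This is not the course's own code: the function names `oov_rate` and `train_bigram_add_one` and the toy sentences are hypothetical, and add-one (Laplace) smoothing is used only because it is the simplest way to avoid zero probabilities. Practical systems generally use more refined discounting schemes.

```python
from collections import Counter


def oov_rate(test_tokens, vocabulary):
    """Percentage of test-set word tokens that are not in the vocabulary."""
    if not test_tokens:
        return 0.0
    missing = sum(1 for tok in test_tokens if tok not in vocabulary)
    return 100.0 * missing / len(test_tokens)


def train_bigram_add_one(sentences):
    """Estimate P(word | prev) from counts, with add-one smoothing.

    Add-one smoothing is an illustrative assumption, chosen for brevity.
    """
    history_counts = Counter()  # how often each word occurs as a history
    bigram_counts = Counter()   # how often each (prev, word) pair occurs
    vocab = set()
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]  # sentence boundary markers
        vocab.update(tokens)
        for prev, word in zip(tokens, tokens[1:]):
            history_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    vocab_size = len(vocab)

    def prob(word, prev):
        # Unseen bigrams get a small non-zero probability instead of zero.
        return (bigram_counts[(prev, word)] + 1) / (history_counts[prev] + vocab_size)

    return prob, vocab


# Hypothetical toy data, for illustration only.
train = [["the", "cat", "sat"], ["the", "dog", "sat"]]
prob, vocab = train_bigram_add_one(train)

print(oov_rate(["the", "cat", "ran", "home"], vocab))  # two of four tokens are OOV
print(prob("cat", "the"))  # seen bigram
print(prob("sat", "the"))  # unseen bigram, still non-zero
```

With raw relative frequencies the unseen bigram "the sat" would receive probability zero. Smoothing moves a little probability mass from seen to unseen events while keeping each conditional distribution normalised.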
Copyright (c) University of Edinburgh 2015-2018
The ASR course material is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License (licence.txt).
This page maintained by Steve Renals.
Last updated: 2018/04/30 20:59:23 UTC