================ General =========================

Q: Which measure should I use, Accuracy or WER?
A: Either is fine (they carry the same information, since
   WER = 100% - Accuracy), but use one and do not mix them.

Q: How is Accuracy calculated?
A: See Section 17.19.1 of the HTK manual.

Q: I got the message "No token survived".
A: This means HVite failed to find the best path, so it gives no
   recognition output for that input. The problem is that HResults then
   leaves that utterance out of its analysis, so the statistics it reports
   are not accurate. This does not necessarily mean you should discard the
   recognition experiment: it is acceptable if the number of such errors
   is very small (e.g. 1), as long as you understand the risk described
   below.

   It is a good idea to know how much influence the error can have on the
   word accuracy. The following is a simulation. We have two test data
   sets, "si_dt5a.scp" (368 utterances) and "si_dt5a-div3.scp" (123
   utterances), the latter of which is called "small". Assume that N
   utterances were not recognised because of the error and that the rest
   were recognised without errors. HResults will then report an accuracy
   of 100%. The actual (worst-case) accuracy, when the unrecognised
   utterances are taken into account, is as follows:

       test set \ N  |    1      5     10
       ------------------------------------
       si_dt5a       |  99.7   98.6   97.3
       si_dt5a-div3  |  99.2   95.9   91.9
       ------------------------------------

   (Here I assume that all utterances have the same length, measured in
   number of words.)

Q: I got an error message like "models/R9/hmm11/MODELS" already exists.
A: This happens because your training script tried to overwrite an
   existing model. As a result, training did not finish. There are two
   ways to resolve this:
   Option A: Remove the existing model and its directory.
   Option B: Rename the directory of either the existing model or the new
             one.

Q: How can I know the likelihood on the training data?
A: You can find it in the log file of the corresponding training script.
   Log files are kept in the "logs" directory of your work directory. For
   example, in logs/train/R1/log-5 you will find a line like this:
   "Reestimation complete - average log prob per frame = -1.058707e+02"

Q: How can I try different feature vectors?
A1: Generally speaking, you need to calculate new feature vectors from the
   speech wave files (see A2). However, you do not need to do so if you
   only want to try MFCC features without delta features (e.g. MFCC_0),
   because they are a subset of the original MFCC features (i.e.
   MFCC_0_D_A_Z) provided in the course directory. In that case, you need
   to
   (1) create a new work directory for the new experiment,
   (2) modify the prototype files (in the "proto" directory) [*2], and
   (3) modify configs/config.basic to reflect the MFCC features you want
       to use [*1].

   [*1] You cannot edit the original files in the "proto" or "configs"
        directories, because they are symbolic links to files that you do
        not own. An easy solution is to rename the directories to
        "proto.org" and "configs.org", create new "proto" and "configs"
        directories, and copy the original files into them.
   [*2] There are two options for modifying the prototype files in
        "proto":
        (i)  edit each prototype file manually with a text editor, or
        (ii) use "MakeProtoHMMSet" in the "scripts" directory to create
             new prototypes automatically, if you understand how to use
             the command.

A2: Creating new feature vectors from wave files:
   The sample scripts that were used to obtain the current data can be
   found in the Org/scripts directory:
       run_wave2featurevectors.sh
       wave2featurevectors.sh
   In addition to the changes you need to make to the files above, you
   need to modify the corresponding configuration files in the "configs"
   directory. Once you have obtained the new feature vectors, follow the
   instructions in A1 above.
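The worst-case accuracies quoted in the "No token survived" answer can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of the course scripts: the function name worst_case_accuracy is made up for illustration, and it relies on the same assumption as that answer, namely that every utterance contains the same number of words and that all recognised utterances are error-free.

```python
def worst_case_accuracy(total_utts: int, failed_utts: int) -> float:
    """Worst-case word accuracy (%) when `failed_utts` utterances give no
    recognition output and all other utterances are recognised perfectly.

    Assumption: every utterance has the same number of words, so the share
    of missing words equals the share of failed utterances.
    """
    if total_utts <= 0 or not 0 <= failed_utts <= total_utts:
        raise ValueError("need 0 <= failed_utts <= total_utts and total_utts > 0")
    return 100.0 * (total_utts - failed_utts) / total_utts


if __name__ == "__main__":
    # Test-set sizes taken from the FAQ answer above.
    test_sets = {"si_dt5a": 368, "si_dt5a-div3": 123}
    for name, size in test_sets.items():
        row = "  ".join(f"{worst_case_accuracy(size, n):5.1f}" for n in (1, 5, 10))
        print(f"{name:<14}| {row}")
```

For instance, worst_case_accuracy(368, 1) is about 99.7, while HResults itself would report 100% because the failed utterance is left out of its analysis. The smaller the test set, the larger the effect of each failed utterance.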
================= Monophone models =======================

================= Triphone models =======================

Q: How can I obtain the number of clusters?
A: See logs/R9/log-hhed

Q: How can I change the number of tied-state triphone models?
A: You cannot specify the number directly, but you can control it with the
   "TB" value. See Section 17.8.1 of the HTK manual for details.

Q: What range of TB should I try?
A: One extreme is a TB value that gives roughly the same number of
   clusters as the monophone models have. (NB: clustering is done
   state-wise rather than phone-model-wise.)

Q: In decision tree-based tied-state triphone models, how can I write a
   script that carries out sets of experiments (clustering, training, and
   recognition) for different clustering thresholds?
A: See ShellScriptExamples/ex-tied-triphones.sh as an example.
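The overall shape of such a set of experiments can also be sketched in Python. This is only an illustration of the loop structure and does not reproduce ex-tied-triphones.sh: the names run_sweep and run_stage, the "tb<value>" directory naming, and the stage labels are all hypothetical, and the actual clustering, training, and recognition commands must be supplied by you inside run_stage. The sketch gives each threshold its own directory and skips a threshold whose directory already exists, which avoids the "already exists" error described in the General section.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, List

# Stage labels are illustrative; they mirror the three steps named in the question.
STAGES = ("cluster", "train", "recognise")


def run_sweep(
    thresholds: Iterable[float],
    run_stage: Callable[[str, float, Path], bool],
    root: str = "models",
) -> Dict[float, List[str]]:
    """Run clustering, training and recognition once per TB threshold.

    `run_stage(stage, tb, workdir)` is a caller-supplied (hypothetical) hook
    that invokes the real scripts and returns True on success.
    Returns, for each threshold, the list of stages that completed.
    """
    completed: Dict[float, List[str]] = {}
    for tb in thresholds:
        workdir = Path(root) / f"tb{tb}"
        if workdir.exists():
            # Never overwrite an earlier run; remove or rename it first.
            completed[tb] = []
            continue
        workdir.mkdir(parents=True)
        done: List[str] = []
        for stage in STAGES:
            if not run_stage(stage, tb, workdir):
                break  # later stages depend on this one, so stop here
            done.append(stage)
        completed[tb] = done
    return completed
```

A call such as run_sweep([350, 700], my_run_stage) would then run the three stages for each value in turn; the two thresholds here are placeholders, not recommended settings.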