================ General =========================
Q: Which measure should I use, Accuracy or WER?
A: Either is fine, as they carry the same information, but choose one
   and do not mix them.

Q: How is Accuracy calculated?
A: See Section 17.19.1 of the HTK manual.

Q: I got the message "No token survived". What does it mean?
A: It means that HVite failed to find a best path for that input, so it
   produces no recognition output for it. The problem is that HResults
   then leaves the utterance out of its analysis, so the statistics
   it reports are not accurate.

   This does not necessarily mean that you should discard the
   recognition experiment. It is acceptable if the number of such errors
   is very small (e.g. 1), as long as you understand the risk described
   below.

   It is a good idea to know how much influence the error can have on
   the word accuracy. The following is a simulation.
   We have two test data sets, "si_dt5a.scp" (368 utterances) and
   "si_dt5a-div3.scp" (123 utterances), the latter called "small".
   Assume that N utterances were not recognised because of the
   error and that the rest were recognised without errors. HResults
   will then report an accuracy of 100%. The actual (worst-case) accuracy
   when the unrecognised utterances are taken into account is as
   follows:

       test set \ N |     1     5    10
       ----------------------------------
       si_dt5a      |  99.7  98.6  97.3
       si_dt5a-div3 |  99.2  95.9  91.9
       ----------------------------------

    (This assumes that all utterances contain the same number of
    words.)
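
   The figures in such a table can be reproduced with a few lines of
   shell. The following is only a sketch: the function name
   "worst_accuracy" is made up for this example, and it relies on the
   same assumption as above, namely that every utterance contains the
   same number of words, so that the worst-case accuracy is
   100 * (total - N) / total.

```shell
#!/bin/sh
# worst_accuracy TOTAL FAILED   (hypothetical helper, not part of HTK)
# Prints the worst-case accuracy (%) when FAILED of TOTAL utterances
# produced no output and all other utterances were recognised perfectly.
# Assumes that all utterances contain the same number of words.
worst_accuracy() {
    awk -v total="$1" -v failed="$2" \
        'BEGIN { printf "%.1f\n", 100 * (total - failed) / total }'
}

# 368 utterances in si_dt5a.scp, 123 in si_dt5a-div3.scp
for n in 1 5 10; do
    echo "N=$n  si_dt5a: $(worst_accuracy 368 "$n")  si_dt5a-div3: $(worst_accuracy 123 "$n")"
done
```

   You can call the function with the size of your own test set to see
   how much a given number of failed utterances could matter.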


Q: I got an error message saying something like
   "models/R9/hmm11/MODELS" already exists.
A: This happens because your training script tried to overwrite an
   existing model. As a result, training did not finish.
   There are two ways to resolve this:
     Option A: Remove the existing model and its directory.
     Option B: Rename the directory of either the existing model or the
               new one.
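
   Option B can be done by hand with "mv". If you prefer a helper, a
   minimal sketch might look like the following. The function name
   "set_aside" and the ".old-<timestamp>" suffix are assumptions made
   for this example; the directory name is the one from the error
   message above.

```shell
#!/bin/sh
# set_aside DIR   (hypothetical helper)
# Renames DIR to DIR.old-<timestamp> so that training can recreate DIR.
# Prints the new name. Does nothing if DIR does not exist.
set_aside() {
    dir=$1
    if [ -d "$dir" ]; then
        backup="$dir.old-$(date +%Y%m%d-%H%M%S)"
        mv "$dir" "$backup" && echo "$backup"
    fi
}

# Option B, for the directory named in the error message:
#   set_aside models/R9/hmm11
# Option A would instead be:
#   rm -r models/R9/hmm11
```

   Renaming keeps the old model available for comparison, whereas
   removing it saves disk space.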

Q: How can I find the likelihood on the training data?
A: It is in the log file written by your training script.
   Log files are in the "logs" directory of your work directory,
   for example logs/train/R1/log-5, which contains a line like this:
     "Reestimation complete - average log prob per frame = -1.058707e+02"

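   To collect the value from several log files at once, something like
   the following sketch may help. The function name "avg_logprob" is
   made up for this example, and the pattern assumes that the log line
   has the form "average log prob per frame = <number>".

```shell
#!/bin/sh
# avg_logprob LOGFILE...   (hypothetical helper)
# Prints "file: value" for every line of the form
# "... average log prob per frame = <number>" found in the given files.
avg_logprob() {
    for f in "$@"; do
        sed -n 's/.*average log prob per frame = *\([-+0-9.eE]*\).*/\1/p' "$f" |
            while read -r value; do
                echo "$f: $value"
            done
    done
}

# Example:
#   avg_logprob logs/train/R1/log-*
```

   Listing the values for successive log files shows how the
   likelihood changes over the training iterations.
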
Q: How can I try different feature vectors?
A1: Generally speaking, you need to calculate new feature vectors from
   the speech wave files (see A2). However, you do not need to do so if
   you just want to try MFCC features without delta features (e.g.
   MFCC_0), because they are a subset of the original MFCC features
   (i.e. MFCC_0_D_A_Z) provided in the course directory. In that
   case, you need to (1) create a new work directory for the new
   experiment, (2) modify the prototype files (in the "proto"
   directory) [*2], and (3) modify configs/config.basic to reflect
   the MFCC features you want to use [*1].

   [*1] You cannot edit the original files in the "proto" or "configs"
   directory, because they are symbolic links to files that you do not
   own. An easy solution is to rename the directories to
   "proto.org" and "configs.org" and create new "proto" and "configs"
   directories, into which you copy the original files.
   [*2] There are two ways to modify the prototype files in "proto":
   (i) edit each prototype file manually with a text editor,
   or (ii) use "MakeProtoHMMSet" in the "scripts" directory to create new
   prototypes automatically, if you understand how to use the command.
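
   The rename-and-copy step in [*1] can be sketched as follows. The
   function name "make_editable" is made up for this example. It
   assumes that it is run inside your new work directory and that no
   "proto.org" or "configs.org" exists there yet.

```shell
#!/bin/sh
# make_editable DIR   (hypothetical helper)
# Renames DIR to DIR.org and recreates DIR with regular-file copies of
# its contents, so that the copies can be edited.
make_editable() {
    dir=$1
    if [ -e "$dir.org" ]; then
        echo "$dir.org already exists" >&2
        return 1
    fi
    mv "$dir" "$dir.org" || return 1
    mkdir "$dir" || return 1
    # -L follows symbolic links, so the link targets are copied,
    # not the links themselves
    cp -RL "$dir.org/." "$dir/"
}

# Example, inside the new work directory:
#   make_editable proto
#   make_editable configs
```

   After this, the files in "proto" and "configs" are your own copies,
   and the originals remain reachable through "proto.org" and
   "configs.org".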

A2: Creating new feature vectors from wave files:
   The sample scripts that were used to obtain the current data are
   in the Org/scripts directory:
         run_wave2featurevectors.sh
         wave2featurevectors.sh
   In addition to modifying the files above, you need to modify the
   corresponding configuration files in the "configs" directory.
   Once you have the new feature vectors, follow the instructions
   in A1 above.


================= Monophone models =======================


================= Triphone models =======================

Q: How can I obtain the number of clusters?
A: See logs/R9/log-hhed

Q: How can I change the number of tied-state triphone models?
A: You cannot specify the number directly, but you can control it
   with the "TB" value. See Section 17.8.1 of the HTK manual for details.

Q: What range of TB should I try?
A: One extreme would be a TB value that results in a number of
   clusters similar to that of the monophone models.
   (NB: clustering is done per state rather than per phone model.)

Q: For decision-tree-based tied-state triphone models, how can I
   write a script that runs a set of experiments (clustering,
   training, and recognition) for different clustering thresholds?
A: See ShellScriptExamples/ex-tied-triphones.sh as an example.