================ General =========================
Q: Which measure should I use, Accuracy or WER?
A: Either is fine (they carry the same information, since
   WER = 100 - Accuracy), but use one and do not mix them.

Q: How is Accuracy calculated?
A: See Section 17.19.1 of the HTK manual.
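
   As a quick reference, the word-level figures reported by HResults
   are, as far as I understand the manual, %Corr = (N-D-S)/N x 100 and
   Acc = (N-D-S-I)/N x 100, where N is the number of reference words
   and D, S, I are the deletion, substitution and insertion counts.
   The sketch below is only an illustration (the function names are my
   own and are not part of HTK). It also shows why WER = 100 - Acc.

```python
def percent_correct(n, d, s):
    """%Corr: ignores insertion errors."""
    return 100.0 * (n - d - s) / n


def accuracy(n, d, s, i):
    """Acc: also penalises insertions, so it can become negative."""
    return 100.0 * (n - d - s - i) / n


def wer(n, d, s, i):
    """Word error rate; by construction equal to 100 - accuracy."""
    return 100.0 * (d + s + i) / n


# Hypothetical counts, not taken from any real experiment.
n, d, s, i = 100, 5, 3, 2
print(percent_correct(n, d, s), accuracy(n, d, s, i), wer(n, d, s, i))
```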

Q: I got a "No token survived" message.
A: This means HVite failed to find the best path, so it gives no
   recognition output for that input. The problem is that HResults
   then leaves that utterance out of its analysis, so the statistics
   it reports are not accurate.

   This does not necessarily mean you should always discard the
   recognition experiment. It is acceptable if the number of such
   errors is very small (e.g. 1), as long as you understand the risk
   described below.

   It is a good idea to know how much influence the error can have on
   the word accuracy. The following is a simulation.
   We have two test data sets, "si_dt5a.scp" (368 utterances) and
   "si_dt5a-div3.scp" (123 utterances), the latter called "small".
   Suppose that N utterances were not recognised because of the
   error and the rest were recognised without errors. HResults will
   then report an accuracy of 100%. The actual (worst-case) accuracy
   when the unrecognised utterances are taken into account is as
   follows:

       test set \ N |     1     5    10
       ----------------------------------
       si_dt5a      |  99.7  98.6  97.3
       si_dt5a-div3 |  99.2  95.9  91.9
       ----------------------------------

   (Accuracy in %, assuming that all the utterances have the same
   number of words.)
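
   The calculation behind this table can be sketched in a few lines of
   Python. It uses the same assumption as above (every utterance has
   the same number of words, every failed utterance counts as entirely
   wrong, and all the others are recognised perfectly). The function
   name is my own.

```python
def worst_case_accuracy(total_utts, failed_utts):
    """Accuracy (%) when each failed utterance is counted as fully
    wrong and every other utterance is recognised without errors."""
    return 100.0 * (total_utts - failed_utts) / total_utts


# Utterance counts as given in the text above.
for name, total in [("si_dt5a", 368), ("si_dt5a-div3", 123)]:
    row = ["%.1f" % worst_case_accuracy(total, n) for n in (1, 5, 10)]
    print(name, " ".join(row))
```

   The smaller the test set, the larger the effect of each lost
   utterance, which is why the "small" set is the riskier one.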


Q: How can I try experiments using different feature vectors?
A: You need to calculate new feature vectors from the speech wave files.
   Sample scripts that were used to obtain the current data can be
   found in the Org/scripts directory. They are
         run_wave2featurevectors.sh
         wave2featurevectors.sh

Q: I got an error message saying something like
   "models/R9/hmm11/MODELS" already exists.
A: This happens because your training script tried to overwrite an
   existing model. As a result, training did not finish.
   There are two ways to resolve this.
     Option A: Remove the existing model and its directory.
     Option B: Rename the directory of either the existing model or the
               new one.
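
   If you prefer Option B, the renaming can be done by hand or with a
   small helper such as the sketch below. The function name and the
   ".old" suffix are my own choices, not part of the provided scripts.
   The helper refuses to overwrite an earlier backup.

```python
import os


def move_aside(model_dir, suffix=".old"):
    """Rename model_dir to model_dir + suffix so that the training
    script can create a fresh directory. Returns the new name, or
    None if model_dir does not exist."""
    if not os.path.isdir(model_dir):
        return None  # nothing to move
    backup = model_dir + suffix
    if os.path.exists(backup):
        # Do not silently destroy an earlier backup.
        raise FileExistsError("backup already exists: " + backup)
    os.rename(model_dir, backup)
    return backup
```

   For the message above, the call would be
   move_aside("models/R9/hmm11"), made before you rerun the training
   script.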

================= Monophone models =======================


================= Triphone models =======================

Q: How can I obtain the number of clusters?
A: See logs/R9/log-hhed

Q: How can I change the number of tied-state triphone models?
A: You cannot specify the number directly, but you can control it
   with the "TB" value. See Section 17.8.1 of the HTK manual for details.

Q: What range of TB should I try?
A: One extreme case would be a TB value that results in a number of
   clusters similar to that of the monophone models.
   (NB: clustering is done state-wise rather than phone-model-wise.)

Q: For decision tree-based tied-state triphone models, how can I
   write a script that carries out a set of experiments (clustering,
   training, and recognition) for different clustering thresholds? (21/Mar)
A: See ShellScriptExamples/ex-tied-triphones.sh as an example.


================= MLLR-based speaker adaptation =========

Q: Should the number of regression classes be a power of 2?
A: No, it can be any natural number. This is because clustering
   proceeds by splitting a node into two, meaning each split
   increases the number of leaf nodes by 1.
   (See HTK Manual sections 9.1.4 and 10.7)

Q: I set the number of regression classes to 8, only to find 4
   classes. Why?
A: It suggests that only 4 leaf nodes (clusters) received a sufficient
   amount of data, and the others did not.
   (See HTK Manual sections 9.1.4 and 10.7)
   If this is the case, you will need to use more complex HMMs
   (e.g. those with more Gaussian mixture components or more contexts)
   to carry out an experiment in which larger numbers of regression
   classes are tried.

   * The same thing can happen when you try to increase the number
   of Gaussian mixture components.

Q: In MLLR speaker adaptation, how can I change the size of
   adaptation data?
A: Use a subset of the adaptation data defined in file_lists/dev-01.scp.
   To do this, create another list, e.g. file_lists/dev-01-subset.scp,
   and copy a subset of file_lists/dev-01.scp into it.
   Recall that the speaker number is included in the data file name.
   For example, it is "c31" for c3la010b.mfc.
   In file_lists/dev-01.scp, there are 20 speakers, each of which
   contributes 18 utterances.
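
   One way to build such a subset list is sketched below. It keeps at
   most a given number of utterances per speaker. It assumes that each
   line of the .scp file is one path and that the speaker is identified
   by the first id_len characters of the base file name. Please check
   that assumption against your own list. The function names and the
   default id_len=3 are my own.

```python
import os
from collections import defaultdict


def subset_per_speaker(scp_lines, max_per_speaker, id_len=3):
    """Keep the first max_per_speaker entries of each speaker,
    preserving the original order of the list."""
    kept = []
    counts = defaultdict(int)
    for line in scp_lines:
        path = line.strip()
        if not path:
            continue  # skip blank lines
        # Assumption: speaker ID = leading characters of the file name.
        speaker = os.path.basename(path)[:id_len]
        if counts[speaker] < max_per_speaker:
            counts[speaker] += 1
            kept.append(path)
    return kept


def write_subset(src_scp, dst_scp, max_per_speaker, id_len=3):
    """Read src_scp, write the reduced list to dst_scp."""
    with open(src_scp) as src:
        subset = subset_per_speaker(src, max_per_speaker, id_len)
    with open(dst_scp, "w") as dst:
        dst.write("\n".join(subset) + "\n")
    return len(subset)
```

   For example, write_subset("file_lists/dev-01.scp",
   "file_lists/dev-01-subset.scp", 5) would leave up to 5 utterances
   for each speaker.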