ASR coursework Q&A
Last updated: 2018/02/12 18:05:51

--- Working in pairs ---------------------------------------------------

Q: I cannot find a partner for the coursework.
A: Contact Hiroshi Shimodaira by Monday 19th February via email with the
   following information:
   - Your name and UUN
   - Degree-programme name
   - Programming skills (especially shell scripts)
   - Any other information that you'd like to add

Q: Can I do my coursework alone rather than working in a pair?
A: Yes, but you should understand that the coursework is marked in the
   same manner as for pairs. If you decide to work alone, please let me
   (Hiroshi) know without delay.

--- Task specifications in the assignment ------------------------------

Q: Is Task 1 word-level or phone-level?
A: Word level. All the tasks in the assignment are word recognition
   tasks.

Q: Should we not compute normalised features for any of questions 1.1,
   1.2 or 1.3 (i.e. create a cmvn.scp file from the corresponding
   feats.scp file pointing to mfcc/fbank/plp features)?
A: I'm assuming the setting used in the original script, but either is
   fine. However, the setting should be consistent across 1.1, 1.2 and
   1.3, and you should (always) state the settings of your experiments
   in your report.

Q: Scripts to submit
   From the description in the assignment I'm not very clear what
   exactly the script for task 1.1 should automate. In particular, after
   looking at other Kaldi recipes, should the script for task 1.1 do:
   * Data preparation (setting up timit and wsj train and test
     directories)
   * Feature extraction (mfcc+cmvn for task 1.1)
   Also, I have noticed that Kaldi recipes don't contain actual loops
   over all possible hyperparameter values. Thus, I assume it's fine to
   run the for loop separately in the terminal?
A: First of all, I'd like you to understand why we ask you to submit
   your code.
   (See sections 1 and 2 in the assignment sheet.) The scripts you
   submit should contain sufficient code for the marker to replicate the
   experiments you describe in your report. For example,
   'exp_mono_t1.sh' is supposed to contain the loops over all the
   hyperparameter values that you tried and report; otherwise it is not
   considered as evidence of your experiment. If you change parameter
   values manually outside the script, the script cannot be used as
   evidence of the experiment. Another reason we ask you to submit those
   scripts is to encourage you to write scripts that automate your
   experiments, so that you don't need to change parameters/conditions
   manually, which will save you a lot of time. On the other hand, you
   don't need to include the code for data preparation or feature
   extraction, as long as you use the default features used in lab 3
   (word recognition).

Q: Gender-dependent model testing
   In task 3, should the tests for gender-dependent models be done on
   the provided test dataset (containing both genders), or should we
   also split the test dataset by gender and test only on the
   single-gender subset?
A: That is for you to decide. My advice would be to clarify the purpose
   of your experiment, and what conclusions you'd like to draw from the
   result, before you run experiments; this will give you ideas about
   what you should do.

--- Experiments with kaldi -------------------------------------------

Q: Is there any Kaldi documentation other than the official one?
A: Old forums: https://sourceforge.net/p/kaldi/discussion/
   New forums: http://kaldi-asr.org/forums.html
   Apart from these sources you can look at the source code itself, and
   at the example recipes:
   https://github.com/kaldi-asr/kaldi/tree/master/egs

Q: How can I find the dimension of a feature vector?
A: Try the command 'feat-to-dim'.

Q: Should I use different experiment directories (e.g. exp1, exp2, ...)
   for different experiments?
A: Yes, that is the safe way to avoid errors / unexpected results. Note
   that you should remove unnecessary files/directories once an
   experiment finishes, in order to save disk space. Instead of creating
   new experiment directories, you can stick to a single experiment
   directory (e.g. exp), but you should then remove the contents of the
   directory before you run a new experiment.

Q: When I redo a decoding experiment with the same conditions, should I
   remove the old decoding directory (e.g. decode_test)?
A: Yes, you should remove it - Kaldi decoding scripts use cache files,
   which could cause unexpected effects on your subsequent runs.

Q: Where can I find WERs for word recognition experiments?
A: The scores are stored in files within the decode directory, called,
   for example, wer_10_0.0, wer_11_0.0, etc. The first number is the
   language model scaling factor (LMWT) and the second number is the
   word insertion penalty (WIP). If you open any of those files, e.g.
     exp/word/mono/decode_test/wer_10_0.0
   you will see a breakdown of WER, SER, and insertion, deletion and
   substitution errors. You can use utils/best_wer.sh to automatically
   find the best one amongst the files above.
   Note that the scores we got in Lab 1 and Lab 2 were in fact not WER
   but PER (phone error rate), because the task was phone recognition
   rather than word recognition (which we did in Lab 3).

Q: When I was running the decode phase:
     steps/decode.sh --nj 4 exp/word/tri1/graph data/test_words exp/word/tri1/decode_test
   the terminal said:
     decode.sh: no such file exp/word/tri1/graph/HCLG.fst
   What should I do?
A: The previous command needs to have finished prior to decoding:
     utils/mkgraph.sh data/lang_wsj_test_bg exp/word/tri1 exp/word/tri1/graph
   This should create the HCLG.fst if it finishes successfully.

Q: How many Gaussians is too many Gaussians?
A: As you increase the number of Gaussians, the WER on test data will
   decrease, but it will start increasing again from a certain point (or
   region), which I would like you to find. I would suggest that you
   first change the number exponentially and draw a graph of WER, which
   will give you an idea of what range you should try; you can then look
   into that range in more detail.

Q: How can we extract FBANK and PLP features, and what feature
   dimensionality should we use for them?
A: You can find sample scripts in the directory 'steps'. You can decide
   the dimensionality yourself - you could choose the one most commonly
   used, or the same one as for MFCC. You should think about what
   conditions to use if you'd like to make comparisons.

Q: How can I change the number of Gaussian components? How can I pass
   values to scripts?
A: You can specify the total number in this way, for example:
     steps/train_mono.sh --totgauss 1000
   NB: this way of passing values to a script is only possible if the
   script defines the corresponding variable and has the following line
   in it (after the declaration of the variable):
     . parse_options.sh

Q: Where can I find log likelihoods on training and test sets?
A: - For log likelihoods on a training set:
     The original information can be found in align.*.*.log in the log
     directory, but it is given per job (data partition) when you use
     multiple jobs (normally nj==4). The summary (average
     log-likelihood) is shown in the output of the training script
     (e.g. train_mono.sh and train_deltas.sh), e.g.:
       exp/word/tri: nj=4 align prob=-52.66 over 3.12h ....
     (where "-52.66" is the average log-likelihood). The same
     information can be obtained by running
       steps/info/gmm_dir_info.pl $dir
     (where $dir is the experiment directory).
   - For log likelihoods on a test set:
     The original information for each job can be found in
     decode.JOB.log in decode_test/log, where JOB=1,...,nj.
     For example, there is a line:
       LOG (gmm-latgen-faster:main():gmm-latgen-faster.cc:181) Overall log-likelihood per frame is -8.80672 over 15130 frames.
     The average log-likelihood over all the jobs should be obtained as
     a weighted average, to reflect the different number of frames in
     each job. If you are unsure about this weighted average, it is ok
     to just use the log-likelihood for a specific job (e.g. 1), or to
     calculate the simple arithmetic average over the jobs. In either
     case, you should state the approach taken in your report.

--- Computing systems/environment ---------------------------------

Q: Saving scripts in steps
   When I try to save a new script in the steps folder, I get a
   "Permission denied" error. Is there any way around this other than
   putting the scripts in an unlocked directory?
A: The steps, utils and local folders are shared among all students, so
   that we can make updates if needed, and you have no write permission
   for them. For your own scripts I would make another directory, for
   example my-local, in which to store them.

Q: When I log on to a DICE machine remotely with ssh, I cannot create
   files/directories and get a 'permission denied' error, even though I
   tried chmod 666.
A: It could be related to the Kerberos ticket for AFS. See:
   http://computing.help.inf.ed.ac.uk/kerberos
   http://computing.help.inf.ed.ac.uk/informatics-filesystem
   Note that 'chmod' does not work on AFS file systems. The AFS
   permissions of a directory can be checked with 'fs listacl'.

Q: How can I run a long job? Why was my script terminated?
A: See
   http://computing.help.inf.ed.ac.uk/ssh-timeouts-home
   http://computing.help.inf.ed.ac.uk/afs-top-ten-tips#Tip07
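--- Appendix: example shell snippets ---------------------------------

The frame-weighted average log-likelihood over decoding jobs mentioned
above can be computed with a short shell pipeline. This is a minimal
sketch, not a required method; the decode directory name is
illustrative, and it assumes the decode.JOB.log files contain "Overall
log-likelihood per frame is ... over ... frames." lines as shown above.

```shell
# Hypothetical helper: frame-weighted average log-likelihood over all
# decoding jobs.  The decode directory below is illustrative.
dir=exp/word/mono/decode_test
grep "Overall log-likelihood per frame" $dir/log/decode.*.log |
  awk '{ # locate "is <ll> over <nframes> frames." on each matched line
         for (i = 1; i <= NF; i++) if ($i == "is") { ll = $(i+1); n = $(i+3) }
         sum += ll * n; frames += n }
       END { printf "weighted avg log-likelihood per frame = %.5f over %d frames\n",
                    sum/frames, frames }'
```

The simple arithmetic average over jobs (also acceptable, per the answer
above) differs from this whenever jobs have different numbers of frames.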
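Several answers above (scripts to submit, --totgauss, mkgraph/decode,
best_wer.sh) fit together into one experiment script. The following is a
hypothetical sketch of what an 'exp_mono_t1.sh' with a hyperparameter
loop might look like; the parameter grid and the data/train_words and
data/lang_wsj directories are illustrative placeholders that must match
your own setup, not required settings.

```shell
#!/bin/sh
# Hypothetical sketch: loop over the total number of Gaussians so the
# whole monophone experiment can be replicated from a single script.
# Parameter values and data/lang directory names are illustrative.
for totgauss in 250 500 1000 2000 4000 8000; do
  dir=exp/word/mono_g${totgauss}          # one directory per setting
  steps/train_mono.sh --totgauss $totgauss \
    data/train_words data/lang_wsj $dir
  utils/mkgraph.sh data/lang_wsj_test_bg $dir $dir/graph
  steps/decode.sh --nj 4 $dir/graph data/test_words $dir/decode_test
  # report the best WER amongst the wer_LMWT_WIP files
  grep WER $dir/decode_test/wer_* | utils/best_wer.sh
done
```

Using a separate directory per setting (mono_g250, mono_g500, ...) also
follows the advice above about not reusing experiment directories.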