ASR coursework Q&A
Last updated: 2019/03/13 11:02:52

--- Working in pairs ---------------------------------------------------

Q: I cannot find a partner for the coursework.
A: Please use the "Search For Teammates" feature on the Piazza forum of this course.

Q: Can I do my coursework alone rather than working in a pair?
A: Yes, but you should understand that the coursework is marked in the same manner as for pairs. If you decide to work alone, please let me (Hiroshi) know.

--- Task specifications in the assignment ------------------------------

Q: Is Task-1 word-level or phone-level?
A: Word-level. All the tasks in the assignment are word recognition tasks.

Q: Scripts to submit
From the description in the assignment I'm not very clear what exactly the script for Task 1.1 should automate. In particular, after looking at other Kaldi recipes, should the script for Task 1.1 do:
* data preparation (setting up the TIMIT and WSJ train and test directories)
* feature extraction (MFCC + CMVN for Task 1.1)
Also, I have noticed that Kaldi recipes don't contain actual loops over all possible hyperparameter values, so I assume it's fine to run the for loop separately in the terminal?
A: First of all, I'd like you to understand why we ask you to submit your code (see Sections 1 and 2 in the assignment sheet). The scripts you submit should contain sufficient code for the marker to replicate the experiments you report. For example, 'exp_task1_1.sh' is supposed to contain the loops over all the hyperparameter values that you tried and report in your report; otherwise it is not considered evidence of your experiment. If you change parameter values manually outside the script, the script cannot be used as evidence of the experiment. Another reason we ask you to submit these scripts is to encourage you to write scripts that automate your experiments, so that you don't need to change parameters/conditions manually, which will save you a lot of time.
On the other hand, you don't need to include the code for data preparation or feature extraction, as long as you use the default features used in Lab-3 (word recognition).

Q: How many tasks are sufficient to get full marks for Task-3?
A: Task-3 is intended for students to carry out a mini research project and write a short scientific report. Marking is done based on the quality of the project and the report. Simply doing many tasks/experiments and showing only the results will not be awarded many marks. It is, on the other hand, possible for a single project with a single system to be awarded full marks if it is really good - that is, the project is of high quality and full of original ideas/insights, potentially publishable as a conference paper, for example. I'm looking forward to seeing unique and interesting projects.

--- Experiments with Kaldi -------------------------------------------

Q: Are there any Kaldi documentation sources other than the official one?
A: Old forums: https://sourceforge.net/p/kaldi/discussion/
New forums: http://kaldi-asr.org/forums.html
Apart from these sources, you can look at the source code itself and at the example recipes: https://github.com/kaldi-asr/kaldi/tree/master/egs

Q: How can I find the dimension of a feature vector?
A: Try the command 'feat-to-dim'.

Q: Should I use different experiment directories (e.g. exp1, exp2, ...) for different experiments?
A: Yes, that is safer, as it avoids errors / unexpected results / overwriting of previous results. Note that you should remove unnecessary files/directories once an experiment finishes in order to save disk space. Instead of creating new experiment directories, you can stick to a single experiment directory (e.g. exp), but then you should remove the contents of that directory before you run a new experiment.

Q: When I redo a decoding experiment with the same conditions, should I remove the old decoding directory (e.g. decode_test)?
A: Yes, you should remove it - Kaldi decoding scripts use cache files, which could cause unexpected effects on your subsequent runs.

Q: Where can I find WERs for word recognition experiments?
A: The scores are stored in files within the decode directory, named for example wer_10_0.0, wer_11_0.0, etc. The first number is the language model scaling factor (LMWT) and the second number is the word insertion penalty (WIP). If you open any of those files, e.g.
exp/word/mono/decode_test/wer_10_0.0
you will see a breakdown of WER, SER and insertion, deletion and substitution errors. You can use utils/best_wer.sh to automatically find the best one amongst the files above. Note that the scores we got in Lab 1 and Lab 2 were not WERs but in fact PERs (phone error rates), because the task was phone recognition rather than word recognition (which we did in Lab 3).

Q: When I ran the decoding phase:
steps/decode.sh --nj 4 exp/word/tri1/graph data/test_words exp/word/tri1/decode_test
the terminal said:
decode.sh: no such file exp/word/tri1/graph/HCLG.fst
What should I do?
A: You need to get the following done properly prior to decoding:
utils/mkgraph.sh data/lang_wsj_test_bg exp/word/tri1 exp/word/tri1/graph
This should create HCLG.fst if it finishes successfully.

Q: How many Gaussians is too many Gaussians?
A: As you increase the number of Gaussians, the WER on test data will decrease, but it will start increasing from a certain point (or region), which I would like you to find. I suggest that you first change the number exponentially and draw a graph of WER, which will give you an idea of what range you should try. You can then look into that range in more detail.

Q: How can I change the number of Gaussian components? How can I pass values to scripts?
A: You can specify the total number this way, for example:
steps/train_mono.sh --totgauss 1000
NB: this way of passing values to a script is only possible if the script defines the corresponding variable and contains the following line (after the declaration of the variable):
. parse_options.sh

Q: Where can I find log likelihoods on the training and test sets?
A: - For log likelihoods on a training set:
The original information can be found in align.*.*.log in the log directory, but it is given per job (data partition) when you use multiple jobs (normally nj==4). The summary (average log-likelihood) is shown in the output of the training script (e.g. train_mono.sh and train_deltas.sh), e.g.:
exp/word/tri: nj=4 align prob=-52.66 over 3.12h ....
(where "-52.66" is the average log-likelihood). The same information can be obtained by running
steps/info/gmm_dir_info.pl $dir
(where $dir is the experiment directory).
- For log likelihoods on a test set:
The original information for each job can be found in decode.JOB.log in decode_test/log, where JOB=1,...,nj. For example, there is a line:
LOG (gmm-latgen-faster:main():gmm-latgen-faster.cc:181) Overall log-likelihood per frame is -8.80672 over 15130 frames.
The average log-likelihood over all the jobs should be computed as a weighted average, to reflect the different number of frames in each job. If you are not sure how to compute this weighted average, it is fine to use the log-likelihood for a specific job (e.g. 1) or to calculate the simple arithmetic average over the jobs. In either case, you should clarify the approach taken in your report.

Q: How can we implement FBANK and PLP, and what feature dimensionality should we use for them?
A: You can find sample scripts in the 'steps' directory. You can decide the dimensionality yourself - you could choose the one most commonly used, or the same one as for MFCC. You should think about what conditions you should use if you'd like to make a fair comparison.
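The frame-weighted average of the per-job test-set log-likelihoods mentioned above can be computed with a short awk pipeline. A minimal sketch, assuming you have already extracted each job's "log-likelihood per frame" and frame count from decode_test/log/decode.JOB.log; the numbers below are made-up placeholders, not real results:

```shell
# Each input line: <per-frame log-likelihood> <number of frames>, one per job
# (placeholder values for illustration only).
printf '%s\n' \
  "-8.80672 15130" \
  "-8.91234 14870" \
  "-8.75000 15500" \
  "-8.88000 14500" |
awk '{ sum += $1 * $2; n += $2 }
     END { printf "avg log-likelihood per frame: %.5f over %d frames\n", sum / n, n }'
```

In practice you would feed this from something like grep 'Overall log-likelihood' decode_test/log/decode.*.log, with a sed/awk step matched to the exact log format shown above.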
--- Computing systems/environment ---------------------------------

Q: Saving scripts in steps
When I try to save a new script in the steps folder, I get a "Permission denied" error. Is there any way around this other than putting the scripts in an unlocked directory?
A: The steps, utils and local folders are shared among all students so that we can make updates if needed, so you have no write permission for them. For your own scripts, I would make another directory, for example my-local, in which to store them.

Q: When I log on to a DICE machine remotely with ssh, I cannot create files/directories and get a 'permission denied' error, even though I tried chmod 666.
A: It could be related to the Kerberos ticket for AFS. See:
http://computing.help.inf.ed.ac.uk/kerberos
http://computing.help.inf.ed.ac.uk/informatics-filesystem
Note that 'chmod' does not work on AFS file systems. A directory's AFS permissions can be checked with 'fs listacl'.

Q: How can I run a long job? My script was terminated - why?
A: See
http://computing.help.inf.ed.ac.uk/ssh-timeouts-home
http://computing.help.inf.ed.ac.uk/afs-top-ten-tips#Tip07

Q: I'd like to share my WorkDir with my partner. Is that possible?
A: Contact me (Hiroshi) with your partner's name and UUN, and I will change the AFS permissions so that you each have read permission on the other's WorkDir. If you would both also like write permission, let me know. Please note that the change will apply to all the directories under WorkDir.
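On the long-job question above: besides the pointers in the links, one standard Unix-level trick is to detach the job from the terminal with nohup, so that an ssh disconnect does not kill it. A minimal sketch - the echo stands in for a real experiment script (e.g. one kept in your my-local directory), and note this does not solve AFS/Kerberos ticket expiry, for which see the links above:

```shell
# Run the job immune to terminal hangups, capturing stdout/stderr in a log file.
nohup sh -c 'echo "experiment finished"' > run.log 2>&1 &
# Keep the PID so the job can be checked on or killed later.
job_pid=$!
wait "$job_pid"
grep "experiment finished" run.log
```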