ASR coursework Q&A
Last updated: 2018/02/12 18:05:51

--- Working in pairs ---------------------------------------------------

Q: I cannot find a partner for the coursework.
A: Contact Hiroshi Shimodaira by Monday 19th February via email with the
   following information:
   - Your name and UUN
   - Degree-programme name
   - Programming skills (especially shell scripts)
   - Any other information that you'd like to add

Q: Can I do my coursework alone rather than working in a pair?
A: Yes, but you should understand that the coursework is marked in the
   same manner as for pairs. If you decide to work alone, please let me
   (Hiroshi) know without delay.

--- Task specifications in the assignment ------------------------------

Q: Is Task 1 word-level or phone-level?
A: Word level. All the tasks in the assignment are word recognition
   tasks.

Q: Should we not compute normalised features for any of questions 1.1,
   1.2 or 1.3 (i.e. create a cmvn.scp file from the corresponding
   feats.scp file pointing to mfcc/fbank/plp features)?
A: I'm assuming the setting used in the original script, but either is
   fine. However, the setting should be consistent across 1.1, 1.2 and
   1.3, and you should (always) state the settings of your experiments
   in your report.

Q: Scripts to submit
   From the description in the assignment I'm not very clear what
   exactly the script for task 1.1 should automate. In particular, after
   looking at other Kaldi recipes, should the script for task 1.1 do:
   * Data preparation (setting up timit and wsj train and test
     directories)
   * Feature extraction (mfcc+cmvn for task 1.1)
   Also, I have noticed that Kaldi recipes don't contain actual loops
   over all possible hyperparameter values. Thus, I assume it's fine to
   run the for loop separately in the terminal?
A: First of all, I'd like you to understand why we ask you to submit
   your code.
   (See sections 1 and 2 in the assignment sheet.) The scripts you
   submit should contain sufficient code for the marker to replicate the
   experiments you describe in your report. For example,
   'exp_mono_t1.sh' is supposed to contain the loops over all the
   hyperparameter values that you tried and report; otherwise it is not
   considered as evidence of your experiment. If you change parameter
   values manually outside the script, the script cannot be used as
   evidence of the experiment. Another reason we ask you to submit those
   scripts is to encourage you to write scripts that automate your
   experiments, so that you don't need to change parameters/conditions
   manually, which will save you a lot of time. On the other hand, you
   don't need to include the code for data preparation or feature
   extraction, as long as you use the default features used in lab 3
   (word recognition).

Q: Gender-dependent model testing
   In task 3, should the tests for gender-dependent models be done on
   the provided test dataset (containing both genders), or should we
   also split the test dataset by gender and test only on the
   single-gender subset?
A: That is for you to decide. My advice would be to clarify the purpose
   of your experiment, and what conclusions you'd like to draw from the
   result, before you run experiments; this will give you ideas about
   what you should do.

--- Experiments with kaldi -------------------------------------------

Q: Is there any Kaldi documentation other than the official one?
A: Old forums: https://sourceforge.net/p/kaldi/discussion/
   New forums: http://kaldi-asr.org/forums.html
   Apart from these sources you can look at the source code itself, and
   at the example recipes:
   https://github.com/kaldi-asr/kaldi/tree/master/egs

Q: How can I find the dimension of a feature vector?
A: Try the command 'feat-to-dim'.

Q: Should I use different experiment directories (e.g. exp1, exp2, ...)
   for different experiments?
A: Yes, that is the safe way to avoid errors / unexpected results. Note
   that you should remove unnecessary files/directories once an
   experiment finishes, in order to save disk space. Instead of creating
   new experiment directories, you can stick to a single experiment
   directory (e.g. exp), but you should then remove the contents of the
   directory before you run a new experiment.

Q: When I redo a decoding experiment with the same conditions, should I
   remove the old decoding directory (e.g. decode_test)?
A: Yes, you should remove it - Kaldi decoding scripts use cache files,
   which could cause unexpected effects on your subsequent runs.

Q: Where can I find WERs for word recognition experiments?
A: The scores are stored in files within the decode directory, called,
   for example, wer_10_0.0, wer_11_0.0, etc. The first number is the
   language model scaling factor (LMWT) and the second number is the
   word insertion penalty (WIP). If you open any of those files, e.g.
     exp/word/mono/decode_test/wer_10_0.0
   you will see a breakdown of WER, SER, and insertion, deletion and
   substitution errors. You can use utils/best_wer.sh to automatically
   find the best one amongst the files above.
   Note that the scores we got in Lab 1 and Lab 2 were in fact not WER
   but PER (phone error rate), because the task was phone recognition
   rather than word recognition (which we did in Lab 3).

Q: When I was running the decode phase:
     steps/decode.sh --nj 4 exp/word/tri1/graph data/test_words exp/word/tri1/decode_test
   the terminal said:
     decode.sh: no such file exp/word/tri1/graph/HCLG.fst
   What should I do?
A: The previous command needs to have finished prior to decoding:
     utils/mkgraph.sh data/lang_wsj_test_bg exp/word/tri1 exp/word/tri1/graph
   This should create the HCLG.fst if it finishes successfully.

Q: How many Gaussians is too many Gaussians?
A: As you increase the number of Gaussians, the WER on test data will
   decrease, but it will start increasing again from a certain point (or
   region), which I would like you to find. I would suggest that you
   first change the number exponentially and draw a graph of WER, which
   will give you an idea of what range you should try; you can then look
   into that range in more detail.

Q: How can we extract FBANK and PLP features, and what feature
   dimensionality should we use for them?
A: You can find sample scripts in the directory 'steps'. You can decide
   the dimensionality yourself - you could choose the one most commonly
   used, or the same one as for MFCC. You should think about what
   conditions to use if you'd like to make comparisons.

Q: How can I change the number of Gaussian components? How can I pass
   values to scripts?
A: You can specify the total number in this way, for example:
     steps/train_mono.sh --totgauss 1000
   NB: this way of passing values to a script is only possible if the
   script defines the corresponding variable and has the following line
   in it (after the declaration of the variable):
     . parse_options.sh

Q: Where can I find log likelihoods on training and test sets?
A: - For log likelihoods on a training set:
     The original information can be found in align.*.*.log in the log
     directory, but it is given per job (data partition) when you use
     multiple jobs (normally nj==4). The summary (average
     log-likelihood) is shown in the output of the training script
     (e.g. train_mono.sh and train_deltas.sh), e.g.:
       exp/word/tri: nj=4 align prob=-52.66 over 3.12h ....
     (where "-52.66" is the average log-likelihood). The same
     information can be obtained by running
       steps/info/gmm_dir_info.pl $dir
     (where $dir is the experiment directory).
   - For log likelihoods on a test set:
     The original information for each job can be found in
     decode.JOB.log in decode_test/log, where JOB=1,...,nj.
     For example, there is a line:
       LOG (gmm-latgen-faster:main():gmm-latgen-faster.cc:181) Overall log-likelihood per frame is -8.80672 over 15130 frames.
     The average log-likelihood over all the jobs should be obtained as
     a weighted average, to reflect the different number of frames in
     each job. If you are unsure about this weighted average, it is ok
     to just use the log-likelihood for a specific job (e.g. 1), or to
     calculate the simple arithmetic average over the jobs. In either
     case, you should state the approach taken in your report.

--- Computing systems/environment ---------------------------------

Q: Saving scripts in steps
   When I try to save a new script in the steps folder, I get a
   "Permission denied" error. Is there any way around this other than
   putting the scripts in an unlocked directory?
A: The steps, utils and local folders are shared among all students, so
   that we can make updates if needed, and you have no write permission
   for them. For your own scripts I would make another directory, for
   example my-local, in which to store them.

Q: When I log on to a DICE machine remotely with ssh, I cannot create
   files/directories and get a 'permission denied' error, even though I
   tried chmod 666.
A: It could be related to the Kerberos ticket for AFS. See:
   http://computing.help.inf.ed.ac.uk/kerberos
   http://computing.help.inf.ed.ac.uk/informatics-filesystem
   Note that 'chmod' does not work on AFS file systems. The AFS
   permissions of a directory can be checked with 'fs listacl'.

Q: How can I run a long job? Why was my script terminated?
A: See
   http://computing.help.inf.ed.ac.uk/ssh-timeouts-home
   http://computing.help.inf.ed.ac.uk/afs-top-ten-tips#Tip07
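--- Appendix: example shell snippets ---------------------------------

The frame-weighted average log-likelihood over decoding jobs mentioned
above can be computed with a short shell pipeline. This is a minimal
sketch, not a required method; the decode directory name is
illustrative, and it assumes the decode.JOB.log files contain "Overall
log-likelihood per frame is ... over ... frames." lines as shown above.

```shell
# Hypothetical helper: frame-weighted average log-likelihood over all
# decoding jobs.  The decode directory below is illustrative.
dir=exp/word/mono/decode_test
grep "Overall log-likelihood per frame" $dir/log/decode.*.log |
  awk '{ # locate "is <ll> over <nframes> frames." on each matched line
         for (i = 1; i <= NF; i++) if ($i == "is") { ll = $(i+1); n = $(i+3) }
         sum += ll * n; frames += n }
       END { printf "weighted avg log-likelihood per frame = %.5f over %d frames\n",
                    sum/frames, frames }'
```

The simple arithmetic average over jobs (also acceptable, per the answer
above) differs from this whenever jobs have different numbers of frames.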
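Several answers above (scripts to submit, --totgauss, mkgraph/decode,
best_wer.sh) fit together into one experiment script. The following is a
hypothetical sketch of what an 'exp_mono_t1.sh' with a hyperparameter
loop might look like; the parameter grid and the data/train_words and
data/lang_wsj directories are illustrative placeholders that must match
your own setup, not required settings.

```shell
#!/bin/sh
# Hypothetical sketch: loop over the total number of Gaussians so the
# whole monophone experiment can be replicated from a single script.
# Parameter values and data/lang directory names are illustrative.
for totgauss in 250 500 1000 2000 4000 8000; do
  dir=exp/word/mono_g${totgauss}          # one directory per setting
  steps/train_mono.sh --totgauss $totgauss \
    data/train_words data/lang_wsj $dir
  utils/mkgraph.sh data/lang_wsj_test_bg $dir $dir/graph
  steps/decode.sh --nj 4 $dir/graph data/test_words $dir/decode_test
  # report the best WER amongst the wer_LMWT_WIP files
  grep WER $dir/decode_test/wer_* | utils/best_wer.sh
done
```

Using a separate directory per setting (mono_g250, mono_g500, ...) also
follows the advice above about not reusing experiment directories.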