ASR coursework Q&A
Last updated: 2019/03/13 11:02:52

--- Working in pairs ---------------------------------------------------

Q: I cannot find a partner for the coursework.
A: Please use the "Search For Teammates" feature on the Piazza forum of this course.

Q: Can I do my coursework alone rather than working in a pair?
A: Yes, but you should understand that the coursework is marked in the same manner as for pairs. If you decide to work alone, please let me (Hiroshi) know.

--- Task specifications in the assignment ------------------------------

Q: Is Task-1 word-level or phone-level?
A: Word-level. All the tasks in the assignment are word recognition tasks.

Q: Scripts to submit
From the description in the assignment I'm not very clear what exactly the script for Task 1.1 should automate. In particular, after looking at other Kaldi recipes, should the script for Task 1.1 do:
* data preparation (setting up the TIMIT and WSJ train and test directories)
* feature extraction (MFCC + CMVN for Task 1.1)
Also, I have noticed that Kaldi recipes don't contain actual loops over all possible hyperparameter values, so I assume it's fine to run the for loop separately in the terminal?
A: First of all, I'd like you to understand why we ask you to submit your code (see Sections 1 and 2 in the assignment sheet). The scripts you submit should contain sufficient code for the marker to replicate the experiments you report. For example, 'exp_task1_1.sh' is supposed to contain the loops over all the hyperparameter values that you tried and report in your report; otherwise it is not considered evidence of your experiment. If you change parameter values manually outside the script, the script cannot be used as evidence of the experiment. Another reason we ask you to submit these scripts is to encourage you to write scripts that automate your experiments, so that you don't need to change parameters/conditions manually, which will save you a lot of time.
On the other hand, you don't need to include the code for data preparation or feature extraction, as long as you use the default features used in Lab-3 (word recognition).

Q: How many tasks are sufficient to get full marks for Task-3?
A: Task-3 is intended for students to carry out a mini research project and write a short scientific report. Marking is done based on the quality of the project and the report. Simply doing many tasks/experiments and showing only the results will not be awarded many marks. It is, on the other hand, possible for a single project with a single system to be awarded full marks if it is really good - that is, the project is of high quality and full of original ideas/insights, potentially publishable as a conference paper, for example. I'm looking forward to seeing unique and interesting projects.

--- Experiments with Kaldi -------------------------------------------

Q: Are there any Kaldi documentation sources other than the official one?
A: Old forums: https://sourceforge.net/p/kaldi/discussion/
New forums: http://kaldi-asr.org/forums.html
Apart from these sources, you can look at the source code itself and at the example recipes: https://github.com/kaldi-asr/kaldi/tree/master/egs

Q: How can I find the dimension of a feature vector?
A: Try the command 'feat-to-dim'.

Q: Should I use different experiment directories (e.g. exp1, exp2, ...) for different experiments?
A: Yes, that is safer, as it avoids errors / unexpected results / overwriting of previous results. Note that you should remove unnecessary files/directories once an experiment finishes in order to save disk space. Instead of creating new experiment directories, you can stick to a single experiment directory (e.g. exp), but then you should remove the contents of that directory before you run a new experiment.

Q: When I redo a decoding experiment with the same conditions, should I remove the old decoding directory (e.g. decode_test)?
A: Yes, you should remove it - Kaldi decoding scripts use cache files, which could cause unexpected effects on your subsequent runs.

Q: Where can I find WERs for word recognition experiments?
A: The scores are stored in files within the decode directory, named for example wer_10_0.0, wer_11_0.0, etc. The first number is the language model scaling factor (LMWT) and the second number is the word insertion penalty (WIP). If you open any of those files, e.g.
exp/word/mono/decode_test/wer_10_0.0
you will see a breakdown of WER, SER and insertion, deletion and substitution errors. You can use utils/best_wer.sh to automatically find the best one amongst the files above. Note that the scores we got in Lab 1 and Lab 2 were not WERs but in fact PERs (phone error rates), because the task was phone recognition rather than word recognition (which we did in Lab 3).

Q: When I ran the decoding phase:
steps/decode.sh --nj 4 exp/word/tri1/graph data/test_words exp/word/tri1/decode_test
the terminal said:
decode.sh: no such file exp/word/tri1/graph/HCLG.fst
What should I do?
A: You need to get the following done properly prior to decoding:
utils/mkgraph.sh data/lang_wsj_test_bg exp/word/tri1 exp/word/tri1/graph
This should create HCLG.fst if it finishes successfully.

Q: How many Gaussians is too many Gaussians?
A: As you increase the number of Gaussians, the WER on test data will decrease, but it will start increasing from a certain point (or region), which I would like you to find. I suggest that you first change the number exponentially and draw a graph of WER, which will give you an idea of what range you should try. You can then look into that range in more detail.

Q: How can I change the number of Gaussian components? How can I pass values to scripts?
A: You can specify the total number this way, for example:
steps/train_mono.sh --totgauss 1000
NB: this way of passing values to a script is only possible if the script defines the corresponding variable and contains the following line (after the declaration of the variable):
. parse_options.sh

Q: Where can I find log likelihoods on the training and test sets?
A: - For log likelihoods on a training set:
The original information can be found in align.*.*.log in the log directory, but it is given per job (data partition) when you use multiple jobs (normally nj==4). The summary (average log-likelihood) is shown in the output of the training script (e.g. train_mono.sh and train_deltas.sh), e.g.:
exp/word/tri: nj=4 align prob=-52.66 over 3.12h ....
(where "-52.66" is the average log-likelihood). The same information can be obtained by running
steps/info/gmm_dir_info.pl $dir
(where $dir is the experiment directory).
- For log likelihoods on a test set:
The original information for each job can be found in decode.JOB.log in decode_test/log, where JOB=1,...,nj. For example, there is a line:
LOG (gmm-latgen-faster:main():gmm-latgen-faster.cc:181) Overall log-likelihood per frame is -8.80672 over 15130 frames.
The average log-likelihood over all the jobs should be computed as a weighted average, to reflect the different number of frames in each job. If you are not sure how to compute this weighted average, it is fine to use the log-likelihood for a specific job (e.g. 1) or to calculate the simple arithmetic average over the jobs. In either case, you should clarify the approach taken in your report.

Q: How can we implement FBANK and PLP, and what feature dimensionality should we use for them?
A: You can find sample scripts in the 'steps' directory. You can decide the dimensionality yourself - you could choose the one most commonly used, or the same one as for MFCC. You should think about what conditions you should use if you'd like to make a fair comparison.
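The frame-weighted average of the per-job test-set log-likelihoods mentioned above can be computed with a short awk pipeline. A minimal sketch, assuming you have already extracted each job's "log-likelihood per frame" and frame count from decode_test/log/decode.JOB.log; the numbers below are made-up placeholders, not real results:

```shell
# Each input line: <per-frame log-likelihood> <number of frames>, one per job
# (placeholder values for illustration only).
printf '%s\n' \
  "-8.80672 15130" \
  "-8.91234 14870" \
  "-8.75000 15500" \
  "-8.88000 14500" |
awk '{ sum += $1 * $2; n += $2 }
     END { printf "avg log-likelihood per frame: %.5f over %d frames\n", sum / n, n }'
```

In practice you would feed this from something like grep 'Overall log-likelihood' decode_test/log/decode.*.log, with a sed/awk step matched to the exact log format shown above.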
--- Computing systems/environment ---------------------------------

Q: Saving scripts in steps
When I try to save a new script in the steps folder, I get a "Permission denied" error. Is there any way around this other than putting the scripts in an unlocked directory?
A: The steps, utils and local folders are shared among all students so that we can make updates if needed, so you have no write permission for them. For your own scripts, I would make another directory, for example my-local, in which to store them.

Q: When I log on to a DICE machine remotely with ssh, I cannot create files/directories and get a 'permission denied' error, even though I tried chmod 666.
A: It could be related to the Kerberos ticket for AFS. See:
http://computing.help.inf.ed.ac.uk/kerberos
http://computing.help.inf.ed.ac.uk/informatics-filesystem
Note that 'chmod' does not work on AFS file systems. A directory's AFS permissions can be checked with 'fs listacl'.

Q: How can I run a long job? My script was terminated - why?
A: See
http://computing.help.inf.ed.ac.uk/ssh-timeouts-home
http://computing.help.inf.ed.ac.uk/afs-top-ten-tips#Tip07

Q: I'd like to share my WorkDir with my partner. Is that possible?
A: Contact me (Hiroshi) with your partner's name and UUN, and I will change the AFS permissions so that you each have read permission on the other's WorkDir. If you would both also like write permission, let me know. Please note that the change will apply to all the directories under WorkDir.
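On the long-job question above: besides the pointers in the links, one standard Unix-level trick is to detach the job from the terminal with nohup, so that an ssh disconnect does not kill it. A minimal sketch - the echo stands in for a real experiment script (e.g. one kept in your my-local directory), and note this does not solve AFS/Kerberos ticket expiry, for which see the links above:

```shell
# Run the job immune to terminal hangups, capturing stdout/stderr in a log file.
nohup sh -c 'echo "experiment finished"' > run.log 2>&1 &
# Keep the PID so the job can be checked on or killed later.
job_pid=$!
wait "$job_pid"
grep "experiment finished" run.log
```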