# 1.7 2020/04/03 17:52:54

---- General questions ------------------------------------------------

Q: I cannot use Matlab on DICE remotely via the XRDP service.
A: Unfortunately you cannot use Matlab's graphical user interface (GUI) over XRDP, but you can use Matlab's command-line mode. If you need the GUI, you will need to install Matlab or Octave on your own computer. (NB: Octave is lightweight, free software with good Matlab compatibility.)

Q: Is there any difference between lecture notes/slides and Matlab in terms of representation of vector/matrix?
A: Yes, column vectors are commonly used in mathematical expressions (e.g. those in lecture notes and slides), whereas row vectors are normally used in Matlab to represent data. Notation may also differ. So it is important that you always check which type of vector/matrix representation is used.

Q: I've got a very small non-zero value, e.g. 2.6715E-16, which is supposed to be zero theoretically. Is this right?
A: Yes, there is almost always numerical error, e.g. rounding error, when you use computers. For example, eigenvectors of a symmetric matrix (such as a covariance matrix) that correspond to distinct eigenvalues are orthogonal, so the inner product between two such eigenvectors should be exactly zero in theory, but on a computer you will usually get a tiny non-zero value instead. (You can find more details on the internet.) Numerical error is, of course, taken into account when your coursework is checked with an auto-marking system.

---- Submissions ------------------------------------------------------

Q: What files should I submit?
A: Please see the 'list of files to submit' in the coursework page. NB: the list shows the minimum set of files and it does not include other code files such as MyCov.m, which you should also submit so that your code runs properly on standard DICE machines without the need for additional software.

---- Task 1 -----------------------------------------------------------

Q: Task 1.1 - why is Y needed for task1_1(X, Y)?
A: It is for debugging, and Y is not needed to calculate the covariance matrix of X.

Q: What is the right size of the covariance matrix S?
A: It should be D-by-D, where D=24.

Q: Task 1.2 - how much should I write in my report?
A: The task asks you to carry out a mini investigation project and report your findings in your own words. Please think about what findings you would like to report and what graphs are suitable to explain the findings to other people. Do you think other people want to see many graphs? I assume no more than one page for Task 1.2.

Q: Task 1.3 - Some of the eigenvalues I got are very small negative values.
A: If the matrix is rank-deficient, some eigenvalues are zero theoretically, but they could come out as very small positive or negative values due to the limited accuracy of floating-point calculation.

Q: Task 1.3 - what is 'cumulative variance'?
A: Please see the latest version of the coursework handout.

Q: Task 1.3 - How to find the minimum number of PCA dimensions to cover X%?
A: It refers to the formula shown at the bottom of slide 41 for Lecture 3. Let v[i] be the variance of the i-th principal component (i.e. the variance of the data on the i-th principal axis), and cumvar[i] be the cumulative sum of v[j] for j = 1,...,i. To find the minimum dimension for a coverage of 70%, for example, you find the minimum k such that cumvar[k]/cumvar[D] >= 0.7, where D is the dimension of the data, i.e. the number of variables.

Q: Task 1.4 - Can I hard-code the number of classes?
A: No, please determine the number of classes from the data passed to the function, task1_mgc_cv(). You can assume that class numbers are contiguous so that you can use max() or unique() to determine the number of classes. NB: this also applies to other numbers such as D (the number of features) and N (the number of samples).

Q: Task 1.4 - I have no idea how to create partitions from the data set.
A: Please see the supplemental document and example in the coursework page.
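
NB (Task 1.3, minimum number of PCA dimensions): the procedure in the answer above (find the minimum k such that cumvar[k]/cumvar[D] >= 0.7) could be sketched in Matlab/Octave roughly as follows. This is only an illustration: the variable names and the variance values are made up for the example, and v is assumed to be already sorted in descending order.

```matlab
% Sketch only: v is a hypothetical vector of PCA variances (eigenvalues),
% assumed to be sorted in descending order; the values are made up.
v = [4.0; 2.0; 1.5; 1.0; 0.5];
cumvar = cumsum(v);                 % cumvar(i) = v(1) + ... + v(i)
coverage = cumvar / cumvar(end);    % fraction of the total variance covered
target = 0.7;                       % example coverage threshold (70%)
k = find(coverage >= target, 1);    % smallest k with coverage(k) >= target
```

With these made-up values the cumulative variances are 4, 6, 7.5, 8.5 and 9, so the first two components cover about 67% and the first three about 83%, giving k = 3. Note that cumvar(end) plays the role of cumvar[D] and is taken from the data rather than hard-coded.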
Q: Task 1.4 - I have no idea how to create PMap.
A: Please see the supplemental document and example in the coursework page.

Q: Task 1.4 - Which normaliser should I use for the calculation of covariance matrix, 1/N or 1/(N-1)?
A: You should use covariance matrices with ML estimation. For details, please see lecture notes and slides.

Q: Task 1.4 - Should I apply the regularisation to all types of covariance matrix, regardless of CovKind?
A: Yes, please do so.

Q: Task 1.4 - Should the confusion matrix for each test partition p represent frequencies or relative frequencies?
A: Frequencies. NB: the final confusion matrix with k-fold CV should be represented in terms of relative frequency. For details, see the supplemental document for Task 1 in the coursework page.

---- Task 2 -----------------------------------------------------------

Q: The vertices of Polygon_A are shown in anti-clockwise order, whereas those of Polygon_B are in clockwise order, right?
A: Yes, they are. The correct information can be found in the latest version of the coursework handout - insufficient/wrong information was provided in the old versions 0.9 and 0.9.1.

Q: Task 2.1 - Why is 'W' a (D+1)-by-1 matrix, whereas 'X' is N-by-D?
A: The first element, W(1), represents the bias term. See lecture note 11 and slide 11 for Lecture 11. Please note that, inside your function, you need to use an augmented version of X so that the first element/column is 1.

Q: Task 2.1 - What is Y (i.e. the output of task2_hNeuron)?
A: Y(i) = h(a), where a is the inner/dot product of W and the augmented vector of the i-th input vector in X.

Q: Task 2.3 - I have no idea what to do.
A: Please see the slides for Lectures 11, 12 and 13.

Q: Task 2.3 - what is the size of X and Y in Y = task2_hNN_A(X)?
A: X is N-by-D, where D = 2 and N is the number of input data points. Y is an N-by-1 vector, whose i-th element is the output of the network for the i-th input data point in X.
NB: the same applies to task2_hNN_AB(X) and task2_sNN_AB(X), but it does not apply to task2_hNeuron(X) or task2_sNeuron(X), which should work generally for any D (natural number).

Q: Task 2.3 - is the periphery (boundary) of Polygon_A included in Class 1?
A: No, it was an error in the old handout (up to version 0.9.4). The periphery should not be included in Class 1, but in Class 0, so that only the inside of the polygon is Class 1. Apologies. In case you have already implemented the original definition, there is no need for you to modify the code, but please indicate so in the code file and in your report.
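
NB (Task 2.1, augmented X): the answers above say that W(1) is the bias, that X must be augmented with a leading 1, and that Y(i) = h(a). A minimal Matlab/Octave sketch of that computation is shown below. The values of W and X are made up, and the step function is assumed here to be h(a) = 1 if a > 0 and 0 otherwise; please check the handout and lecture slides for the actual definition, in particular for the case a = 0.

```matlab
% Sketch only: W, X and the threshold convention are illustrative assumptions.
W = [-1; 1; 1];              % (D+1)-by-1: W(1) is the bias, here D = 2
X = [0 0; 1 1; 0.2 0.3];     % N-by-D input, one data point per row
N = size(X, 1);              % N is taken from the data, not hard-coded
Xaug = [ones(N, 1), X];      % N-by-(D+1): prepend a column of ones
a = Xaug * W;                % N-by-1 activations, a(i) = Xaug(i,:) * W
Y = double(a > 0);           % ASSUMED step function: 1 if a > 0, else 0
```

For these made-up values the activations are -1, 1 and -0.5, so Y is [0; 1; 0]. Because the column of ones is built from size(X, 1), the same lines work for any N and any D, provided W has D+1 elements.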