MLPR class notes
This is an archive of a previous version of the course.
The 2017/18 notes are here.
You can step through the HTML version of these notes using the left and right arrows.
Each note links to a PDF version for better printing. However, if possible,
please annotate the HTML versions of the notes in the forum, to keep the
class's comments together. If the HTML notes don't render well for you, I
suggest trying in Chrome/Chromium.
A rough indication of the schedule is given, although we won’t follow it exactly.
- w0a – Course administration, html, pdf.
- w0b – Books useful for MLPR, html, pdf.
- w0c – MLPR background self-test, html, pdf. Answers: html, pdf.
- w0d – Maths background for MLPR, html, pdf.
- w0e – Programming in Matlab/Octave or Python, html, pdf.
- w0f – Expectations and sums of variables, html, pdf.
- w1a – Course Introduction, html, pdf.
- w1b – Linear regression, html, pdf.
- w1c – Linear regression, overfitting, and regularization, html, pdf.
- w2a – Training, Testing, and Evaluating Different Models, html, pdf.
- w2b – Univariate Gaussians, html, pdf. Answers: html, pdf.
- w2c – The Central Limit Theorem (CLT), html, pdf. Answers: html, pdf.
- w2d – Error bars, html, pdf.
- w2e – Multivariate Gaussians, html, pdf.
- w3a – Classification: Regression, Gaussians, and pre-processing, html, pdf.
- w3b – Regression and Gradients, html, pdf.
- w3c – Logistic Regression, html, pdf.
- w4a – Softmax and robust regressions, html, pdf.
- w4b – Neural networks introduction, html, pdf.
- w4c – More on fitting neural networks, html, pdf.
- w5a – Backpropagation of Derivatives, html, pdf.
- w5b – Autoencoders and Principal Components Analysis (PCA), html, pdf.
- w6a – Netflix Prize, html, pdf.
- w6b – Bayesian regression, html, pdf.
- w6c – Bayesian inference and prediction, html, pdf.
- w8a – Bayesian logistic regression and Laplace approximations, html, pdf.
- w8b – Computing logistic regression predictions, html, pdf.
- w8c – Variational objectives and KL Divergence, html, pdf.
- w10a – Sparsity and L1 regularization, html, pdf.
- w10b – More on optimization, html, pdf.
- w10c – Ensembles and model combination, html, pdf.
A coarse overview of the major topics covered is below. Some principles aren't
taught on their own because they're useful in multiple contexts. These include
gradient-based optimization, different regularization methods, ethics, and
practical choices such as feature engineering or numerical implementation.
- Linear regression and ML introduction
- Evaluating and choosing methods from the zoo of possibilities
- Multivariate Gaussians
- Classification, generative and discriminative models
- Neural Networks
- Learning low-dimensional representations
- Bayesian machine learning: linear regression, Gaussian processes and kernels
- Approximate Inference: Bayesian logistic regression, Laplace, Variational
- Gaussian mixture models
- Time allowing, other principles: sparsity/L1, and ensembles (combination vs averaging).