IMPORTANT: Prerequisites: PMR is a prerequisite for this course; MLPR is a co-requisite.
Welcome! The idea behind this course is to learn to apply modern machine learning techniques, such as those discussed in IAML, MLPR, and PMR, to real-world problems. We will discuss data mining, visualization, how to evaluate your results, and a few practical algorithms for clustering and classification. A primary component of the course will be case studies: you will read (and present!) recent research papers from the machine learning and data mining literature. See the course descriptor for more information.
There will be three lab classes. These are intended to give a brief introduction to some software packages that you might decide to use in the mini-project. They are not marked, but I encourage you to attend. They should not be too difficult. There is no formal allocation to the two lab sessions; simply show up to whichever session suits you.
The lab sheets will be posted below prior to the lab sessions.
| Date | Task |
|---|---|
| 29 Jan, 4pm | Choose paper and group for presentation. Submit to TA via email (see address at top of page). |
| 11 Feb | Choose data set and group for miniproject. Submit to TA via email (see address at top of page). |
| 28 Feb, 4pm | Submit progress report for miniproject. |
| 1 March (in class) | Paper presentations begin. |
| 21 Mar, 4pm | MINIPROJECT DUE. Submit to ITO. No extensions. |
N.B. Readings in the list below are examinable. The lecture slides are available from the NB discussion site.
| Week | Date | Topic and readings |
|---|---|---|
| 1 | 18 Jan | Introduction, Overview of Data Mining, Visualizing Data |
| 2 | 25 Jan | Decision trees, Ensemble methods. Reading: Rob Schapire boosting tutorial (Sections 4-8 not examinable). Reading: Leo Breiman, Bagging predictors, Machine Learning, 1996. Reading: Section 9.2 of Hastie, Tibshirani, and Friedman (HTF). Additional reading: Murphy, Sections 16.1, 16.2 (skip 16.2.5 and 16.2.6), 16.4.1, 16.4.3. |
| 3 | 1 Feb | Evaluation of Learning Algorithms. Lab 1 this week (29 Jan). Reading: Fawcett, T. (2003). ROC Graphs: Notes and Practical Considerations for Researchers. HP Labs Tech Report HPL-2003-4. Reading: Dietterich, T. G. (1998). Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms. Neural Computation, 10(7), 1895-1924. |
| 4 | 8 Feb | Applications: Topic Models. Reading: HTF, Section 14.5.1. Reading: Thomas Hofmann, Probabilistic Latent Semantic Analysis, Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (1999). Reading: Blei, Ng, and Jordan, Latent Dirichlet Allocation, Journal of Machine Learning Research 3 (2003), 993-1022. Reading: Murphy, 27.3.1, 27.3.2, 27.3.3. Lab 2 this week (5 Feb). |
| 5 | 15 Feb | Finish topic models. Association Rules. Reading: HTF 14.2. Lab 3 this week (12 Feb). |
| | 22 Feb | NO CLASS: Innovative Learning Week |
| 6 | 1 Mar | Paper presentations |
| 7 | 8 Mar | Paper presentations |
| 8 | 15 Mar | Paper presentations |
| 9 | 22 Mar | Paper presentations |
This page is maintained by Charles Sutton.