IMPORTANT: Prerequisites: PMR is a prerequisite for this course, and MLPR is a co-requisite.
Welcome! The aim of this course is to learn to apply modern machine learning techniques, such as those discussed in IAML, MLPR, and PMR, to real-world problems. We will discuss data mining, visualization, how to evaluate your results, and a few practical algorithms for clustering and classification. A primary component of the course will be case studies: you will read (and present!) recent research papers from the machine learning and data mining literature. See the course descriptor for more information.
| Date | Event |
|---|---|
| 26 Jan, 4pm | Choose paper and group for presentation. Submit by email to the TA, Victor Hernandez-Urbina (j.v.hernandez-urbina@sms.ed.ac.uk). |
| 1 Feb, 12-14hrs | Lab 1: Data visualisation in R. |
| 8 Feb | Choose data set and group for miniproject. Submit by email to the TA, Victor Hernandez-Urbina (j.v.hernandez-urbina@sms.ed.ac.uk). |
| 8 Feb, 12-14hrs | Lab 2: LSA in R. LDA in MALLET. |
| 27 Feb (in class) | Paper presentations begin. |
| 29 Feb, 12-14hrs | Lab 3: Classification and evaluation in R. |
| 1 Mar, 4pm | Submit progress report for miniproject. |
| 29 Mar, 4pm | MINIPROJECT DUE. Submit to ITO. No extensions. |
N.B. Readings in the list below are examinable.
| Lecture | Date | Topic |
|---|---|---|
| 1 | 20 Jan | Introduction to the Course, Overview of Data Mining [slides]; Visualizing Data [slides 4up] [slides 1up]. Reading: HMS Chapters 1-3 |
| 2 | 27 Jan | Decision trees [slides 4up] [slides 1up]; Ensemble methods [slides 4up] [slides 1up]. Readings: Rob Schapire, boosting tutorial (Sections 4-8 not examinable); Leo Breiman, Bagging predictors, Machine Learning, 1996; Section 9.2 of Hastie, Tibshirani, and Friedman; HMS Section 10.5 |
| 3 | 3 Feb | Applications: Topic Models [slides] [slides 1up]. Readings: HTF, Section 14.5.1; Thomas Hofmann, Probabilistic Latent Semantic Analysis, Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (1999); Blei, Ng, and Jordan, Latent Dirichlet Allocation, Journal of Machine Learning Research 3 (2003) 993-1022 |
| 4 | 10 Feb | Topic Models (cont) |
| 5 | 17 Feb | Evaluation of Learning Algorithms [slides 4up] [slides 1up]. Readings: Fawcett, T. (2003). ROC Graphs: Notes and Practical Considerations for Researchers. HP Labs Tech Report HPL-2003-4; Dietterich, T. G. (1998). Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms. Neural Computation, 10 (7) 1895-1924; Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze (2008). Introduction to Information Retrieval, Cambridge University Press. Section 16.3 (only) |
| | 24 Feb | NO CLASS: Innovative Learning Week |
| 6 | 2 Mar | Collaborative Filtering. Paper presentations (x2, see schedule) |
| 7 | 9 Mar | Association Rules [slides] [slides 1up]. Reading: HMS Chapter 13. Paper presentations (x2, see schedule) |
| 8 | 16 Mar | Paper presentations (x3, see schedule) |
| 9 | 23 Mar | Paper presentations (x2, see schedule) |
This page is maintained by Charles Sutton and Victor Hernandez-Urbina.
Informatics Forum, 10 Crichton Street, Edinburgh, EH8 9AB, Scotland, UK
Tel: +44 131 651 5661, Fax: +44 131 651 1426, E-mail: school-office@inf.ed.ac.uk
Please contact our webadmin with any comments or corrections. Unless explicitly stated otherwise, all material is copyright © The University of Edinburgh.