Course details


Based on a previous version by Michael Gutmann.

Timetable

Semester week  Date (place)         Activity        Date (place)      Activity
wk1                                                 Thu 17/01 (ALT)   lecture 1
wk2            Wed 23/01 (AT-6.06)  lab 1           Thu 24/01 (ALT)   lecture 2
wk3            Wed 30/01 (AT-6.06)  lab 2           Thu 31/01 (LG34)  lecture 3
wk4            Wed 06/02 (AT-6.06)  lab 3           Thu 07/02 (LTC)   lecture 4
wk5            Wed 13/02 (AT-6.06)  lab 4           Thu 14/02 (LTC)   lecture 5
wk6
wk7            Wed 06/03 (AT-6.06)  poster session  Thu 07/03 (F.21)  poster session
wk8            Wed 13/03 (AT-6.06)  poster session
wk9            Wed 20/03 (AT-6.06)  poster session  Thu 21/03 (F.21)  poster session
wk10
wk11                                                Thu 04/04 (LTC)   Recap, Q&A

Important dates

Deadline for your paper preference    Fri 8 Feb 2019, 4pm
Deadline for your project info        Fri 15 Feb 2019, 4pm
Poster PDF deadline                   Mon 25 Feb 2019, 9am
Mini-project interim report deadline  Tue 12 Mar 2019, 4pm
Mini-project final report deadline    Fri 5 Apr 2019, 4pm
Exam                                  Exam diets

Labs (weeks 2-5), poster sessions (weeks 7-9):
Wednesdays: 09:00 - 10:50 (group 1), 11:10 - 13:00 (group 2)
Appleton Tower, room 6.06

Lectures (weeks 1-5, 11), poster sessions (weeks 7, 9):
Thursdays: 15:10 - 17:00
Medical School, Room 425, Anatomy Lecture Theatre (ALT) (weeks 1, 2)
Paterson's Land, LG34 (week 3)
David Hume Tower, Lecture Theatre C (LTC) (weeks 4, 5, 11)
7 George Square, F.21 (weeks 7, 9)

Lectures

The lectures are accompanied by lecture notes, which will be updated as the course progresses.

  • Lecture 1
    Introduction to the data analysis process, simple descriptions and preprocessing of data
    opening slides
  • Lecture 2
    Principal component analysis
  • Lecture 3
    Probabilistic PCA, dimensionality reduction by PCA
  • Lecture 4
    Dimensionality reduction by kernel PCA, multidimensional scaling, isomap
  • Lecture 5
    Evaluating performance in predictive modelling (e.g. classification and regression), techniques for choosing hyperparameters

Computer labs

The course has four computer labs on topics introduced in the lectures. The labs let you experiment with different methods to build intuition, and they provide practical tools for the mini-project. There is a GitHub repository for the labs.

  • Lab 1 on simple data descriptions and preprocessing
  • Lab 2 on principal component analysis
  • Lab 3 on dimensionality reduction
  • Lab 4 on performance evaluation and hyperparameter/model selection

Poster presentations

In the second half of the course, we will have poster presentations on some of the papers listed on the papers page. Feel free to propose papers yourself, but please check their suitability with the lecturer first.

Detailed instructions and information on the format of the presentations can be found on the papers page.

Mini-projects

The goal of the mini-project is to apply data mining methods to a real dataset. We have a list of potential datasets (the same as for the IRDS course). For each dataset, the web page describes the task to be undertaken. You will produce a project report that will be assessed.

Please have a look at the mini-projects page for detailed instructions and information on the format of the report.

Course grade

Your total course grade is made up as follows: exam 50%, mini-project 35%, poster presentation 15%.