The course is structured into three core components:
- Lectures and computer labs in weeks 1 to 5, with a class test in week 6
- Presentations of research papers from week 7 to 11
- Independent work on a mini-project from the end of week 5 onwards
Your grade is determined by your performance on all three components of the course and, to some extent, by your overall engagement.
Note that the paper presentation and mini-project components are open only to students who are taking this course for credit. Please provide details for those components (through the forms) only if you are taking the course for credit.
Lectures and labs
The course is based on comprehensive lecture notes, supporting lecture recordings and computer labs. The labs will allow you to gain some intuitive understanding of the methods introduced in the lectures and provide you with practical tools for the mini-project.
The main topics covered in the lectures are:
- Numerical and visual summaries of the data
- Preprocessing and principal component analysis (PCA)
- Dimensionality reduction (e.g. by PCA, multidimensional scaling, Isomap, t-SNE)
- Evaluating performance in predictive modelling (e.g. classification and regression) and techniques for choosing hyper-parameters
See the lectures page for the lecture materials, the weekly schedule, and information on the class test.
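To give a flavour of the last topic in the list above, here is a minimal sketch of choosing a hyper-parameter by k-fold cross-validation. It is not taken from the course materials: the toy data, the choice of k-nearest-neighbour regression, and the helper names `k_fold_indices`, `knn_predict` and `cv_error` are all assumptions made for illustration, and only the Python standard library is used.

```python
import random


def k_fold_indices(n, n_folds, seed=0):
    """Shuffle the indices 0..n-1 and split them into n_folds nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def knn_predict(train, x, k):
    """Predict y at x as the mean y of the k nearest training points (1-D inputs)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def cv_error(data, k, n_folds=5):
    """Mean squared validation error of k-NN regression, averaged over the folds."""
    fold_errors = []
    for fold in k_fold_indices(len(data), n_folds):
        held_out = set(fold)
        # Fit on everything outside the current fold, evaluate on the fold itself.
        train = [p for i, p in enumerate(data) if i not in held_out]
        squared = [(knn_predict(train, data[i][0], k) - data[i][1]) ** 2 for i in fold]
        fold_errors.append(sum(squared) / len(squared))
    return sum(fold_errors) / len(fold_errors)


# Toy data, made up for illustration: noisy samples of y = x^2 on [-2, 2].
rng = random.Random(1)
data = [(x / 10, (x / 10) ** 2 + rng.gauss(0, 0.1)) for x in range(-20, 21)]

# Candidate values of the hyper-parameter k (number of neighbours).
candidates = [1, 3, 5, 9, 15]
scores = {k: cv_error(data, k) for k in candidates}
best_k = min(scores, key=scores.get)
print("cross-validated MSE per k:", scores)
print("selected k:", best_k)
```

The same pattern (split, fit on the training folds, score on the held-out fold, average, pick the best candidate) applies to other models and other hyper-parameters. In practice a library implementation would normally replace these hand-written helpers.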
Paper presentation
In the second half of the course, we will have presentations on research papers from a number of Data Science areas.
The list of areas, together with detailed instructions and further information on the presentations, can be found on the presentations page.
Mini-project
The goal of the mini-project is to apply data mining methods to a real data analysis problem and to produce a written report summarising your findings. You can choose among different datasets listed here.
Please have a look at the mini-projects page for detailed instructions and information on the format of the report.