Information theory, Autumn 2013

Archive of old course material: See the current course page.

Information theory describes the fundamental limits on our ability to store, process and communicate data, whether in natural or artificial systems. Understanding and approaching these limits is important in a wide variety of topics in informatics, including compression, communication, and learning from data. For more information, see the 5 min introductory video and preparation advice, and the course catalogue entry.

Lectures: Tuesdays and Fridays both at 14:10–15:00, in D.02 Forrest Hill [map 1, map 2]
First lecture: Tuesday, 17 September 2013.
Tutorials: Tuesdays 3:10pm, Thursdays 10am, or Fridays 3:10pm, starting Thursday 3 October. Question sheets.
Lecturer: Iain Murray

No lecture: Friday, 11 October 2013 (lecturer at exam meeting).
No tutorials: Tuesday 1 October, or Thursday 10 and Friday 11 October 2013 (week 4).
Last lecture / review: probably Friday, 22 November 2013.

Information for PhD students wishing to audit the course.

Quick links: News · Help and feedback · Coursework and Assessment · Reading · Course notes

News

I email the whole class occasionally for important announcements, but you should check the course website regularly.

Selected news:

2013-12-01: Assignment feedback posted, pick up your script with personalized feedback from the ITO.

2013-11-15: Tutorial 6 answers posted, and on NB.

2013-10-21: Assignment and Tutorial 4 posted, and both are on NB.

2013-09-27: Office hour changed to Wednesdays 12:10–1pm. Extra tutorial slot announced.

2013-09-24: First tutorial released. Tutorials will now start in week 3, on Thursday 3 / Friday 4 October. There will be no tutorial the following week.

Help and feedback

Written feedback will be given with the assessed assignment (see below). However, as with all MSc courses, most of your time should be spent on wider study of the course. In each lecture we will discuss some of the examinable material, and review previous material. The tutorials are even more interactive, and a time to check that your problem solving is up to standard before the assignment and exam.

At times you may find that you need private feedback or help. If you would like detailed, personalized feedback on your answers to any questions from the tutorials or textbook, you can give me any of your unassessed written work at any time. It will be returned within 4 days. You can contact me by email, or in person at my office hour (Wednesdays 12:10–1pm).

Important exception: I do not meet students, or guarantee prompt replies, in the final week before the exam. This year I won’t be in Edinburgh, or have particularly good access to email from 9 December. Ask your questions before then if you can!

I find it really useful to hear how I can improve the delivery of the course. Please feel free to contact me directly with constructive criticism, as well as with your questions. I used to have the class elect representatives for anonymous feedback, but they never contacted me. If you would like to send anonymous comments, send them via the MSc class reps, or in writing via the ITO. At the end of the course, please fill out the course questionnaire. BUT don't leave it until then to give me your feedback! Please email about any problems as soon as they arise.

Coursework and Assessment

Obtain and do the tutorial questions before each tutorial. Tutorials start on Thursday of week 3. Unlike the assignment, tutorial work is not used to decide your final mark. You may, and are encouraged to, discuss the tutorial work with your class-mates.

The assessed assignment was made available on 21 October and is due Thursday 21 November, 4pm, by in-person submission to the ITO. Marks and feedback will be available from the ITO by 5 December, probably earlier, but not until after anyone with an extension has submitted.

Answering some of the assignment questions will require some programming in the language of your choice; the code itself will not be assessed. Programming issues special to this course will be discussed in lectures and tutorials, but you will not be taught a programming language. You must be prepared to write code on your own to take this course.

Past and mock IT exams are available. Informatics and the University also archive some past exam papers (accessible from within the University), but there are usually fewer IT papers available there. Unlike most courses, the exam will be in the December diet. You must be available in Edinburgh to sit this exam.

By request, I created some optional Review Questions (2-up).

Reading

The recommended course text is: Information Theory, Inference, and Learning Algorithms, David MacKay, CUP 2003. Some of the lecture notes make cross-references to this text, and some tutorial exercises will come from it. This book is freely available online, although I strongly advise buying a hard copy unless you have a good way to read e-books.

Those who like to absorb material through a series of theorems and lemmas may prefer Elements of Information Theory (Second Ed.), Cover and Thomas, Wiley 2006. Purchasing this book is not required.

You will need to know the mathematics in sections 1 and 3 of my cribsheet (2-up), which I made for another course, and you will need a thorough understanding of logarithms! You should also review expectations and summations of random variables (2-up).

See also: Shannon’s seminal paper (very readable), and UCSD’s video on the legacy of Claude Shannon.

Some programming notes relevant to the course.

Course notes and outline

More information on the course and its motivation can be found in the official course catalogue entry and the original case for support.

Each item below corresponds roughly to a week of lectures, but I won’t stick exactly to the advertised schedule. The relevant chapters refer to the recommended course text. Some short, unofficial Python sketches illustrating several of the topics appear after the outline.

The lecture notes are slide sets with some additional denser notes. In the lectures I will present most of the material on the whiteboard instead of using the slides. There are “4-up” and “9-up” contact sheets of the slides to save paper when printing. If you want to format these differently you could tweak the make4up/make9up shell scripts that made them (or try the GUI print settings in your PDF viewer).

  1. Introduction to Information Theory: digital communication, error correcting codes, representing data, compression, counting large sets.
    The magnetic media inside hard-disks frequently corrupts bits. How is reliable storage possible? How should we go about making the most efficient use of the bits available?
    Chapter 1, pp1–8, and pp14–15.
    Slides, 4-up, 9-up.
  2. Measuring and representing information. Probabilities over many variables. The typical set. Information content. Entropy.
    What is “information”? Quantifying information and knowing its mathematical properties should tell us something about the resources we need to store it and move it around, and what can be achieved with the information we have.
    Chapter 4 covers the same material, although the notation and derivation of the Source Coding Theorem are slightly different from the slides. You won’t be asked to regurgitate definitions of things like Tm from the slides or Hδ from the book, but you do need to understand and be able to apply the ideas.
    Slides, 4-up, 9-up.
    Notes on expectations and summations, 2-up.

  3. Practical compression 1: symbol codes. Huffman codes.
    How can we compress a source down to (close to) its entropy in practice? We start with symbol coding, which gives the answer in some restricted circumstances.
    Chapters 2, 5.
    Slides, 4-up, 9-up.

  4. Practical compression 2: stream codes. Probabilistic and human prediction. Arithmetic coding. Dasher.
    We move on to state-of-the-art compression. How can we overcome all(!) the limitations of the symbol codes we have considered so far? By using more powerful models, and by grouping large numbers of symbols together.
    Chapter 6.
    Slides, 4-up, 9-up.

  5. Modelling and inference for stream codes: Beta and Dirichlet predictions. Models for text and images. PPM.
    Information theory shows that compression and (Bayesian) probabilistic methods are intimately linked in theory. How can we build models for compression?
    Chapters 3, 6.
    Slides, 4-up, 9-up.

  6. Communication channels and information: A framework for communication. Redundancy. Models of channels. Back channels. Information theoretic quantities.
    Chapters 1, 2, 8, 9, 10.
    Slides, 4-up, 9-up.

  7. Noisy channel coding. Check digits. Block codes on extended channels; Hamming Codes. The noisy channel coding theorem. Erasure channels and digital fountain codes.
    Chapters 1, 9, 10, 50.
    Slides, 4-up, 9-up.

  8. Theory review and sparse graph codes. Review of information theory proofs. Low-density parity-check codes for the binary symmetric channel, and belief propagation.
    Chapters 10, 13, 26, 47.
    Slides, 4-up, 9-up.

  9. Sending less than the information content: rate distortion, hash codes.
    If you can only send N/3 bits, can you do better than dropping 2N/3 bits of your file on the floor? How do web browsers alert you to bad sites without you needing a list of the bad sites, or telling a server which site you are visiting? What are some active research topics?
    Chapters 10, 12. Slides, 4-up, 9-up.
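
Some short, unofficial code sketches in Python follow, illustrating several topics from the outline. They are not part of the lecture notes or the assessment, and every distribution, parameter, and name in them is made up for illustration. First, the week 1 material: the R3 repetition code sent over a binary symmetric channel, with majority-vote decoding. For small flip probability f this cuts the bit error rate from f to roughly 3f^2.

    import random

    def transmit(bits, f):
        # Binary symmetric channel: flip each bit independently with probability f.
        return [b ^ (random.random() < f) for b in bits]

    def encode_r3(bits):
        # Send every source bit three times.
        return [b for b in bits for _ in range(3)]

    def decode_r3(received):
        # Majority vote over each block of three received bits.
        return [int(sum(received[i:i + 3]) >= 2) for i in range(0, len(received), 3)]

    random.seed(0)
    f, n = 0.1, 100000
    source = [random.randint(0, 1) for _ in range(n)]
    decoded = decode_r3(transmit(encode_r3(source), f))
    errors = sum(s != d for s, d in zip(source, decoded))
    print(f"simulated bit error rate: {errors / n:.4f}")
    print(f"theory, 3f^2(1-f) + f^3:  {3 * f**2 * (1 - f) + f**3:.4f}")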
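For week 2, a sketch of the Shannon information content, h(x) = log2(1/p(x)), and the entropy H(X) as its average, for a small made-up distribution:

    from math import log2

    def information_content(p):
        # Shannon information content h(x) = log2(1/p(x)), in bits.
        return -log2(p)

    def entropy(ps):
        # Entropy H(X): the average information content of an outcome.
        return sum(p * information_content(p) for p in ps if p > 0)

    ps = [0.5, 0.25, 0.125, 0.125]               # made-up distribution
    print([information_content(p) for p in ps])  # [1.0, 2.0, 3.0, 3.0]
    print(entropy(ps))                           # 1.75 bits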
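For week 3, a sketch of Huffman code construction: repeatedly merge the two least probable symbols, prefixing one side's codewords with 0 and the other's with 1. The example distribution is dyadic, so the expected code length equals the entropy exactly.

    import heapq

    def huffman(probs):
        # Build a binary Huffman code for a {symbol: probability} dict.
        # Heap entries: (probability, tie-break counter, {symbol: partial code}).
        heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
        heapq.heapify(heap)
        count = len(heap)
        while len(heap) > 1:
            p0, _, code0 = heapq.heappop(heap)   # two least probable nodes
            p1, _, code1 = heapq.heappop(heap)
            merged = {s: "0" + c for s, c in code0.items()}
            merged.update({s: "1" + c for s, c in code1.items()})
            heapq.heappush(heap, (p0 + p1, count, merged))
            count += 1
        return heap[0][2]

    probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}   # made up
    code = huffman(probs)
    print(code)
    print(sum(probs[s] * len(c) for s, c in code.items()))  # 1.75 = H(X) here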
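For week 4, a toy version of the interval-narrowing idea behind arithmetic coding. Real coders work incrementally to finite precision and use adaptive models; this sketch uses exact fractions and a fixed made-up model, just to show that the message is pinned down by about log2(1/P(message)) + 2 bits.

    from fractions import Fraction
    from math import ceil, log2

    def encode(symbols, probs):
        # Assign each symbol a sub-interval of [0, 1) by cumulative probability.
        cum, c = {}, Fraction(0)
        for s, p in probs.items():
            cum[s] = (c, c + Fraction(p))
            c += Fraction(p)
        # Narrow the interval once per symbol.
        lo, width = Fraction(0), Fraction(1)
        for s in symbols:
            a, b = cum[s]
            lo, width = lo + width * a, width * (b - a)
        # Any binary fraction inside [lo, lo + width) identifies the message;
        # ceil(log2(1/width)) + 1 bits are always enough.
        nbits = ceil(-log2(width)) + 1
        point = lo + width / 2
        return "".join(str(int(point * 2 ** (i + 1)) % 2) for i in range(nbits))

    probs = {"a": "1/2", "b": "1/4", "c": "1/4"}   # fixed, made-up model
    print(encode("aab", probs))   # 5 bits; log2(1/P("aab")) = 4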
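For week 5, the sort of adaptive prediction that can drive an arithmetic coder: a symmetric Dirichlet (add-alpha, or Laplace) rule over a small fixed alphabet. Summing log2(1/p) over the sequence of predictions gives the ideal compressed length.

    from math import log2

    def predictive_prob(counts, s, alpha=1.0):
        # P(next = s | counts) under a symmetric Dirichlet(alpha) prior:
        # (count(s) + alpha) / (total + alpha * alphabet_size).
        total = sum(counts.values())
        return (counts.get(s, 0) + alpha) / (total + alpha * len(counts))

    counts = {"a": 0, "b": 0, "c": 0}   # fixed three-symbol alphabet
    bits = 0.0
    for s in "abaab":
        bits += log2(1 / predictive_prob(counts, s))
        counts[s] += 1
    print(f"ideal compressed length of 'abaab': {bits:.2f} bits")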
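For week 6, the binary entropy function and the capacity of the binary symmetric channel, C = 1 - H2(f) bits per channel use, which is achieved with uniform inputs:

    from math import log2

    def h2(p):
        # Binary entropy function H2(p), in bits.
        if p in (0.0, 1.0):
            return 0.0
        return -p * log2(p) - (1 - p) * log2(1 - p)

    def bsc_capacity(f):
        # Mutual information I(X;Y) is maximised by uniform inputs,
        # giving C = 1 - H2(f).
        return 1.0 - h2(f)

    for f in [0.0, 0.01, 0.1, 0.5]:
        print(f"f = {f}: C = {bsc_capacity(f):.3f} bits per channel use")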
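For week 7, the (7,4) Hamming code from chapter 1: four source bits plus three parity bits, with syndrome decoding that locates and corrects a single flipped bit because each column of the parity-check matrix is distinct. The particular parity checks below follow MacKay's convention; other presentations permute the bits.

    H = [[1, 1, 1, 0, 1, 0, 0],   # parity-check matrix; every column is
         [0, 1, 1, 1, 0, 1, 0],   # distinct and non-zero, so one flipped
         [1, 0, 1, 1, 0, 0, 1]]   # bit is located by its syndrome

    def encode(s):
        # Four source bits followed by three parity bits.
        return s + [(s[0] + s[1] + s[2]) % 2,
                    (s[1] + s[2] + s[3]) % 2,
                    (s[0] + s[2] + s[3]) % 2]

    def decode(r):
        # Syndrome = H r mod 2; it equals the column of H where a single
        # flip happened (all-zero if the word checks out). Assumes at most
        # one flip: two or more errors will be miscorrected.
        syndrome = [sum(h * b for h, b in zip(row, r)) % 2 for row in H]
        if any(syndrome):
            columns = [[row[i] for row in H] for i in range(7)]
            r = r[:]
            r[columns.index(syndrome)] ^= 1
        return r[:4]

    t = encode([1, 0, 1, 1])
    t[2] ^= 1          # corrupt one bit in the channel
    print(decode(t))   # [1, 0, 1, 1] recovered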
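For week 9, a sketch of the hash-code idea behind checking a "bad site" list without storing or revealing the list itself: a Bloom filter, which never gives false negatives and has a small, tunable false-positive rate. The filter size, number of hashes, and hashing scheme here are arbitrary illustrative choices.

    import hashlib

    class BloomFilter:
        def __init__(self, m=1024, k=4):
            # m-bit table, k hash functions (both made-up defaults).
            self.m, self.k, self.bits = m, k, bytearray(m)

        def _positions(self, item):
            # Derive k table positions by hashing the item with k salts.
            for i in range(self.k):
                digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
                yield int.from_bytes(digest[:8], "big") % self.m

        def add(self, item):
            for pos in self._positions(item):
                self.bits[pos] = 1

        def __contains__(self, item):
            # All k bits set => "probably in the set"; any bit clear =>
            # definitely not in the set.
            return all(self.bits[pos] for pos in self._positions(item))

    bad = BloomFilter()
    bad.add("evil.example.com")
    print("evil.example.com" in bad)   # True
    print("fine.example.com" in bad)   # False (with high probability)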

All the slides/notes in a single PDF: 1-up, 4-up, 9-up.
In case you want to merge some slides yourself: pdftk can merge the slides, then optionally 4-up or 9-up.


This page maintained by Iain Murray.
Last updated: 2017/05/03 08:49:00

