Information theory, Autumn 2014

Archive of old course material: See the current course page.

Information theory describes the fundamental limits on our ability to store, process and communicate data, whether in natural or artificial systems. Understanding and approaching these limits is important in a wide variety of topics in informatics, including compression, communication, and learning from data. For more information, see the 5 min introductory video and preparation advice, and the course catalogue entry.

Are you up-to-date with the lecture log and tutorials? Ask questions on NB.

Lectures: Tuesdays 14:10–15:00, Appleton Tower LT1. Fridays 14:10–15:00, Appleton Tower 2.14.
First lecture: Tuesday, 16 September 2014.
Tutorials: Weeks 4–9 . Details and question sheets.
Lecturer: Iain Murray

No lecture: Friday, 3 October 2014 (lecturer at exam meeting).
Last lecture: probably Friday, 21 November 2014.

Information for PhD students wishing to audit the course.

Quick links: News — Help and feedback — Coursework and Assessment — Reading — Course notes

News

I email the whole class occasionally for important announcements, but you should check the course website regularly.

Selected news:

2014-10-20: assignment released.

2014-10-16: All tutorial sheets now on the website and class forum.

2014-10-15: No office hour on Wed 5 November. I'll ensure I'm early to the following Friday's lecture (as I usually am) to answer questions.

2014-10-14: I've updated the week 4 notes with a review of the product/chain rule, and better diagrams.

2014-10-10: Tutorial 1 answers, and sheet 3 posted.

2014-10-04: Second tutorial sheet posted.

2014-09-17: New experiment: lecture log.

2014-09-16: Please sign up for the class forum (and the class itself!).

Help and feedback

Written feedback will be given with the assessed assignment (see below). However, as with all courses, most of your time should be spent on wider study of the course. Apart from the assignment and exam marks, all other opportunities for feedback (described below) are “formative”: intended to help you develop skills and knowledge rather than directly affecting your mark. There are many ways to get such feedback, but they require you to engage with the class.

In each lecture we will discuss some of the examinable material, and review previous material. Ask questions, in person and on NB. The tutorials should be even more interactive, and a time to check that your problem solving is up to standard before the assignment and exam.

At times you may find that you need private feedback or help. If you would like detailed, personalized feedback on your answers to any questions from the tutorials or textbook, you can give me any of your unassessed written work at any time. It will be returned within 7 days. Ways to contact me include:

There is usually some time to catch me in person before the start of each Tuesday lecture.
You can drop by my office (IF 1.46) during office hour: Wednesdays 12:10–1pm, weeks 5,6,7,9,10 during Semester 1 (Not 5 Nov). For weeks 2 and 3, I'll hold office hour on 10am on Mondays 22 and 29 September. I’ll also try to be early to the Friday lecture to field any questions. If you can’t make these times, email me to ask for another.

Important exception: I do not meet students, or guarantee prompt replies, in the final week before the exam.

I find it really useful to hear how I can improve the delivery of the course. Please feel free to contact me directly with constructive criticism, as well as your questions. I used to elect class representatives for anonymous feedback, but they never contacted me. If you would like to send anonymous comments, send them via your year’s class reps, or in writing via the ITO. At the end of the course, please fill out the course questionnaire. BUT don't leave it until then to give me your feedback! Please email about any problems as soon as they arise.

Coursework and Assessment

Obtain and do the tutorial questions before each tutorial. Tutorials start week 4. Unlike the assignment, tutorial work is not used to decide your final mark. You may, and are encouraged to, discuss the tutorial work with your class-mates.

The assessed assignment was made available on 20 October and is due Thursday 20 November 2014, 4pm, by in-person submission to the ITO. Marks and feedback will be available from the ITO by 4 December, probably earlier, but not until after anyone with an extension has submitted.

Answering some of the assignment questions will require some programming in the language of your choice, the code itself will not be assessed. Programming issues special to this course will be discussed in lectures and tutorials, but you will not be taught a programming language. You must be prepared to write code on your own to take this course.

Past and mock IT exams are available. Informatics and the University also archive some past exam papers (accessible from within the University), but there are usually fewer IT papers available there. In previous years, the exam was in the December diet. New for 2014: the exam will now be in the main May diet. (Now only visiting students can take a December exam.)

By request, I created some optional Review Questions (2up).

Reading

The recommended course text is: Information Theory, Inference, and Learning Algorithms, David MacKay, CUP 2003. Some of the lecture notes make cross-references to this text, and some tutorial exercises will come from it. This book is freely available online, although I strongly advise buying a hard copy unless you have a good way to read e-books.

Those who like to absorb material through a series of theorems and lemmas may prefer Elements of Information Theory (Second Ed.), Cover and Thomas, Wiley 2006. Purchasing this book is not required.

You will need to know the mathematics in sections 1 and 3 of my cribsheet (2-up), which I made for another course. And have a thorough understanding of logarithms! You should also review expectations and summations (2-up) of random variables.

Shannon’s seminal paper (very readable) and UCSD’s video on the legacy of Claude Shannon.

Some programming notes relevant to the course.

You can ask questions about some of the readings on the class forum.

Course notes and outline

More information on the course and its motivation can be found in the official course catalogue entry and the original case for support.

Each item below corresponds roughly to a week of lectures, but I won’t stick exactly to the advertised schedule. The relevant chapters refer to the recommended course text.

You can ask questions about these and other notes on the class forum.

The lecture notes are slide sets with some additional denser notes. In the lectures I will present most of the material on the whiteboard instead of using the slides. There are “4-up” and “9-up” contact sheets of the slides to save paper when printing. If you want to format these differently you could tweak the make4up/make9up shell scripts that made them (or try the GUI print settings in your PDF viewer).

Introduction to Information Theory: digital communication, error correcting codes, representing data, compression, counting large sets.
The magnetic media inside hard-disks frequently corrupts bits. How is reliable storage possible? How should we go about making the most efficient use of the bits available?
Chapter 1, pp1–8, and pp14–15.
Slides, 4-up, 9-up.
Measuring and representing information. Probabilities over many variables. The typical set. Information content. Entropy.
What is “information”? Quantifying information and knowing its mathematical properties should tell us something about the resources we need to store it and move it around, and what can be achieved with the information we have.
Chapter 4 covers the same material. The notation and derivation of the Source Coding Theorem is slightly different from the slides. You won’t be asked to regurgitate definitions of things like T_m from the slides or H_δ from the book. You do need to understand and be able to apply the ideas.
Slides, 4-up, 9-up.
Notes on expectations and summations, 2-up.
An overview of the source coding theorem, 2-up.
Practical compression 1: symbol codes Symbol codes. Huffman codes.
How can we compress (close to) the entropy of a source in practice? We start with symbol-coding, which gives the answer in some restricted circumstances.
Chapters 2, 5.
Slides, 4-up, 9-up.
Practical compression 2: stream codes. Probabilistic and Human prediction. Arithmetic compression. Dasher.
We move onto state-of-the-art compression. How can we overcome all(!) the limitations of the symbol codes we have considered so far: use more powerful models and group large numbers of symbols together.
Chapter 6.
Slides, 4-up, 9-up (Updated on 2014-10-14).
Modelling and inference for stream codes: Beta and Dirichlet predictions. Models for text and images. PPM.
Information theory shows that compression and (Bayesian) probabilistic methods are intimately linked in theory. How can we build models for compression?
Chapters 3, 6.
Slides, 4-up, 9-up.
Communication channels and information: A framework for communication. Redundancy. Models of channels. Back channels. Information theoretic quantities.
Chapters 1, 2, 8, 9, 10.
Slides, 4-up, 9-up.
Noisy channel coding. Check digits. Block codes on extended channels; Hamming Codes. The noisy channel coding theorem. Erasure channels and digital fountain codes.
Chapters 1, 9, 10, 50.
Slides, 4-up, 9-up.
Theory review and sparse graph codes. Review of information theory proofs. Low-density-parity-check codes for the binary symmetric channel and belief propagation.
Chapters 10, 13, 26, 47.
Slides, 4-up, 9-up.
Sending less than the information content: rate distortion, hash codes.
If you can only send N/3 bits, can you do better than dropping 2N/3 bits of your file on the floor? How do web browsers alert you to bad sites without you needing a list of the bad sites, or telling a server which site you are visiting? What are some active research topics?
Chapter 10, 12. Slides, 4-up, 9-up.

All the slides/notes in a single PDF: 1-up, 4-up, 9-up.
In case you want to merge some slides yourself: pdftk can merge the slides, then optionally 4-up or 9-up.

This page maintained by Iain Murray.
Last updated: 2017/05/03 08:49:00

Home : Teaching : Courses : It