Information theory, Autumn 2011

Information theory describes the fundamental limits on our ability to store, process and communicate data, whether in natural or artificial systems. Understanding and approaching these limits is important in a wide variety of topics in informatics, including compression, communication and learning from data.

Lectures: Tuesdays and Fridays both at 14:00–14:50 Room 4.01, David Hume Tower. [map1map2]
Lecturer: Iain Murray
Tutorials: start week 3. Question sheets and other information.

Information for PhD students wishing to audit the course.

News

I email the whole class occasionally for important announcements. To get more regular updates, subscribe to this page’s RSS feed.

2011-11-21: Notes for final lecture (22 November) are below. If you haven’t already, please let me know what you thought of the course. (Hand in the form to the ITO, or fill in their online form.)

2011-11-08: Exam information updated.

2011-11-07: Final Tutorial Sheet 6 posted. (And notes updated.)

2011-11-02: Assignment posted. Due November 24th, 4pm. Start now!

Feedback

You can drop by my office IF 1.46 during office hour: Mondays 11am–noon. If you can’t make this time, email me to ask for another.

If you feel uncomfortable sending constructive criticism directly to the lecturer (although really you can), you can send anonymous comments via the class representative. For 2011 our generous volunteer is: Balazs Szigeti, s1151503 at the usual sms.ed.ac.uk.

If you would like detailed, personalized feedback on written answers to any questions from the tutorials or text book, you can give them to me at any time.

At the end of the course, please fill out the ITO course questionnaire. BUT don't leave it until then to give me your feedback! Please email about any problems as soon as they arise.

Coursework and Assessment

Obtain and do the tutorial questions before each tutorial. The first tutorial is in week 3. Unlike the assignment, tutorial work is not assessed. You may, and are encouraged to, discuss the tutorial work with your class-mates.

The assessed assignment was made available on 2 November and is due Thursday 24 November, 4pm.

Answering some of the assignment questions will require some programming in the language of your choice, the code itself will not be assessed. Programming issues special to this course will be discussed in lectures and tutorials, but you will not be taught a programming language. You must be prepared to write code on your own to take this course.

Informatics makes some past exam papers available (accessible from within the University). As there were no past papers last year, I posted a mock exam.

Reading

The recommended course text is: Information Theory, Inference, and Learning Algorithms, David MacKay, CUP 2003. Available for purchase in hard-copy or freely downloadable as a PDF. Some of the lecture notes make cross-references to this text, and some tutorial exercises will come from it.

Those who like to absorb material through a series of theorems and lemmas may prefer Elements of Information Theory (Second Ed.), Cover and Thomas, Wiley 2006. Purchasing this book is not required.

You will need to know the mathematics in sections 1 and 3 of my cribsheet (2-up), which I made for another course. And have a thorough understanding of logarithms!

Shannon’s seminal paper and UCSD’s video on the legacy of Claude Shannon.

Some programming notes relevant to the course.

Course notes and outline

More information on the course and its motivation can be found in the official course catalogue entry and the original case for support.

Each item below corresponds roughly to a week of lectures. As the course progresses, this list may change slightly. The relevant chapters refer to the recommended course text.

The lecture notes are slide sets with some additional denser notes. In the lectures I will present much of the material on the whiteboard instead of using the slides. There are “4-up” and “9-up” contact sheets of the slides to save paper when printing. If you want to format these differently you could tweak the make4up/make9up shell scripts that made them (or try the GUI print settings in your PDF viewer).

Where updated slides are not available yet, last year’s page will give you an idea of what will be covered.

All slides/notes in a single PDF: 1-up, 4-up, 9-up.

  1. Introduction to Information Theory: digital communication, error correcting codes, representing data, compression, counting large sets.
    The magnetic media inside hard-disks frequently corrupts bits. How is reliable storage possible? How should we go about making the most efficient use of the bits available?
    Related: Chapter 1. Starting to look at material from chapters 2 and 4.
    Slides, 4-up, 9-up.
  2. Measuring and representing information. Probabilities over many variables. The typical set. Information content. Entropy.
    What is “information”? Quantifying information and knowing its mathematical properties should tell us something about the resources we need to store it and move it around, and what can be achieved with the information we have.
    Chapters 2, 4.
    Slides, 4-up, 9-up.

  3. Practical compression 1: symbol codes Symbol codes. Huffman codes.
    How can we compress (close to) the entropy of a source in practice? We start with symbol-coding, which gives the answer in some restricted circumstances.
    Chapters 2, 5.
    Slides, 4-up, 9-up.

  4. Practical compression 2: stream codes. Probabilistic and Human prediction. Arithmetic compression. Dasher.
    We move onto state-of-the-art compression. How can we overcome all(!) the limitations of the symbol codes we have considered so far: use more powerful models and group large numbers of symbols together.
    Chapter 6.
    Slides, 4-up, 9-up.

  5. Modelling and inference for stream codes: Beta and Dirichlet predictions. Models for text and images. PPM.
    Information theory shows that compression and (Bayesian) probabilistic methods are intimately linked in theory. How can we build models for compression?
    Chapters 3, 6.
    Slides, 4-up, 9-up.

  6. Communication channels and information: A framework for communication. Redundancy. Models of channels. Back channels. Information theoretic quantities.
    Chapters 1, 2, 8, 9, 10.
    Slides, 4-up, 9-up.

  7. Noisy channel coding. Check digits. Block codes on extended channels; Hamming Codes. The noisy channel coding theorem. Erasure channels and digital fountain codes.
    Chapters 1, 9, 10, 50.
    Slides, 4-up, 9-up.

    The second half of these notes were actually covered in week 8. And the ‘week 8’ notes slipped into week 9.

  8. Theory review and sparse graph codes. Review of information theory proofs. Low-density-parity-check codes for the binary symmetric channel and belief propagation.
    Chapters 10, 13, 26, 47.
    Slides, 4-up, 9-up.

  9. Sending less than the information content: rate distortion, hash codes.
    If you can only send N/3 bits, can you do better than dropping 2N/3 bits of your file on the floor? How do web browsers alert you to bad sites without you needing a list of the bad sites, or telling a server which site you are visiting? What are some active research topics?
    Chapter 10, 12. Slides, 4-up, 9-up.


This page maintained by Iain Murray.
Last updated: 2012/03/21 17:33:06


Home : Teaching : Courses 

Informatics Forum, 10 Crichton Street, Edinburgh, EH8 9AB, Scotland, UK
Tel: +44 131 651 5661, Fax: +44 131 651 1426, E-mail: school-office@inf.ed.ac.uk
Please contact our webadmin with any comments or corrections.
Unless explicitly stated otherwise, all material is copyright © The University of Edinburgh