Main lecture points:
Get familiar with information contents:
How do you convert between information contents in bits, nats, or bans?
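Converting between units is just a change of logarithm base: an information content is log(1/p) in some base, so bits, nats, and bans differ by a constant factor. A minimal sketch (the helper name `convert_info` is mine, for illustration):

```python
import math

def convert_info(value, frm, to):
    """Convert an information content between 'bits', 'nats', and 'bans'.

    Change-of-base rule: log_b(x) = log(x) / log(b), so converting
    units rescales by the ratio log(base_from) / log(base_to).
    """
    base = {'bits': 2.0, 'nats': math.e, 'bans': 10.0}
    return value * math.log(base[frm]) / math.log(base[to])

# For example: 1 nat = 1/ln(2) ≈ 1.443 bits; 1 ban = log2(10) ≈ 3.322 bits.
```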
The submarine game (MacKay, p71): can you show that if you keep asking questions until you locate the submarine, the total Shannon information content you receive is always 6 bits? MacKay shows this on p72.
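One way to convince yourself is to simulate the game: with a uniform prior over 64 squares, ask about squares one at a time and sum the information content −log₂(p) of each answer. The "no" answers telescope with the final "yes" so the total is always log₂(64) = 6 bits. A sketch (function name and setup are mine, not MacKay's code):

```python
import math
import random

def submarine_game(n_squares=64, seed=0):
    """Ask 'is it square i?' until the submarine is found, summing the
    Shannon information content -log2(p) of each answer, where p is the
    probability of that answer under a uniform prior over the squares
    not yet ruled out.
    """
    rng = random.Random(seed)
    target = rng.randrange(n_squares)
    remaining = n_squares
    total_bits = 0.0
    for guess in range(n_squares):
        if guess == target:
            # "yes" has probability 1/remaining
            total_bits += -math.log2(1.0 / remaining)
            break
        # "no" has probability (remaining - 1)/remaining
        total_bits += -math.log2((remaining - 1) / remaining)
        remaining -= 1
    return total_bits  # always log2(n_squares): 6 bits for 64 squares
```

Whichever square the submarine is hiding in, the per-answer contributions sum to log₂(64): each "no" is worth little, but the final "yes" is worth a lot when few squares remain.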
Applying the law of large numbers, CLT, and/or Chebyshev’s:
Imagine you have a machine learning system that makes a real-valued prediction (e.g., temperature). You measure the absolute error made on each case in a large test set of size N, and compute the mean absolute error m̂. This estimator m̂ is a random variable: it depends on the particular test set that you gathered. If you gathered a new test set, you’d get a different estimate. What can you say about how m̂ is distributed (and under what assumptions)? It may be useful to talk about the true mean absolute error m, and its variance σ², which you might also have to estimate.
That is: do you know how to put a standard error bar on an estimate and know what that means? If you do any experimental work (including numerical experiments) in your project, you’ll probably want to put error bars on some estimates.
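The standard recipe, assuming the test cases are i.i.d.: by the CLT, for large N the sample mean m̂ is approximately Gaussian around the true mean m with standard deviation σ/√N, and we plug in the sample standard deviation as an estimate of σ. A minimal sketch (the function name is mine):

```python
import numpy as np

def mean_with_stderr(abs_errors):
    """Return the mean absolute error and a standard error bar on it.

    Assumes the per-case errors are i.i.d. draws; then the sample mean
    is approximately N(m, sigma^2/N) for large N, so the error bar on
    the estimate is sigma_hat / sqrt(N).
    """
    x = np.asarray(abs_errors, dtype=float)
    N = x.size
    m_hat = x.mean()
    sigma_hat = x.std(ddof=1)        # unbiased sample standard deviation
    stderr = sigma_hat / np.sqrt(N)  # standard error of the mean
    return m_hat, stderr
```

You would then report m̂ ± stderr; under the Gaussian approximation, roughly 68% of such intervals cover the true mean m.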
We are part way through the ‘week 2’ slides. MacKay pp66–73 gives more detail for the intuitions behind information content.
Ask on NB if anything is unclear, or too compressed.
Again, it’s a good idea to try reproducing plots from the slides. Plotting graphs is a useful research skill, and having to implement something really tests whether you know where a plot came from.
This page maintained by Iain Murray.
Last updated: 2014/09/24 11:57:58