Information theory: lecture log

Lecture 11, Friday w6, 2014-10-24

Finished off source coding loose end:

Predictions can’t just count how often things happened in a context.

Small contexts: model’s predictions will be too vague.
Large contexts: won’t have enough data to get good counts. No matter how much data we have, it makes sense to consider contexts of multiple sizes.

Prediction by Partial Match (PPM). One simple way (frequently combined with arithmetic coding) to back off between contexts.

Started noisy channel coding:

Communication: compress (remove redundancy), encode so robust to noise (add redundancy again!), send over channel, decode, decompress.

Discrete memoryless channels:

Discrete: finite number of possible input and output symbols
Memoryless: nothing that happens to the channel affects its properties (unlike a neuron that can’t fire any more after sustained activity, or a disk that flips more bits if too many 1’s are stored in a row).
Can have different input and output alphabets (including different sizes).
Synchronized: we know the sequence of every received symbol. None are deleted, removed without knowing we lost a symbol, no extra symbols are inserted. The order is maintained.

Example channels: BSC, BEC, Z.

Channel defines Q matrix of transition probabilities. We don’t usually get to choose that. We get to set vector of input probabilities: p_x.

Q and p_x specify whole joint distribution of x and y. Marginal distribution of output is p_y = Qp_x.

Joint entropy of channel H(X, Y) < H(X) + H(Y). Equality would mean channel is of zero use!

The mutual information I(X; Y) = H(X) + H(Y) − H(X, Y) tells us how much information is shared between the input and output.

Check your progress

Why do we compress (remove redundancy) if we’re just going to add redundancy again?

What’s the difference between a deletion channel (which is hard to deal with), and an erasure channel (which is fairly easy)?

Why is p_y = Qp_x?

We’re up to around the 13th slide in the ‘week 6’ slides. Mark anything that’s unclear or that needs expanding on NB.

This lecture covered around pp146–148 of the textbook.

Extra reading

If you’d like to know how compressors like gzip work (non-examinable) read MacKay p119, but also do Exercise 6.13 and then read its answer.

If you want to read ahead, look at Chapter 8 and the rest of Chapter 9.

This page maintained by Iain Murray.
Last updated: 2014/10/27 10:29:41

Informatics Forum, 10 Crichton Street, Edinburgh, EH8 9AB, Scotland, UK
Tel: +44 131 651 5661, Fax: +44 131 651 1426, E-mail: school-office@inf.ed.ac.uk
Please contact our webadmin with any comments or corrections. Logging and Cookies
Unless explicitly stated otherwise, all material is copyright © The University of Edinburgh