Lecture 12, Tuesday w7, 2014-10-28
==================================

We reviewed discrete memoryless channels and discussed the many different conditional, marginal and joint probabilities and entropies. The definitions are all summarized at the end of the week 6 slides. If you draw the block diagram, you can read off the three expressions for mutual information. Some grunt work: you should know all of these definitions and how they fit together.

**The capacity:** the maximum possible mutual information for a channel, achieved by the *optimal input distribution*.

The mutual information is non-negative:

* Proof: compare $P(x,y)$ and the independent distribution $P(x)P(y)$ with the KL divergence. The result drops out by Gibbs' inequality.

* Implication: observing data $y$, *on average*, cannot increase our uncertainty about any other quantity.

Check your progress
-------------------

We just started the 'week 7' slides. Mark anything that's unclear or that needs expanding on NB. You can also review all of the quantities for dependent variables in Chapter 8 of MacKay. (We won't use the three-term conditional mutual information $I(X;Y|Z)$ in this course.) There are exercises to check your understanding.
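
For reference, the quantities mentioned above fit together as follows. These are standard identities (as in MacKay, Chapter 8), written here in the usual notation rather than copied from the slides:

$$
I(X;Y) \;=\; H(X) - H(X|Y) \;=\; H(Y) - H(Y|X) \;=\; H(X) + H(Y) - H(X,Y),
$$

$$
C \;=\; \max_{P(x)} I(X;Y),
$$

$$
I(X;Y) \;=\; D_{\mathrm{KL}}\!\big(P(x,y)\,\|\,P(x)P(y)\big) \;=\; \sum_{x,y} P(x,y)\,\log\frac{P(x,y)}{P(x)\,P(y)} \;\ge\; 0,
$$

with equality if and only if $X$ and $Y$ are independent, by Gibbs' inequality.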
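
As a further self-check, here is a minimal numerical sketch (not part of the course materials) that evaluates the mutual information of a binary symmetric channel from its joint distribution. The names `entropy`, `mutual_information`, and the parameter choices `f`, `p1` are illustrative assumptions, not anything defined in the lecture.

```python
import numpy as np

def entropy(p):
    """Entropy in bits of a distribution given as an array of probabilities."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mutual_information(P_xy):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for a joint distribution P_xy[x, y]."""
    H_x = entropy(P_xy.sum(axis=1))
    H_y = entropy(P_xy.sum(axis=0))
    H_xy = entropy(P_xy.ravel())
    return H_x + H_y - H_xy

f, p1 = 0.1, 0.5                        # flip probability, P(x=1)
P_x = np.array([1 - p1, p1])            # input distribution
Q = np.array([[1 - f, f], [f, 1 - f]])  # channel matrix Q[x, y] = P(y|x)
P_xy = P_x[:, None] * Q                 # joint distribution P(x, y) = P(x) P(y|x)

print(mutual_information(P_xy))         # ~0.531 bits; non-negative, as above
# For the binary symmetric channel the optimal input is uniform, so with
# p1 = 0.5 this value is also the capacity: C = 1 - H_2(f).
```

Varying `p1` away from 0.5 and watching the mutual information drop is one way to see what "achieved by the optimal input distribution" means in the capacity definition.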