This page is for code snippets and programming notes that may be helpful for the Informatics Information Theory course.

Minimal implementations sufficient to help answer the tutorial questions.

Matlab/Octave implementation: huffman.m

Python implementation: huffman.py

Haskell implementation: huffman.hs

Theo Varvadoukas, a student in the 2012–13 class, coded up the Sardinas–Patterson algorithm in Python: sardinas_patterson.py.

This code allows you to quickly check if a symbol code is uniquely decodable, whether or not it is a prefix code. The Kraft–McMillan inequality can rule out a code, but this algorithm actually checks it. An alternative to checking non-prefix codes is to only use prefix codes: if the lengths of the codewords satisfy the Kraft–McMillan inequality, then why not construct a prefix code with those lengths?

The log-Gamma function can be useful for evaluating the log of large
factorials (ln(*x*!)=gammaln(*x*+1)). For many purposes in this course
you can approximate the factorials you encounter, which may give more insight
than evaluating them. However, comparing to a numerically ‘exact’
answer can be useful too. Here are some implementations that I’m aware of:

**C:** `lgamma` is provided in `math.h` in C99 and
POSIX.1-2001 versions of C.

**C:** or use `gammaln` in `util.c` in
lightspeed
(older
version with more liberal license)

**C:** or use `logfactorial` / `logGamma` in `random.c` in
BayeSys

**Matlab/Octave:** `gammaln`

**Python+numpy+scipy:** `scipy.special.gammaln`

**R:** `lgamma`

There is probably a function somewhere for your favourite programming language. If not, you could port or call one of the C implementations.

If you need to compute log gamma differences of very close numbers, see this blog post.

Most programming languages come with a uniform random number generator that provides samples uniformly between 0 and 1. Uniform real-valued samples are often used as a basis for sampling from non-uniform distributions, although MacKay (p118) argues for a using raw bits in an arithmetic encoder instead.

**C:** The C standard library function `rand` provides a sampler for integers
over a large range. The following two lines provide a `u01rand` ‘function’
for approximately generating Uniform[0,1] samples:

#include <stdlib.h> #define u01rand() ((double)rand()/RAND_MAX)

*There are known problems with the standard C generator though, and it should be avoided in
serious scientific work.* Really, I have actually been bitten by the C
random number generator, it *is* dangerous! A better generator and
several standard distributions are provided by the `random.{h,c}` files
in BayeSys,
which you could drop into your project. The GSL
also provides random number routines, amongst many other things.

**Other languages (recommended for exploratory code):** Matlab
provides `rand` and `randn` for uniform and Gaussian variates. Its
statistics toolbox, for which you might not have a license, provides various other
generators. If you use Octave (or R, or scipy, ...) instead, many of the same
functions are available for free.

Being able to quickly draw graphs is an important skill that can help with the understanding of new topics. Many languages provide convenient plotting functionality, for example, Matlab/Octave, R and Python+matplotlib. If your favourite programming language does not, an alternative is to write out your data to text files and plot it using a package like gnuplot.

**Central limit theorem:** a question in
tutorial 1 was to add up 12 uniform[0,1] numbers many times, plot a
histogram of the results and compare to a central limit theorem prediction.
Here are example solutions in:
a) Matlab/Octave,
b) C with plotting done by gnuplot.

This page maintained by Iain Murray.

Last updated: 2013/10/26 12:50:41

Informatics Forum, 10 Crichton Street, Edinburgh, EH8 9AB, Scotland, UK
Tel: +44 131 651 5661, Fax: +44 131 651 1426, E-mail: school-office@inf.ed.ac.uk Please contact our webadmin with any comments or corrections. Logging and Cookies Unless explicitly stated otherwise, all material is copyright © The University of Edinburgh |