, Rm 1.44 Informatics Forum. <email@example.com>, Gabriele Schweikert
Daniel Trejo-Banos D.Trejo-Banos@sms.ed.ac.uk
Last Update: 07 Mar 2013
Where possible I shall post lecture notes the evening after the lecture, so that (inevitable) mistakes spotted in the class can be corrected. Please do not email me immediately after the class asking for the lecture slides.
Additional reading will be posted before or after the lecture as appropriate.
2012 notes are available for reference. Note, however, that the course content has changed.
Lectures are held in Room 1.2, 22 Buccleuch Place at 11.10am until 1pm. Tutorials will be at 9am on 30/01 and 13/03, room TBC.
This lecture gives a quick introduction to the field of bioinformatics, and a review of the basic rules of probability. These are foundational building blocks needed to understand the more advanced material discussed later, which in turn underpins many current bioinformatics algorithms. We then proceed to introduce the main ideas of statistical testing.
For further material on basic probability and distributions, you may find the following texts useful: Grimmett and Stirzaker, Probability and Random Processes (OUP); C.M. Bishop, Pattern Recognition and Machine Learning (Springer), particularly section 1.2. You should be able to find these books in the library and you may be able to find pdfs for them on the web. A great book which is available on the web is Information Theory, Inference, and Learning Algorithms by David MacKay. Chapter 23 contains a review of useful distributions, including the ones mentioned in the lecture. There are countless books on testing, but to be fair Wikipedia is an excellent source.
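As a small illustration of the kind of statistical test this lecture introduces, the sketch below computes a one-sided exact binomial p-value using only the Python standard library. The function name `binomial_tail` and the coin-flip numbers are made up for illustration and are not taken from the lecture notes.

```python
from math import comb


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical example: 9 successes in 10 trials under the null p = 0.5.
p_value = binomial_tail(9, 10, 0.5)
print(f"one-sided p-value: {p_value:.4f}")
```

A small p-value here means that a result at least this extreme would be unlikely if the null hypothesis (a fair coin) were true.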
This lecture introduces dynamical systems and how they can be useful in modelling certain situations in biology. A good general reference for dynamical systems and much of machine learning is David Barber's new book, available here (this is much more detailed than we need). A very good discussion of HMMs in sequence analysis problems is given in Durbin et al.'s book, Biological sequence analysis (on the reading list).
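To give a flavour of how an HMM is used in sequence analysis, here is a minimal sketch of Viterbi decoding for a toy two-state model. Everything in it is an assumption made for illustration: the state names `H` (GC-rich) and `L` (AT-rich) and all of the probabilities are invented, not taken from the lecture or from Durbin et al.

```python
from math import log

# Toy model (all numbers are illustrative assumptions).
STATES = ("H", "L")
START = {"H": 0.5, "L": 0.5}
TRANS = {"H": {"H": 0.9, "L": 0.1}, "L": {"H": 0.1, "L": 0.9}}
EMIT = {
    "H": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    "L": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
}


def viterbi(obs, states, start, trans, emit):
    """Return the most probable state path for a non-empty sequence obs."""
    # score[s] = best log-probability of any path ending in state s
    score = {s: log(start[s]) + log(emit[s][obs[0]]) for s in states}
    back = []  # back[t][s] = best predecessor of state s at position t + 1
    for symbol in obs[1:]:
        prev = score
        score, pointers = {}, {}
        for s in states:
            best = max(states, key=lambda r: prev[r] + log(trans[r][s]))
            score[s] = prev[best] + log(trans[best][s]) + log(emit[s][symbol])
            pointers[s] = best
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return "".join(reversed(path))


print(viterbi("GCGCATAT", STATES, START, TRANS, EMIT))
```

Working in log space avoids numerical underflow on long sequences, which matters for genome-scale inputs.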
Tutorial 1, 9am, AT5.08. Questions
This lecture gives a number of example applications for next generation sequencing technologies. It introduces the experimental method, including library preparation, (Illumina) sequencing and basic analysis steps.
We will now focus on a special application of NGS: ChIP-Seq, which is used to profile protein-DNA interaction. Important analysis steps include peak calling, normalization, input correction and differential binding analysis.
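The analysis steps named above can be sketched on toy data. The example below is a simplified illustration, not the method of any particular peak caller: it scales the input (control) track to the ChIP library size, uses the scaled input count (floored at the mean ChIP count) as a Poisson background rate for each window, and reports windows whose ChIP count is unexpectedly high. The function names, the window counts and the threshold `alpha` are all assumptions.

```python
from math import exp


def poisson_sf(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam) by summing the lower tail."""
    term = exp(-lam)  # P(X = 0)
    cdf = 0.0
    for i in range(k):
        cdf += term
        term *= lam / (i + 1)  # P(X = i + 1) from P(X = i)
    return max(0.0, 1.0 - cdf)


def call_peaks(chip, control, alpha=1e-3):
    """Return indices of windows enriched in chip relative to control."""
    scale = sum(chip) / sum(control)  # library-size normalization
    floor = sum(chip) / len(chip)     # mean ChIP count as a minimum rate
    peaks = []
    for i, (c, b) in enumerate(zip(chip, control)):
        lam = max(scale * b, floor)   # input-corrected background rate
        if poisson_sf(c, lam) < alpha:
            peaks.append(i)
    return peaks


# Hypothetical read counts per genomic window.
chip_counts = [2, 3, 2, 30, 3, 2, 1, 2]
input_counts = [2, 2, 3, 3, 2, 2, 2, 2]
print(call_peaks(chip_counts, input_counts))
```

Real tools additionally model fragment length, local background at several scales and multiple-testing correction; this sketch only shows the core idea of testing each window against an input-derived background.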
Lab Session (13 Feb 2013) Introduction to NGS analysis tools, Daniel Trejo Banos AT4.12. Instructions
Innovative learning week 18-25 Feb
As part of the innovations, you get no teaching this week...
Guest Lecture (27 Feb 2013), Prof Mark Blaxter, School of Biological Sciences, Genome Sequencing
Martin and Wang, Next-generation transcriptome assembly,
Nature Reviews Genetics, 2011
TopHat (spliced alignment):
Trapnell et al., Bioinformatics, 2009
Cufflinks (ab initio transcriptome assembly and quantification):
Trapnell et al., Nature Biotechnology, 2010
DESeq (Differential Expression Analysis):
Anders and Huber, Genome Biology, 2010
Standards and Guidelines
Landscape of transcription in human cells
Tutorial 2 (13 Mar 2013, 9am, AT5.08) Questions
Guest Lecture (13 Mar 2013) Guest Lecture from Dr Grzegorz Kudla, MRC Human Genetics Unit, Western General Hospital
Guest Lecture (20 Mar 2012) Guest Lecture from Dr Chris Larminie, GSK
This lecture introduced the uses of bioinformatic tools in drug discovery within an industrial R&D environment.
Aims and Objectives
Bioinformatics is at the interface between two of the most influential
scientific fields. An appreciation of computational and biological sciences,
in particular the terminology employed in both fields, is essential for
those working at such an interface. In this course, we aim to cover the following:
1. The concepts of computer science that relate to problems in biological sciences.
2. Commercial and academic perspectives on bioinformatics.
3. The impact of bioinformatics on the methodologies used in biological sciences.
4. The influence biological science has on computing science.
An undergraduate degree in computing or mathematical
sciences will be useful, particularly some exposure to machine learning.
However, the course would also be suitable for an individual from the biological
sciences with some programming experience and some background in basic
statistics/machine learning.
This course will be assessed on assigned coursework and an exam (30/70 split).
The coursework for this year will be released on 22/02 and is due back on 08/03 at 4pm. You can generally expect feedback within two weeks. Please carefully read the School's regulations regarding late coursework and academic conduct. Solutions must be submitted to the ITO following the usual Informatics procedure.
Jones N.C. and Pevzner P. (2004) An Introduction to Bioinformatics Algorithms, MIT Press
Durbin R., Eddy S., Krogh A. and Mitchison G. (1998) Biological sequence
analysis: Probabilistic models of proteins and nucleic acids. Cambridge
University Press. ISBN 0-521-62971-3.
Baldi P. and Brunak S. (2001) Bioinformatics: The Machine Learning approach. MIT Press
For more details on pattern recognition and machine learning
Bishop C.M. (2006) Pattern Recognition and Machine Learning, Springer.
Duda R.O., Hart P.E. and Stork D.G. (2000) Pattern Classification, Wiley Interscience.
A good textbook for Molecular Biology is:
Alberts B. (2002).
Molecular Biology of the Cell
An introductory guide to programming in Perl for bioinformatics problems is:
Tisdall J. (2001)
Beginning Perl for Bioinformatics
Examples from the literature will be used throughout the lectures.
Biological Data Analysis Start Points:
Programming Tools and Libraries
BioJava under DICE: