CS4/MSc Distributed Systems
Important
All students should have received their marks for the coursework
by now.
Basic Information
This module runs in the first semester, on Monday and Thursday at
15:10 - 16:00, taught by Allan Clark.
Lectures will take place in the Lecture Theatre in the Hugh Robson
Building.
Lecture Slides
Below are the slides for each part of the course individually, but
you can save yourself some time and download them all together as a
single file by clicking HERE.
- Course Information
- Introductory Material
- Fundamental Concepts
- Time and Global States
- Coordination and Agreement
- Distribution and Operating Systems
- Peer-to-Peer Systems
- Security
- Summary
Related Reading
- Interesting blog post on debugging a distributed system
- Peerson, a peer-to-peer social networking site
- Breaking a poker site (from 1999 but still relevant)
- Article on encryption and cloud computing
Clarifications
Any student queries which I believe will be of general use to the class
I will place here (with perhaps a useful answer). If you have a question
of your own, email me at a.d.clark@ edinburgh, academic, uk (ed.ac.uk).
Course Description
A distributed system is broadly categorised as a collection or network
of loosely coupled, autonomous computers that can communicate with
each other and execute logically separate computations, though these
may be related to concurrent computations on other nodes.
- The nodes are relatively loosely coupled.
- Each node is a self-contained autonomous computer with its own
peripherals.
- The system can survive various categories of node and network
failures.
- The nodes may execute logically separate computations, though
these may be related to concurrent computations on other nodes.
- The system is asynchronous.
Distributed systems have become pervasive (many applications now
require the cooperation of two or more computers), yet the design and
implementation of such systems remain challenging and complex tasks.
Difficulties arise from the concurrency of components, the lack of a
global clock and the possibility of independent failure of components.
Moreover, designs must aim to provide interoperability, transparency
and autonomy.
The emphasis of this module is on gaining understanding of the
principles and concepts that are used to design distributed systems
and experience of software platforms which underpin their development.
Coursework
- The coursework assignment for Level 10 students is here
- The coursework assignment for Level 11 students is here
The lecture slides used to assign the coursework are available here
- There will be a coursework assignment given out for both level 10 and
level 11 students on Monday October 8th.
- The deadline for level 10 students will be 4pm Thursday November 8th.
- The deadline for level 11 students will be 4pm Thursday November 22nd.
Submission
Submission will be made using the submit command:
The <filename> can be a directory or a single file.
Marks and feedback for your coursework are intended to be returned within 3
weeks of the submission deadline:
- For level 10 students: November 29th
- For level 11 students: December 13th
Clarifications
Q. I have found a topology in which, after a link failure, the
algorithm does not converge to the optimal routes.
A. That is fine. It is a known problem and to be expected, so write
up your findings in your submission.
Q. I can even get the algorithm itself to loop indefinitely.
A. Excellent, write up what you have found in your submission.
Q. In the handout you say that we can avoid doing the text
parsing and you give the following syntax for inputting
nodes
input_nodes.add(new InputNode("p1", {1}));
but that doesn't compile with Java.
A. Yes, sorry about that. You have to declare a separate
variable instead, like this:
int addresses_p4[] = {4,5};
input_nodes.add(new InputNode("p4", addresses_p4));
If you wish you can also put a complaint about Java's syntax
in the comments.
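As an aside, Java does accept an inline array if it is written as an explicit array creation expression, `new int[]{...}`, so the separate variable is not strictly necessary. The sketch below assumes a hypothetical `InputNode` class with a `(String, int[])` constructor, standing in for the one in the handout, whose actual definition is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;

public class InputNodeExample {
    // Hypothetical stand-in for the handout's InputNode class.
    static class InputNode {
        final String name;
        final int[] addresses;

        InputNode(String name, int[] addresses) {
            this.name = name;
            this.addresses = addresses;
        }
    }

    static List<InputNode> buildNodes() {
        List<InputNode> input_nodes = new ArrayList<>();
        // A bare {1} is only legal as the initialiser in a declaration;
        // anywhere else the array type must be spelled out.
        input_nodes.add(new InputNode("p1", new int[]{1}));
        input_nodes.add(new InputNode("p4", new int[]{4, 5}));
        return input_nodes;
    }

    public static void main(String[] args) {
        for (InputNode n : buildNodes()) {
            System.out.println(n.name + " has " + n.addresses.length + " address(es)");
        }
    }
}
```

Either form is acceptable for the coursework; the inline form simply avoids naming a variable for each node.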
Q. The pseudocode declares that if a node receives a table with
some information about an address for which it already has
a row, but the new information is determined to be
"better", then the new information should be
placed in the table. Should it keep the old information?
A. No, the pseudocode should probably have read
"replaced". To be clear, there should only ever
be at most one entry for each address in any one routing
information table at any one time.
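One way to guarantee this is to key the table by address, so that storing a better route necessarily overwrites the old one. The following is only a sketch: the `Route` and `RoutingTable` names, and the assumption that "better" means a strictly lower cost, are illustrative and are not taken from the handout's pseudocode.

```java
import java.util.HashMap;
import java.util.Map;

public class RoutingTable {
    // Hypothetical row: the link to forward on and the cost of the route.
    static class Route {
        final String link;
        final int cost;

        Route(String link, int cost) {
            this.link = link;
            this.cost = cost;
        }
    }

    // Keyed by address, so there is at most one row per address.
    private final Map<Integer, Route> rows = new HashMap<>();

    // Returns true if the table changed. "Better" is assumed here to
    // mean strictly lower cost; substitute the coursework's own rule.
    boolean offer(int address, Route candidate) {
        Route current = rows.get(address);
        if (current == null || candidate.cost < current.cost) {
            rows.put(address, candidate); // replaces, never appends
            return true;
        }
        return false;
    }

    Route lookup(int address) {
        return rows.get(address);
    }

    int size() {
        return rows.size();
    }

    public static void main(String[] args) {
        RoutingTable table = new RoutingTable();
        table.offer(4, new Route("p1-p2", 3));
        table.offer(4, new Route("p1-p3", 2)); // better: replaces the row
        table.offer(4, new Route("p1-p5", 7)); // worse: ignored
        System.out.println(table.size() + " row(s), cost " + table.lookup(4).cost);
    }
}
```

Because the map holds one value per key, the old row cannot survive a successful update, whatever comparison rule is used.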
Q. Can I re-submit using the submit command?
A. Yes. At a terminal, type man submit for more information on the
submit command. The relevant section states:
Should you find on reflection that you wish to amend your
answers to an exercise, you may resubmit any files or
directories you have already submitted.
The original submitted file or directory will be overwritten.
Note that if you wish to change a file within a submitted
directory, you must resubmit the whole directory.
When resubmitting, you will be asked to confirm that you
wish to overwrite your original submission.
Q. Can I assume that the order of the input commands will
always be all the 'node' commands, followed by all the 'link'
commands followed by the 'send' commands?
A. Yes, this practical is not intended to test your input file
parsing abilities so I've tried to make it as simple as
possible.
Q. Can I always assume that the input files will be valid?
A. Yes, in particular you can assume that any 'send' or 'link'
commands will only reference node names which have been
previously defined by a 'node' command.
Q. Can I use multiple source files?
A. Yes, that's absolutely fine.
Grading
The final exam counts for 75%, and the
coursework (and project at level 11) counts for 25%.
Exams
Note that there has been a substantial revision of
the module in the academic year 2001/2, and a minor one in
2002/3. Nevertheless you should find some questions on past exam
papers useful for revision purposes.
Course Texts
This year there is no required textbook. The course, however, largely
follows portions of the textbook:
George Coulouris, Jean Dollimore and Tim Kindberg,
Distributed Systems: Concepts and Design
5th Edition
There is also a largely relevant
4th Edition
which you may find cheaper.
Our part two "Fundamentals" will comprise a brief look
at several chapters: Fundamental Models, Networking, Interprocess
Communication.
The remaining parts will stick mostly with the chapters of the book:
Time and Global State, Coordination and Agreement,
Operating System Support, Peer-to-Peer and Security.
Though, as with any course, there will be parts of the course not in the book
and vice versa.
Other related texts
Include:
- Andrew S. Tanenbaum and Maarten Van Steen, Distributed
Systems: Principles and Paradigms, Prentice Hall, September
2001 (web site).
- Nancy A. Lynch, Distributed Algorithms, Morgan Kaufmann, 1996.
- Andrew S. Tanenbaum, Computer Networks, 3rd ed., Prentice Hall, 1996.
- R. Chow and T. Johnson, Distributed Operating Systems and Algorithms,
Addison-Wesley, 1997.
These pages will be updated regularly as the course progresses, in
particular with a copy of the lecture slides.
Last Updated: 3rd December 2012 -- Allan Clark