I then went through some architectural models which are intended to give you an idea of the different ways in which distributed systems can be organised. Although client/server and its variants still dominate as the common architecture for distributed systems, other schemes such as peer-to-peer are starting to appear. During the review of these architectures I again highlighted topics which we will return to later in the course.
At the end of the lecture I gave a brief summary of the networking concepts underlying message passing in distributed computations. The effect of message passing is to introduce a delay, which is the sum of the latency and the transmission delay. In general the more complex (and disparate) the topology of the route the message has to take, the greater the delay. I explained some of the devices and software components that a message may have to pass through, each of which will introduce an associated delay. Thus the latency within a single ethernet is typically less than 1 millisecond, whereas the latency between two arbitrary hosts on the Internet will be 300-600 milliseconds.
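The delay model above can be sketched as a small calculation. The link figures used below (100 Mbit/s bandwidth, 0.5 ms latency) are illustrative assumptions, not measurements.

```python
# Total one-way message delay = latency + transmission delay,
# where transmission delay = message size / link bandwidth.

def message_delay(size_bits, bandwidth_bps, latency_s):
    """Total one-way delay in seconds for a message over one link."""
    transmission = size_bits / bandwidth_bps
    return latency_s + transmission

# A 1 KB message over a 100 Mbit/s link with 0.5 ms latency:
delay = message_delay(8 * 1024, 100e6, 0.0005)
print(f"{delay * 1000:.3f} ms")  # 0.582 ms
```

Note that for small messages on a fast link the latency term dominates, which is why the route's topology matters more than raw bandwidth.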
I talked about the IP protocol, IP addresses and domain names, which are now almost universally used (the exceptions being WAP and specialised multimedia streaming protocols). I also discussed IP version 6 (IPv6) and Mobile IP, which offers support for mobile hosts.
I also discussed data marshalling and multicast. These features are often managed by middleware, rather than applications, as we will see in the next lecture. Nevertheless it is important to be aware of them.
The interface, which specifies the procedures/methods of the remote service, is used to generate these supporting elements, as well as providing the means for agreement between the client and server implementors on the acceptable forms of interaction.
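The role the interface plays can be sketched as follows. This is a hypothetical illustration of the idea, not any real IDL or generated code; the names (Calculator, add, the transport) are all assumptions.

```python
# Both the client-side stub (proxy) and the server implementation are
# derived from the same interface, so both sides agree on the form of
# interaction. The stub forwards each call over a transport, standing in
# for marshalling and message passing.

class Calculator:
    """The agreed interface: the remote service's methods."""
    def add(self, a: int, b: int) -> int: ...

class CalculatorStub(Calculator):
    """Client-side proxy: forwards the call through the transport."""
    def __init__(self, transport):
        self.transport = transport
    def add(self, a, b):
        return self.transport.invoke("add", (a, b))

class LocalTransport:
    """Stand-in transport that dispatches directly to a server object."""
    def __init__(self, servant):
        self.servant = servant
    def invoke(self, method, args):
        return getattr(self.servant, method)(*args)

class CalculatorImpl(Calculator):
    """Server-side implementation of the agreed interface."""
    def add(self, a, b):
        return a + b

stub = CalculatorStub(LocalTransport(CalculatorImpl()))
print(stub.add(2, 3))  # the client calls through the stub as if local
```

In a real system the stub and skeleton would be generated from the interface definition, and the transport would marshal arguments into messages.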
The development process I outlined today was for static invocation - I will (briefly) discuss dynamic invocation next week. At the end of the lecture I ran out of time for explaining the POA (portable object adaptor) but I'll cover this next time.
I also talked about DCOM's reference counting protocol which aims to address the problem of knowing when an object instance may be regarded as suitable for garbage collection. Finally I discussed more recent moves towards XML Web Services.
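The idea behind reference counting can be sketched briefly. This illustrates the general AddRef/Release discipline only, under the assumption of a single counter per object; it is not DCOM's actual wire protocol.

```python
# An object keeps a count of outstanding client references; it becomes
# a candidate for garbage collection once every client has released.

class RefCounted:
    def __init__(self):
        self.count = 0
        self.collectable = False
    def add_ref(self):
        self.count += 1
        return self.count
    def release(self):
        self.count -= 1
        if self.count == 0:
            self.collectable = True  # no clients remain: safe to collect
        return self.count

obj = RefCounted()
obj.add_ref()           # client A binds to the object
obj.add_ref()           # client B binds to the object
obj.release()           # client A finishes
obj.release()           # client B finishes; count reaches zero
print(obj.collectable)  # True
```

The distributed difficulty, of course, is that a client may crash without ever calling release, which is why DCOM supplements counting with pinging.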
For some applications definite timing is not important but rather relative timing and indications of causality. The happened-before relation captures quite an intuitive notion of causality with respect to local events and message sending and receiving. Logical clocks count the events on processes, each process adjusting the count on the basis of messages from other processes when a receive event occurs. Some problems of incomparability of events based on logical clocks are solved when vector logical clocks are used instead.
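The logical clock rules above can be sketched in a few lines. This is a minimal illustration of Lamport's scheme, assuming each message carries its sender's timestamp.

```python
# Each process counts its own events; on a receive it advances its
# clock past the timestamp carried on the message.

class LamportClock:
    def __init__(self):
        self.time = 0
    def local_event(self):
        self.time += 1
        return self.time
    def send(self):
        self.time += 1
        return self.time          # timestamp carried on the message
    def receive(self, msg_time):
        self.time = max(self.time, msg_time) + 1
        return self.time

p, q = LamportClock(), LamportClock()
t = p.send()           # p's clock: 1
q.local_event()        # q's clock: 1
q.receive(t)           # q's clock: max(1, 1) + 1 = 2
print(p.time, q.time)  # 1 2
```

A vector clock replaces the single counter with one entry per process, which is what removes the incomparability problem: two events are concurrent exactly when neither vector dominates the other.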
For any arbitrary process we can record a history as a sequence of local events and from these we can form a global history as the union of local histories. To view states at particular times we need to truncate the histories appropriately - form a cut. A cut is termed consistent if it respects the happened-before relation in the sense that every event that happened-before an event in the cut is also in the cut. Once a global state is defined we can go on to consider runs, linearization and properties on them.
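The consistency condition on cuts can be checked mechanically. In the sketch below the representation of a cut (one event count per process) and of messages as (sender, send index, receiver, receive index) tuples with 1-based indices is an assumption made for illustration.

```python
# A cut is consistent if every message received within the cut was also
# sent within the cut, i.e. no receive event appears without its send.

def is_consistent(cut, messages):
    for sender, send_i, receiver, recv_i in messages:
        if recv_i <= cut[receiver] and send_i > cut[sender]:
            return False  # receive is in the cut but its send is not
    return True

# Process 0 sends its 2nd event; process 1 receives it as its 1st event.
msgs = [(0, 2, 1, 1)]
print(is_consistent({0: 2, 1: 1}, msgs))  # True: send and receive both in
print(is_consistent({0: 1, 1: 1}, msgs))  # False: receive without its send
```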
In the latter half of the lecture I explained one algorithm for finding global states - Chandy and Lamport's algorithm for global snapshots. This algorithm does not use logical clocks and has the advantage that normal processing can continue whilst the snapshot is in progress. The algorithm makes use of marker messages which trigger processes to record their local state and separate messages within a channel into those before the cut and those after. The second algorithm was Marzullo and Neiger's algorithm for post hoc analysis of executions, such as used in distributed debugging. This does use vector clocks, to reconstruct globally consistent states for recorded histories from each process.
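The marker rule at the heart of the Chandy-Lamport algorithm can be sketched for a single incoming channel. This is a deliberately simplified illustration, not the full algorithm: it shows only how the marker separates channel messages into those belonging to the snapshot and those after it.

```python
# Once a process has recorded its local state it records every message
# arriving on a channel until the marker arrives on that channel; those
# recorded messages form the channel's state in the snapshot.

MARKER = "MARKER"

def record_channel(channel, recording_started):
    """Return the channel state: messages seen after recording starts
    and before the marker arrives on this (FIFO) channel."""
    channel_state = []
    for msg in channel:
        if msg == MARKER:
            break                      # marker: stop recording this channel
        if recording_started:
            channel_state.append(msg)  # in flight when the cut was taken
    return channel_state

# m1, m2 were in transit when the snapshot began; m3 follows the marker.
channel = ["m1", "m2", MARKER, "m3"]
print(record_channel(channel, recording_started=True))  # ['m1', 'm2']
```

FIFO channels are essential here: the marker cleanly divides each channel's traffic into pre-cut and post-cut messages.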
In the presentation of the algorithms we assumed that all messages were delivered in the correct order without duplications, corruptions or losses. We also assumed that all processes were reliable. At the end of the lecture I discussed the degree of fault tolerance supported by each algorithm.
In today's lecture I talked about reliable and unreliable failure detectors, which give judgements of unsuspected and failed, or unsuspected and suspected respectively. Assuming that failures can be detected I then talked about election algorithms, going through two algorithms: the ring-based algorithm and the more robust bully algorithm in some detail. At the end of the lecture I discussed the alternative approach of group formation, as exemplified by the invitation algorithm.
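The core of the bully algorithm can be sketched as follows. Liveness is modelled here by a simple set of alive process identifiers, which is an assumption for illustration; the real algorithm works through timeouts on election and answer messages.

```python
# A process that detects the coordinator's failure sends "election" to
# all higher-id processes. If none answers it declares itself
# coordinator; otherwise a higher process takes over the election, and
# the highest surviving id eventually wins.

def run_election(initiator, alive):
    higher = [p for p in alive if p > initiator]
    if not higher:
        return initiator            # no higher process answers: I win
    # Some higher process answers and restarts the election itself.
    return run_election(max(higher), alive)

alive = {1, 2, 4}                   # process 5 (old coordinator) has failed
print(run_election(1, alive))       # 4: the highest surviving process
```

The name "bully" reflects exactly this: the highest-numbered live process always imposes itself as coordinator.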
I discussed several fault tolerance mechanisms, primarily aimed at process omission failures. In particular I discussed the use of recoverability in transactions. Here a set of primitive operations on a distributed entity are treated as atomic, and must be either all committed or all aborted. Use of permanent storage and a commit protocol allow a group of servers, under the guidance of a coordinator, to agree to all take the action appropriate for the group.
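The coordinator-led agreement described above can be sketched as a minimal two-phase commit, assuming the coordinator can reliably collect every participant's vote; the class and method names are illustrative.

```python
# Phase 1 (voting): the coordinator asks each participant whether it can
# commit. Phase 2 (completion): commit only if all voted yes, else abort.

def two_phase_commit(participants):
    votes = [p.prepare() for p in participants]      # phase 1: voting
    decision = "commit" if all(votes) else "abort"   # coordinator decides
    for p in participants:                           # phase 2: completion
        p.finish(decision)
    return decision

class Participant:
    def __init__(self, can_commit):
        self.can_commit = can_commit
        self.outcome = None
    def prepare(self):
        return self.can_commit       # vote, after saving state durably
    def finish(self, decision):
        self.outcome = decision      # all participants take the same action

group = [Participant(True), Participant(True)]
print(two_phase_commit(group))       # commit
group.append(Participant(False))
print(two_phase_commit(group))       # abort: one participant voted no
```

In the real protocol each participant writes its vote and the final decision to permanent storage, so the outcome survives a crash and restart.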
Finally I discussed consensus and related problems, in which we aim to deal with processes making arbitrary failures, i.e. continuing to operate but sending incorrect values. I explained that the problem has no solution which is guaranteed to have the termination, agreement and integrity properties in an asynchronous system. Even in a synchronous system the number of faulty processes f in a set of N processes must be such that N > 3f. Furthermore, the solution relies on f+1 rounds of messages. This can be improved on if signed messages are used.
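The two bounds quoted above are worth making concrete:

```python
# With f processes failing arbitrarily, a synchronous solution needs
# N > 3f processes, i.e. at least 3f + 1, and runs for f + 1 rounds.

def min_processes(f):
    return 3 * f + 1

def rounds_needed(f):
    return f + 1

# Tolerating a single arbitrary failure already requires 4 processes
# and 2 rounds; tolerating two requires 7 processes and 3 rounds.
print(min_processes(1), rounds_needed(1))  # 4 2
print(min_processes(2), rounds_needed(2))  # 7 3
```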
In order to achieve a multicast we must first know who is in the group and then send a message to each of those processes. Group membership is generally managed by a group membership service, which as well as allowing processes to join and leave, will generally include a failure detector to exclude unreachable members. The current membership may be included in a view of the system.
A naive approach to multicast involves repeatedly sending the message over one-to-one channels to each member of the group. This is neither reliable nor ordered but can be used, as explained in the lecture, as the basis for reliable and/or ordered systems. Reliability can be based on further multicasts by each recipient (pessimistic view) or on sequence numbers, acknowledgements and negative acknowledgements. Ordering is usually achieved by designating one process as sequencer. In the example algorithm I presented, based on the ISIS algorithm, the message initiator also acts as sequencer.
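The delivery side of sequencer-based ordering can be sketched as follows. This shows only the receiver's hold-back rule, under the assumption that the sequencer has already stamped each message with a consecutive sequence number.

```python
import heapq

# Every recipient delivers messages strictly in sequence-number order,
# holding back any message that arrives before its predecessors.

class OrderedReceiver:
    def __init__(self):
        self.next_seq = 0
        self.held = []            # min-heap of early arrivals
        self.delivered = []
    def on_message(self, seq, msg):
        heapq.heappush(self.held, (seq, msg))
        # Deliver every consecutive message we are now able to.
        while self.held and self.held[0][0] == self.next_seq:
            _, m = heapq.heappop(self.held)
            self.delivered.append(m)
            self.next_seq += 1

r = OrderedReceiver()
r.on_message(1, "b")              # arrives early: held back
r.on_message(0, "a")              # releases both "a" and "b"
r.on_message(2, "c")
print(r.delivered)                # ['a', 'b', 'c']
```

Because every recipient applies the same rule to the same sequence numbers, all group members deliver the messages in the same total order.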
Please note someone pointed out a mistake in the total-ordering example on page 6 of the handout. A corrected version of that figure can be found here.
I talked about the different navigation schemes that can be employed in a distributed naming service and discussed DNS in some detail. DNS provides high availability (with weak consistency) by using partitioned data, replicated service and caching.
A directory or discovery service allows resolution in the opposite direction in the sense that the client supplies attributes and is given a name (cf. the yellow pages). The distinction is that a discovery service operates in a dynamically changing environment. I discussed Jini as an example of a discovery service.
Informatics Forum, 10 Crichton Street, Edinburgh, EH8 9AB, Scotland, UK