I then went through some architectural models which are intended to give you an idea of the different ways in which distributed systems can be organised. Although client/server and its variants still dominate as the common architecture for distributed systems, other schemes such as peer-to-peer are starting to appear. During the review of these architectures I again highlighted topics which we will return to later in the course.
At the end of the lecture I gave a brief summary of the networking concepts underlying message passing in distributed computations. The effect of message passing is to introduce a delay, which is the sum of the latency and the transmission delay. In general the more complex (and disparate) the topology of the route the message has to take, the greater the delay. I explained some of the devices and software components that a message may have to pass through, each of which will introduce an associated delay. Thus the latency within a single ethernet is typically less than 1 millisecond, whereas the latency between two arbitrary hosts on the Internet will be 300-600 milliseconds.
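The delay model above can be sketched as a small calculation. The link figures used below (100 Mbit/s bandwidth, 0.5 ms latency) are illustrative assumptions, not measurements.

```python
# Total one-way message delay = latency + transmission delay,
# where transmission delay = message size / link bandwidth.

def message_delay(size_bits, bandwidth_bps, latency_s):
    """Total one-way delay in seconds for a message over one link."""
    transmission = size_bits / bandwidth_bps
    return latency_s + transmission

# A 1 KB message over a 100 Mbit/s link with 0.5 ms latency:
delay = message_delay(8 * 1024, 100e6, 0.0005)
print(f"{delay * 1000:.3f} ms")  # 0.582 ms
```

Note that for small messages on a fast link the latency term dominates, which is why the route's topology matters more than raw bandwidth.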
I talked about the IP protocol, IP addresses and domain names, which are now almost universally used (the exceptions being WAP and specialised multimedia streaming protocols). I also discussed IP version 6 (IPv6) and Mobile IP, which offers support for mobile hosts.
I also discussed data marshalling and multicast. These features are often managed by middleware, rather than applications, as we will see in the next lecture. Nevertheless it is important to be aware of them.
The interface, which specifies the procedures/methods of the remote service, is used to generate these supporting elements, as well as providing the means for agreement between the client and server implementors on the acceptable forms of interaction.
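The role the interface plays can be sketched as follows. This is a hypothetical illustration of the idea, not any real IDL or generated code; the names (Calculator, add, the transport) are all assumptions.

```python
# Both the client-side stub (proxy) and the server implementation are
# derived from the same interface, so both sides agree on the form of
# interaction. The stub forwards each call over a transport, standing in
# for marshalling and message passing.

class Calculator:
    """The agreed interface: the remote service's methods."""
    def add(self, a: int, b: int) -> int: ...

class CalculatorStub(Calculator):
    """Client-side proxy: forwards the call through the transport."""
    def __init__(self, transport):
        self.transport = transport
    def add(self, a, b):
        return self.transport.invoke("add", (a, b))

class LocalTransport:
    """Stand-in transport that dispatches directly to a server object."""
    def __init__(self, servant):
        self.servant = servant
    def invoke(self, method, args):
        return getattr(self.servant, method)(*args)

class CalculatorImpl(Calculator):
    """Server-side implementation of the agreed interface."""
    def add(self, a, b):
        return a + b

stub = CalculatorStub(LocalTransport(CalculatorImpl()))
print(stub.add(2, 3))  # the client calls through the stub as if local
```

In a real system the stub and skeleton would be generated from the interface definition, and the transport would marshal arguments into messages.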
The development process I outlined today was for static invocation - I will (briefly) discuss dynamic invocation next week. At the end of the lecture I ran out of time for explaining the POA (portable object adaptor) but I'll cover this next time.
I also talked about DCOM's reference counting protocol which aims to address the problem of knowing when an object instance may be regarded as suitable for garbage collection. Finally I discussed more recent moves towards XML Web Services.
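The idea behind reference counting can be sketched briefly. This illustrates the general AddRef/Release discipline only, under the assumption of a single counter per object; it is not DCOM's actual wire protocol.

```python
# An object keeps a count of outstanding client references; it becomes
# a candidate for garbage collection once every client has released.

class RefCounted:
    def __init__(self):
        self.count = 0
        self.collectable = False
    def add_ref(self):
        self.count += 1
        return self.count
    def release(self):
        self.count -= 1
        if self.count == 0:
            self.collectable = True  # no clients remain: safe to collect
        return self.count

obj = RefCounted()
obj.add_ref()           # client A binds to the object
obj.add_ref()           # client B binds to the object
obj.release()           # client A finishes
obj.release()           # client B finishes; count reaches zero
print(obj.collectable)  # True
```

The distributed difficulty, of course, is that a client may crash without ever calling release, which is why DCOM supplements counting with pinging.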
For some applications definite timing is not important but rather relative timing and indications of causality. The happened-before relation captures quite an intuitive notion of causality with respect to local events and message sending and receiving. Logical clocks count the events on processes, each process adjusting the count on the basis of messages from other processes when a receive event occurs. Some problems of incomparability of events based on logical clocks are solved when vector logical clocks are used instead.
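The logical clock rules above can be sketched in a few lines. This is a minimal illustration of Lamport's scheme, assuming each message carries its sender's timestamp.

```python
# Each process counts its own events; on a receive it advances its
# clock past the timestamp carried on the message.

class LamportClock:
    def __init__(self):
        self.time = 0
    def local_event(self):
        self.time += 1
        return self.time
    def send(self):
        self.time += 1
        return self.time          # timestamp carried on the message
    def receive(self, msg_time):
        self.time = max(self.time, msg_time) + 1
        return self.time

p, q = LamportClock(), LamportClock()
t = p.send()           # p's clock: 1
q.local_event()        # q's clock: 1
q.receive(t)           # q's clock: max(1, 1) + 1 = 2
print(p.time, q.time)  # 1 2
```

A vector clock replaces the single counter with one entry per process, which is what removes the incomparability problem: two events are concurrent exactly when neither vector dominates the other.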
For any arbitrary process we can record a history as a sequence of local events and from these we can form a global history as the union of local histories. To view states at particular times we need to truncate the histories appropriately - form a cut. A cut is termed consistent if it respects the happened-before relation in the sense that every event that happened-before an event in the cut is also in the cut. Once a global state is defined we can go on to consider runs, linearization and properties on them.
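The consistency condition on cuts can be checked mechanically. In the sketch below the representation of a cut (one event count per process) and of messages as (sender, send index, receiver, receive index) tuples with 1-based indices is an assumption made for illustration.

```python
# A cut is consistent if every message received within the cut was also
# sent within the cut, i.e. no receive event appears without its send.

def is_consistent(cut, messages):
    for sender, send_i, receiver, recv_i in messages:
        if recv_i <= cut[receiver] and send_i > cut[sender]:
            return False  # receive is in the cut but its send is not
    return True

# Process 0 sends its 2nd event; process 1 receives it as its 1st event.
msgs = [(0, 2, 1, 1)]
print(is_consistent({0: 2, 1: 1}, msgs))  # True: send and receive both in
print(is_consistent({0: 1, 1: 1}, msgs))  # False: receive without its send
```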
In the latter half of the lecture I explained one algorithm for finding global states - Chandy and Lamport's algorithm for global snapshots. This algorithm does not use logical clocks and has the advantage that normal processing can continue whilst the snapshot is in progress. The algorithm makes use of marker messages which trigger processes to record their local state and separate messages within a channel into those before the cut and those after. The second algorithm was Marzullo and Neiger's algorithm for post hoc analysis of executions, such as used in distributed debugging. This does use vector clocks, to reconstruct globally consistent states for recorded histories from each process.
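The marker rule at the heart of the Chandy-Lamport algorithm can be sketched for a single incoming channel. This is a deliberately simplified illustration, not the full algorithm: it shows only how the marker separates channel messages into those belonging to the snapshot and those after it.

```python
# Once a process has recorded its local state it records every message
# arriving on a channel until the marker arrives on that channel; those
# recorded messages form the channel's state in the snapshot.

MARKER = "MARKER"

def record_channel(channel, recording_started):
    """Return the channel state: messages seen after recording starts
    and before the marker arrives on this (FIFO) channel."""
    channel_state = []
    for msg in channel:
        if msg == MARKER:
            break                      # marker: stop recording this channel
        if recording_started:
            channel_state.append(msg)  # in flight when the cut was taken
    return channel_state

# m1, m2 were in transit when the snapshot began; m3 follows the marker.
channel = ["m1", "m2", MARKER, "m3"]
print(record_channel(channel, recording_started=True))  # ['m1', 'm2']
```

FIFO channels are essential here: the marker cleanly divides each channel's traffic into pre-cut and post-cut messages.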
In the presentation of the algorithms we assumed that all messages were delivered in the correct order without duplications, corruptions or losses. We also assumed that all processes were reliable. At the end of the lecture I discussed the degree of fault tolerance supported by each algorithm.
In today's lecture I talked about reliable and unreliable failure detectors, which give judgements of unsuspected and failed, or unsuspected and suspected respectively. Assuming that failures can be detected I then talked about election algorithms, going through two algorithms: the ring-based algorithm and the more robust bully algorithm in some detail. At the end of the lecture I discussed the alternative approach of group formation, as exemplified by the invitation algorithm.
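The core of the bully algorithm can be sketched as follows. Liveness is modelled here by a simple set of alive process identifiers, which is an assumption for illustration; the real algorithm works through timeouts on election and answer messages.

```python
# A process that detects the coordinator's failure sends "election" to
# all higher-id processes. If none answers it declares itself
# coordinator; otherwise a higher process takes over the election, and
# the highest surviving id eventually wins.

def run_election(initiator, alive):
    higher = [p for p in alive if p > initiator]
    if not higher:
        return initiator            # no higher process answers: I win
    # Some higher process answers and restarts the election itself.
    return run_election(max(higher), alive)

alive = {1, 2, 4}                   # process 5 (old coordinator) has failed
print(run_election(1, alive))       # 4: the highest surviving process
```

The name "bully" reflects exactly this: the highest-numbered live process always imposes itself as coordinator.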
I discussed several fault tolerance mechanisms, primarily aimed at process omission failures. In particular I discussed the use of recoverability in transactions. Here a set of primitive operations on a distributed entity are treated as atomic, and must be either all committed or all aborted. Use of permanent storage and a commit protocol allow a group of servers, under the guidance of a coordinator, to agree to all take the action appropriate for the group.
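The coordinator-led agreement described above can be sketched as a minimal two-phase commit, assuming the coordinator can reliably collect every participant's vote; the class and method names are illustrative.

```python
# Phase 1 (voting): the coordinator asks each participant whether it can
# commit. Phase 2 (completion): commit only if all voted yes, else abort.

def two_phase_commit(participants):
    votes = [p.prepare() for p in participants]      # phase 1: voting
    decision = "commit" if all(votes) else "abort"   # coordinator decides
    for p in participants:                           # phase 2: completion
        p.finish(decision)
    return decision

class Participant:
    def __init__(self, can_commit):
        self.can_commit = can_commit
        self.outcome = None
    def prepare(self):
        return self.can_commit       # vote, after saving state durably
    def finish(self, decision):
        self.outcome = decision      # all participants take the same action

group = [Participant(True), Participant(True)]
print(two_phase_commit(group))       # commit
group.append(Participant(False))
print(two_phase_commit(group))       # abort: one participant voted no
```

In the real protocol each participant writes its vote and the final decision to permanent storage, so the outcome survives a crash and restart.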
Finally I discussed consensus and related problems, in which we aim to deal with processes making arbitrary failures, i.e. continuing to operate but sending incorrect values. I explained that the problem has no solution which is guaranteed to have the termination, agreement and integrity properties in an asynchronous system. Even in a synchronous system the number of faulty processes f in a set of N processes must be such that N > 3f. Furthermore, the solution relies on f+1 rounds of messages. This can be improved on if signed messages are used.
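The two bounds quoted above are worth making concrete:

```python
# With f processes failing arbitrarily, a synchronous solution needs
# N > 3f processes, i.e. at least 3f + 1, and runs for f + 1 rounds.

def min_processes(f):
    return 3 * f + 1

def rounds_needed(f):
    return f + 1

# Tolerating a single arbitrary failure already requires 4 processes
# and 2 rounds; tolerating two requires 7 processes and 3 rounds.
print(min_processes(1), rounds_needed(1))  # 4 2
print(min_processes(2), rounds_needed(2))  # 7 3
```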
In order to achieve a multicast we must first know who is in the group and then send a message to each of those processes. Group membership is generally managed by a group membership service, which as well as allowing processes to join and leave, will generally include a failure detector to exclude unreachable members. The current membership may be included in a view of the system.
A naive approach to multicast involves repeatedly sending the message over one-to-one channels to each member of the group. This is neither reliable nor ordered but can be used, as explained in the lecture, as the basis for reliable and/or ordered systems. Reliability can be based on further multicasts by each recipient (pessimistic view) or on sequence numbers, acknowledgements and negative acknowledgements. Ordering is usually achieved by designating one process as sequencer. In the example algorithm I presented, based on the ISIS algorithm, the message initiator also acts as sequencer.
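The delivery side of sequencer-based ordering can be sketched as follows. This shows only the receiver's hold-back rule, under the assumption that the sequencer has already stamped each message with a consecutive sequence number.

```python
import heapq

# Every recipient delivers messages strictly in sequence-number order,
# holding back any message that arrives before its predecessors.

class OrderedReceiver:
    def __init__(self):
        self.next_seq = 0
        self.held = []            # min-heap of early arrivals
        self.delivered = []
    def on_message(self, seq, msg):
        heapq.heappush(self.held, (seq, msg))
        # Deliver every consecutive message we are now able to.
        while self.held and self.held[0][0] == self.next_seq:
            _, m = heapq.heappop(self.held)
            self.delivered.append(m)
            self.next_seq += 1

r = OrderedReceiver()
r.on_message(1, "b")              # arrives early: held back
r.on_message(0, "a")              # releases both "a" and "b"
r.on_message(2, "c")
print(r.delivered)                # ['a', 'b', 'c']
```

Because every recipient applies the same rule to the same sequence numbers, all group members deliver the messages in the same total order.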
Please note someone pointed out a mistake in the total-ordering example on page 6 of the handout. A corrected version of that figure can be found here.
I talked about the different navigation schemes that can be employed in a distributed naming service and discussed DNS in some detail. DNS provides high availability (with weak consistency) by using partitioned data, replicated service and caching.
A directory or discovery service allows resolution in the opposite direction in the sense that the client supplies attributes and is given a name (cf. the yellow pages). The distinction is that a discovery service operates in a dynamically changing environment. I discussed Jini as an example of a discovery service.
Informatics Forum, 10 Crichton Street, Edinburgh, EH8 9AB, Scotland, UK