CS/SE Individual Practical (08-09) Part 1: Familiarisation with Chord and DHash

Introduction

The Chord protocol provides a mapping from a set of identifiers to a set of nodes. The identifiers are 160 bit words in the original Chord protocol. Typically, they are the produced by secure hash functions from perhaps file names or the contents of files. Each node is typically a computational and storage resource accessible via a port at some Internet address. The protocol is accessible from any node and distributes its computational and storage requirements amongst the nodes. There is no centralised server managing the mapping. This gives the protocol, and applications based on it, significant scalability.

A common layer built on top of the protocol is a distributed hash table (DHash). The Chord protocol mapping provides one stage in the mapping of, for example, a filename to a location on some node where that file is stored.

Java implementations of the Chord protocol and a distributed hash table are provided as the starting point for your work on this part of IP.

The Part 1 tasks involve making minor extensions to the provided Chord protocol and DHash implementations. These extensions will help you develop your understanding of the provided code base.

Before starting work, please read fully these instructions and the auxiliary web pages on the Chord protocol and Java RMI. These auxiliary pages are linked to in the next section.

Please use the newsgroup and weekly meetings to ask for clarifications on any of the instructions here and on the supplied code.

Background Reading

System Architecture

It is essential that you read and understand this summary of the Chord protocol. You might well find it useful to have a look at the paper
Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications, Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, and Hari Balakrishnan, ACM SIGCOMM 2001, San Deigo, CA, August 2001, pp. 149-160.
Focus on Sections 1-4 of this paper which deal with the non-concurrent version of the Chord protocol that is used in this assignment.

The Chord protocol itself is only concerned with mapping Chord identifiers to nodes in the peer to peer system. It is not concerned with data storage at the nodes. The paper

Wide-area cooperative storage with CFS, Frank Dabek, M. Frans Kaashoek, David Karger, Robert Morris, and Ion Stoica, ACM SOSP 2001, Banff, October 2001.
discusses the implementation of a distributed hash table and a distributed file system on top of the Chord protocol. The supplied code for the Chord protocol and hash table and the filesystem you are to design follow the basic architecture described in this paper, but do not include most of the more sophisticated features that this paper covers. Both papers are available from the Chord Project home page.

Java RMI

This assignment makes extensive use of the RMI (Remote Method Invocation) capability that comes packaged with the Standard Edition of Java. This RMI resources page should be consulted to learn more.

The provided Chord protocol implementation

The implementation classes and interfaces are: In addition, several classes and interfaces are provided for testing the implementation in both single and multiple JVM contexts. These are: We call this test architecture a master-slave architecture because a CNTest master controls one or more CNSlaveImp slaves.

The file pt1/README.txt describes the locations of the Java files and instructions for compiling and running them.

The provided distributed hash table implementation

A layered architecture is adopted where the distributed Chord protocol implementation is largely hidden from users of the hash table.

The main classes and interfaces are:

A master-slave test architecture is also adopted for testing rings of DHashNodeImp objects. This is similar to that used for testing rings of ChordNodeImp objects, though there are slight differences in how the master and slave processes are started up. The relevant classes and interfaces are:

Additional classes and interfaces include:

Again, the file pt1/README.txt describes the locations of the Java files and instructions for compiling and running them.

Tasks

Task A: Reporting Chord and DHash configurations

Extend the provided DHNTest with a static method
    private static void printRing(DHNSlave d) throws RemoteException {}
that prints out the current status of the Chord and DHash rings, including for each node: For example, after the join of node 15 in test1() provided in DHNTest, it should print something like:
[B]   7   ( 1') :  9   9  15' 15'   [ka( 2) va] 
[B]   9   ( 7 ) : 15' 15' 15'  1'   [kb( 9) vb] 
[A]  15   ( 9') :  1   1   7'  7'   [kc(14) vc] 
[A]   1   (15 ) :  7'  7'  7'  9'   
Here, there's one line for each node in sequence round the ring from the d argument starting point. The line for a node lists the domain, predecessor and fingers of the Chord node, and the key value pairs stored at the corresponding DHash node. The IDs corresponding to key names are shown after the key names to make it easy to see whether each key-value pair has been stored on the correct node. The forward quote(') characters after the IDs are used to indicate that the predecessor or finger is referring to a node in a remote process.

Assume that printRing() is only used with IDs of up to say 9 bits in size, and that the key value data is brief, so information on each node can fit comfortably onto a single line. Display IDs in decimal.

Add printRing() calls to test1() after each change in ring configuration, and add println() calls so that the operations of the test are intellible from the print output. For example, part of the output might look like:

[B]   7   ( 1') :  9   9   1'  1'   [ka( 2) va] 
[B]   9   ( 7 ) :  1'  1'  1'  1'   [kb( 9) vb] 
[A]   1   ( 9') :  7'  7'  7'  9'   [kc(14) vc] 

Joining new node 15 to ring

[B]   7   ( 1') :  9   9  15' 15'   [ka( 2) va] 
[B]   9   ( 7 ) : 15' 15' 15'  1'   [kb( 9) vb] 
[A]  15   ( 9') :  1   1   7'  7'   [kc(14) vc] 
[A]   1   (15 ) :  7'  7'  7'  9'   

Adding key-value pairs d,e

[B]   7   ( 1') :  9   9  15' 15'   [ke( 3) ve] [ka( 2) va] 
[B]   9   ( 7 ) : 15' 15' 15'  1'   [kb( 9) vb] 
[A]  15   ( 9') :  1   1   7'  7'   [kd(13) vd] [kc(14) vc] 
[A]   1   (15 ) :  7'  7'  7'  9'   
You should convince yourself that these configurations are as you might expect.

You might find the format method of the java.util.Formatter class useful for printing to fixed-width fields.

Task B: Tracing behaviour of findSuccessor()

Look in the ChordNodeImp class at the findSuccessor() method implementation. The code for this method and the main methods it calls includes invocations of logging methods to enable tracing of the search for a node responsible for an id.

The logging code makes use of the java.util.logging package. The Java API documentation page for this package decribes the overall organisation of the Java logging classes and links to further documentation on the package.

Logging is configured so that log records created in logging invocations in the ChordNodeImp class make their way to the publish() method of the supplied CNHandler class. This formats records as strings and passes these strings, possibly using remote method invocations, to the publish() method of the MonitorImp class, which does some further formatting and prints the log messages. The point of this organisation is to pull together and suitably interleave logging messages that originate in the possibly multiple Java processes involved in the findSuccessor() execution. In a multiple process test, the findSuccessor() and CNHandler publish() methods all execute in the test slave processes, whereas the MonitorImp publish() method executes in the test master process, printing to the standard output of the master process.

Task B involves understanding the supplied code and completing the code in the provided CNHandler and MonitorImp classes in order to generate logs of form:

[A] ENTERING n1.findSuccessor(5)
[A] Calling n1.findStrictPredecessor(5)
[A]    ENTERING n1.findStrictPredecessor(5)
[A]    RETURNING n1
[A] Receiving n1
[A] RETURNING n7'

[B] ENTERING n9.findSuccessor(8)
[B] Calling n9.findStrictPredecessor(8)
[B]    ENTERING n9.findStrictPredecessor(8)
[B]    Calling n9.closestStrictlyPrecedingFinger(8)
[B]       ENTERING n9.closestStrictlyPrecedingFinger(8)
[B]       stopped at finger 4 
[B]       RETURNING n1'
[B]    Receiving n1'
[B]    Calling n1'.findStrictPredecessor(8)
[A]       ENTERING n1.findStrictPredecessor(8)
[A]       Calling n1.closestStrictlyPrecedingFinger(8)
[A]          ENTERING n1.closestStrictlyPrecedingFinger(8)
[A]          stopped at finger 3 
[A]          RETURNING n7'
[A]       Receiving n7'
[A]       Calling n7'.findStrictPredecessor(8)
[B]          ENTERING n7.findStrictPredecessor(8)
[B]          RETURNING n7
[A]       Receiving n7'
[A]       RETURNING n7'
[B]    Receiving n7
[B]    RETURNING n7
[B] Receiving n7
[B] RETURNING n9
Note there are five kinds of messages appearing here. The code added should be set up to handle each kind of message, and should not itself be customised just for the logging resulting from findSuccessor() execution.

The test1() method in the supplied DHNTest class includes code to generate the log records for the above examples. Comments in the test1() code indicate code to be uncommented to enable log record generation. As a first step, try enabling record generation and figure out how the provided code in the CNHandler and MonitorImp classes is generating the observed output.

The logging invocations in the ChordNodeImp class should not be touched when working on this task.

Task C: Checking that configurations are well formed

Extend the provided DHNTest with a static method
    private static void checkRing(DHNSlave d) throws RemoteException {}
that checks the ring pointed to by d is well formed. In particular it should check: Each property check should be done using a call to an appropriate static method of JUnit's Assert class. Use the versions of the methods that allow for brief messages to be inserted explaining what is being checked, and provide appropriate messages. The checks should not need to make use of methods such as findSuccessor() that do search in the ring. Indeed, use of such methods is unwise, given that they rely in the first place on the ring being properly configured.

Include a call to checkRing() everywhere you have a call to printRing() in your test1() method.

How to organise your work

Assessment

Part 1 is worth 25% of the overall IP mark. The Part 1 mark will be based on 3 criteria: Marks for these criteria will be weighted 40%/40%/20% respectively.

Getting Started

Submission of Solutions

Each file you submit (whether a report file, Java file or some other kind) should include the following information on the first page.

Your programs should compile and run on DICE machines with unmodified versions of all supplied source files, except those explicitly indicated above.

Please generate log files from runs of the main JUnit test, test1() in the DHNTest class. Provide log files part1-S.log and part1-A.log in the pt1 directory for tests in single-process (S) and multi-process automatic-startup (A) modes respectively. For example, use the command

java dhash.DHNTest A &> part1-A.log
to generate the A mode log, capturing both standard output and standard error streams.

Only submit the files you have created or modified.

Submission check list

Before submitting anything please check that you have followed all the instructions above. Failure to submit any of the necessary files or to include instructions on how to compile and run your program may result in marks being deducted. In particular, please ensure that:

Submission

Save this Part 1 own work declaration form to file
  pt1/own-work-decl-ip-pt1.txt
and complete it.

If necessary, edit the file pt1/soln-files.lis which lists the files to be included in your submission. The supplied version of this file lists the files:

test/dhash/MonitorImp.java
test/dhash/CNHandler.java
test/dhash/DHNTest.java
run-pt1.txt
part1-S.log
part1-A.log
own-work-decl-ip-pt1.txt

With your current directory set to pt1, tar and gzip your files using a command similar to:

  tar cvzf pt1-soln.tgz -T soln-files.lis

Submit your tar file using the submit command:

   
  submit cs3 ip 1 pt1-soln.tgz

Revisions


Paul Jackson, Paul.Jackson (@) ed.ac.uk

Last modified: Mon Oct 27 15:47:18 GMT 2008


Home : Teaching : Courses : Ip 

Informatics Forum, 10 Crichton Street, Edinburgh, EH8 9AB, Scotland, UK
Tel: +44 131 651 5661, Fax: +44 131 651 1426, E-mail: school-office@inf.ed.ac.uk
Please contact our webadmin with any comments or corrections. Logging and Cookies
Unless explicitly stated otherwise, all material is copyright © The University of Edinburgh