TTS Lab 3 Answers (Evaluation & Significance)

The main learning outcome of this lab is to learn about evaluation techniques and significance.

Recall/Precision Graphs

If you follow the instructions in the lab sheet, the output of plot() should be:

Recall-precision plot

A curve that is high on the left hand side (i.e. at low recall levels) indicates that the algorithm is suitable if high precision is required in the top search results. This is the case for typical web search, where most people do not really care for anything beyond the first 10 results. A curve that is high on the right hand side of the plot indicates that this algorithm might be a good choice if recall really matters, i.e. for exhaustive searches on a topic, often done by professionals like lawyers.

Mean Average Precision

You could implement the ranking algorithm like this:


def rank(folder, rel):
	algorithms = os.listdir(folder)
	relevance = l.convertFiles(rel, False)
	numq = len(relevance)
	maps = dict()
	for algorithm in algorithms:
		algo = l.convertFiles(folder + '/' + algorithm, True)
		avP = l.evaluate(algo, relevance)
		maps[algorithm] = sum(avP) / numq

	return sortMap(maps)

This should output something like this:

[('algo2.top', 0.971), ('algo4.top', 0.970), ('algo6.top', 0.772), ('algo3.top', 0.768), ('algo5.top', 0.755), ('prf.top', 0.557), ('algo1.top', 0.479), ('inq.top', 0.363), ('cos.top', 0.289), ('coo.top', 0.211), ('tf.top', 0.186)]

To compare algorithms by GMAP rather than MAP, replace the line maps[algorithm] = sum(avP) / numq with maps[algorithm] = lprod(avP) ** (1.0/numq), where lprod is a recursive function which computes the product of all the items in a list:


def lprod(list):
	if list:
		return list.pop() * lprod(list)
	else:
		return 1

This should yield the values:

[('algo4.top', 0.966), ('algo2.top', 0.953), ('algo6.top', 0.762), ('algo3.top', 0.728), ('algo5.top', 0.688), ('prf.top', 0.415), ('algo1.top', 0.394), ('inq.top', 0.332), ('cos.top', 0.228), ('coo.top', 0.163), ('tf.top', 0.131)]

Significance

Following the pseudo code from the lab sheet, this is what an implementation of the sign test could look like:


def significanceTest(a1, a2, rel):
	algo1 = l.convertFiles(a1, True)
	algo2 = l.convertFiles(a2, True)
	relevance = l.convertFiles(rel, False)

	avP1 = l.evaluate(algo1, relevance)
	avP2 = l.evaluate(algo2, relevance)

	queries = len(avP1)

	plus = 0
	minus = 0

	for i in range(queries):
		if avP1[i] > avP2[i]:
			plus += 1
		elif avP1[i] < avP2[i]:
			minus += 1

	q = plus+minus
	x = max(plus, minus)
	divident = math.factorial(q) * (0.5 ** q)
	total = 0.0
	for n in range(x, q+1):
		product = divident / (math.factorial(n) * math.factorial(q-n))
		total += product
	
	return total


Home : Teaching : Courses : Tts 

Informatics Forum, 10 Crichton Street, Edinburgh, EH8 9AB, Scotland, UK
Tel: +44 131 651 5661, Fax: +44 131 651 1426, E-mail: school-office@inf.ed.ac.uk
Please contact our webadmin with any comments or corrections.
Unless explicitly stated otherwise, all material is copyright © The University of Edinburgh