The main learning outcome of this lab is to learn about evaluation techniques and significance.
If you follow the instructions in the lab sheet, the output of plot() should be:
A curve that is high on the left hand side (i.e. at low recall levels) indicates that the algorithm is suitable if high precision is required in the top search results. This is the case for typical web search, where most people do not really care for anything beyond the first 10 results. A curve that is high on the right hand side of the plot indicates that this algorithm might be a good choice if recall really matters, i.e. for exhaustive searches on a topic, often done by professionals like lawyers.
You could implement the ranking algorithm like this:
def rank(folder, rel): algorithms = os.listdir(folder) relevance = l.convertFiles(rel, False) numq = len(relevance) maps = dict() for algorithm in algorithms: algo = l.convertFiles(folder + '/' + algorithm, True) avP = l.evaluate(algo, relevance) maps[algorithm] = sum(avP) / numq return sortMap(maps)
This should output something like this:
[('algo2.top', 0.971), ('algo4.top', 0.970), ('algo6.top', 0.772), ('algo3.top', 0.768), ('algo5.top', 0.755), ('prf.top', 0.557), ('algo1.top', 0.479), ('inq.top', 0.363), ('cos.top', 0.289), ('coo.top', 0.211), ('tf.top', 0.186)]
To compare algorithms by GMAP rather than MAP, replace the line maps[algorithm] = sum(avP) / numq with maps[algorithm] = lprod(avP) ** (1.0/numq), where lprod is a recursive function which computes the product of all the items in a list:
def lprod(list): if list: return list.pop() * lprod(list) else: return 1
This should yield the values:
[('algo4.top', 0.966), ('algo2.top', 0.953), ('algo6.top', 0.762), ('algo3.top', 0.728), ('algo5.top', 0.688), ('prf.top', 0.415), ('algo1.top', 0.394), ('inq.top', 0.332), ('cos.top', 0.228), ('coo.top', 0.163), ('tf.top', 0.131)]
Following the pseudo code from the lab sheet, this is what an implementation of the sign test could look like:
def significanceTest(a1, a2, rel): algo1 = l.convertFiles(a1, True) algo2 = l.convertFiles(a2, True) relevance = l.convertFiles(rel, False) avP1 = l.evaluate(algo1, relevance) avP2 = l.evaluate(algo2, relevance) queries = len(avP1) plus = 0 minus = 0 for i in range(queries): if avP1[i] > avP2[i]: plus += 1 elif avP1[i] < avP2[i]: minus += 1 q = plus+minus x = max(plus, minus) divident = math.factorial(q) * (0.5 ** q) total = 0.0 for n in range(x, q+1): product = divident / (math.factorial(n) * math.factorial(q-n)) total += product return total
|
Informatics Forum, 10 Crichton Street, Edinburgh, EH8 9AB, Scotland, UK
Tel: +44 131 651 5661, Fax: +44 131 651 1426, E-mail: school-office@inf.ed.ac.uk Please contact our webadmin with any comments or corrections. Unless explicitly stated otherwise, all material is copyright © The University of Edinburgh |