Introduction to Computational Linguistics

Lab 2 - Python and NLTK

  1. Write a function called adder that takes two arguments and returns their sum (or, for strings, their concatenation). Call this function with two strings, two floats, and two ints.
  2. Write a function called dictUnion that takes two dictionaries as arguments and returns a new dictionary that is their union. The returned dictionary should contain all the keys in both its arguments. If the same key appears in both dictionaries, feel free to pick either value.
  3. How could you generalize this function to handle both lists and dictionaries? (Hint: look at the built-in function type).
  4. Write code to compute the square root of each element of the list [1,4,9,16,25,36], using the function sqrt in the math module. Do this in three different ways: using a for loop, map, and a list comprehension.
  5. Use nltk_lite.corpora to load austen-emma, which is contained in the gutenberg corpus, and count the number of words in it.
  6. As in the lecture notes, count the frequency of each word in austen-emma, and sort the words and their frequencies by frequency. Order words with the same frequency alphabetically.
  7. A bigram is a pair of contiguous words in a text. Order matters, so "the dog" is different from "dog the". Count the frequency of each bigram that occurs in austen-emma, and sort by bigram frequency.
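
The constructs these exercises call for can be sketched on toy data as follows. This is a minimal illustration, not a model solution: the function name union, the sample sentence in tokens, and the choice of isinstance rather than a direct comparison on type are all assumptions made for the example. It uses only the Python 3 standard library and does not load the corpus, so the counting steps would still need to be applied to the words of austen-emma.

```python
import math
from collections import Counter


def union(a, b):
    """Hypothetical helper: union of two dicts or of two lists."""
    if isinstance(a, dict) and isinstance(b, dict):
        merged = dict(a)   # copy, so neither argument is modified
        merged.update(b)   # on a shared key, the value from b is kept
        return merged
    if isinstance(a, list) and isinstance(b, list):
        # everything in a, then the elements of b not already in a
        return a + [x for x in b if x not in a]
    raise TypeError("expected two dicts or two lists")


values = [1, 4, 9, 16, 25, 36]

# 1. for loop
roots_loop = []
for v in values:
    roots_loop.append(math.sqrt(v))

# 2. map (wrapped in list, since map returns an iterator in Python 3)
roots_map = list(map(math.sqrt, values))

# 3. list comprehension
roots_comp = [math.sqrt(v) for v in values]

# Toy stand-in for a corpus: a list of word tokens (assumed sample text).
tokens = "the dog saw the cat and the dog ran".split()

# Word frequencies, sorted by descending count, ties broken alphabetically.
word_freq = Counter(tokens)
ranked_words = sorted(word_freq.items(), key=lambda item: (-item[1], item[0]))

# Bigrams are adjacent pairs; zip pairs each token with its successor.
bigram_freq = Counter(zip(tokens, tokens[1:]))
ranked_bigrams = sorted(bigram_freq.items(), key=lambda item: (-item[1], item[0]))
```

Sorting on the key (-count, word) gives descending frequency and alphabetical order among ties in a single pass. Because the bigrams are stored as ordered tuples, ("the", "dog") and ("dog", "the") are counted separately.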



Informatics Forum, 10 Crichton Street, Edinburgh, EH8 9AB, Scotland, UK
Tel: +44 131 651 5661, Fax: +44 131 651 1426, E-mail: school-office@inf.ed.ac.uk
Unless explicitly stated otherwise, all material is copyright © The University of Edinburgh