Welcome Back!
The goal of this lab is to see how the story about phonological rules works in practice: to get a feel for how those rules work and to learn about phonetic alphabets, which are used to represent the pronunciation of words. You'll also get experience using the Unix stream editor sed to implement a simulation of phonological rules using simple string manipulation.
In this lab session you are going to be learning about:
If you don't understand any of the instructions below, if things don't seem to be working out properly, or if you have any other questions, please put up your hand and wait till someone comes round to help you!
Also, this session will be more fun and productive if you work through it with a partner!
At the end of each section, there is a link to the solutions to the exercises. You are of course free to click there immediately and finish the lab in 7 minutes, but obviously this is not the aim of the lab (and you probably didn't put so much effort into climbing 5 flights of stairs just to leave without having learnt anything new). You should try your best to solve the exercises with your partner and use the solutions only to check that everything is fine. If you feel completely stuck and you have already tried to solve the problem, you can have a critical look at the solution and try to understand why you couldn't do it by yourself.
This shouldn't be news to you, but English words are not always spelled as they are pronounced, and vice versa. This is one of the reasons why linguists have developed phonetic alphabets, which try to directly represent the pronunciation of words.
The most famous phonetic alphabet is the International Phonetic Alphabet (IPA), developed during the late 19th century. Most dictionaries give the phonetic spelling of every entry. For example, open the Oxford English Dictionary and check the phonetic spelling of the following words:
As you probably have already figured out, one problem with the IPA is that it uses lots of different symbols that are not easy to represent in computer terminals, and that don't have keys on computer keyboards, for example ə, ɔ and ʃ.
Over time, different computer-friendly representations have been developed. One solution is to allow a phoneme to be represented by up to two characters, and to use spaces to separate individual phonemes. This is the approach taken by the Arpabet phonetic alphabet.
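Because the phonemes are separated by spaces, a transcription can be handled with ordinary text tools. Here is a minimal sketch, assuming the lowercase, stress-free style used in this lab; the transcription "th ih ng" (for "thing") and the function name count_phonemes are illustrative choices, not part of the lab materials.

```shell
# Count the phonemes in a space-separated transcription.
# count_phonemes is a hypothetical helper name chosen for this sketch.
count_phonemes() {
    # awk splits each input line on whitespace; NF is the number of fields.
    awk '{ print NF }'
}

# "th ih ng" is an illustrative transcription of "thing":
# six letters on the keyboard, but only three phonemes.
echo "th ih ng" | count_phonemes
```

Note that a two-character symbol such as "th" or "ng" counts as one phoneme, which is why the spaces matter.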
Do the following:
Note that the representation of vowel phonemes in the CMU Pronouncing Dictionary has a strong North American bias.
Do the following:
The first part of these commands, i.e. the subcommand echo "bananarama" followed by the pipe symbol |, just means something like "Take the word 'bananarama' and do the following thing to it".
The second part, i.e. the thing that is done to the word, always takes the following form in the above examples:
sed 's/.../.../g'
sed is the UNIX stream editor. It applies basic editing commands to its input, one line of text at a time. Although basic, it is extremely powerful.
Can you figure out what sed 's/.../.../g' means? What do 's' and 'g' stand for? __________________________________________
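One way to investigate is to run the same substitution twice, once with the trailing g and once without, and compare what comes out. The pattern below (a replaced by o) is an arbitrary choice for illustration.

```shell
# Same substitution, first without and then with the trailing g.
# The letters a and o are arbitrary; any pair would do.
echo "bananarama" | sed 's/a/o/'
echo "bananarama" | sed 's/a/o/g'
```

Look at how many letters change in each case before you write down your answer.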
You can assign a name to these substitutions, using the UNIX command alias:
alias substitute='sed "s/$/s/g"'
and use it as before:
echo "bananarama" | substitute
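If you ever want to put your rules in a script file rather than typing them at the prompt, be aware that bash does not expand aliases in scripts by default. A shell function is one possible alternative that should behave the same way. This is only a sketch, and the name substitute_fn is made up for the example.

```shell
# Function version of the alias above.
# substitute_fn is a hypothetical name; the sed command is the same one
# used in the alias.
substitute_fn() {
    sed 's/$/s/g'
}

# Used exactly like the alias: text comes in through the pipe.
echo "bananarama" | substitute_fn
```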
You can get a local copy of a file and give it a name by running the following commands:
wget http://www.inf.ed.ac.uk/teaching/courses/inf1-cg/labs/lab4/HelloWorld.txt
HW=HelloWorld.txt
Now when you type in the following (including the '$'):
cat $HW
you should get the content of the file HelloWorld.txt printed on the screen.
If you apply the previous substitution to $HW, what do you get?
_______________________________________________________________________________
If you have already completed the task, or if you are really (really, really) stuck, have a look at the Solutions
Now look at this list of nouns
This lists the contents of a file, called 'nouns.txt'. You should see a list of 15 nouns (on separate lines) denoting kinds of animal, represented in Arpabet. Can you figure out what each animal is? Remember the phonemes are represented with their North American pronunciation!
Let's get a local copy of this file and give it a name (NOUNS)
_______________________________________________________________________________
Print the content of the entire file on the screen (tip: man cat)
_______________________________________________________________________________
You should get the same list of animals as you saw in the browser.
Do you remember the English Morphological Rule to produce the plural form of a noun? (tip: /z/)
_______________________________________________________________________________
Can you imagine how to produce this substitution (whoops... morphological rule) in the list of nouns in $NOUNS?
_______________________________________________________________________________
Give the morphological rule a name ("plural") and apply the rule to the nouns in $NOUNS
_______________________________________________________________________________
If you have already completed the task, or if you are really (really, really) stuck, have a look at the Solutions
As you noticed, the output we get from applying the morphological rule from section 3 to the nouns in the file is still not perfect (try pronouncing the sequences of sounds with your mouth to see what is wrong).
You should recall from last week's lectures that the output of the plural morphological rule in English is itself the input to a system of phonological rules, which change some of the sounds to make the whole thing easier to articulate with the tongue.
Two phonological rules were proposed:
The first of these is called 'anaptyxis'. How would you implement this rule (call it anaptyxis)?
_______________________________________________________________________________
Try to add this new rule to the plural morphological rule from the previous section (tip: remember to use the "|")
_______________________________________________________________________________
In other words, first we apply the morphological rule from section 3, then we apply the phonological rule to its result. Try it out yourself and see what happens. Is the output better than before?
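To see how feeding one rule's output into the next works, here is a small sketch with two toy rules. They are deliberately unrelated to the rules in this lab, and the names a_to_b and b_to_c are made up for the example. The rules are written as shell functions so that the sketch also runs from a script file.

```shell
# Two toy rules (hypothetical names, not the lab's phonological rules).
a_to_b() { sed 's/a/b/g'; }   # rewrite every a as b
b_to_c() { sed 's/b/c/g'; }   # rewrite every b as c

# The pipe passes the result of the first rule to the second.
echo "aba" | a_to_b | b_to_c

# The same two rules in the opposite order.
echo "aba" | b_to_c | a_to_b
```

Compare the two lines of output: each rule only ever sees what the previous one produced.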
The second phonological rule is called 'devoicing'. Write this rule. Note that we can build more complex sed commands by joining several substitutions with a semicolon ';' (sed "s/.../.../g; s/.../.../")
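As a quick illustration of the semicolon form, the sketch below runs two arbitrary toy substitutions (a to o, then n to m) first as two separate sed commands joined by a pipe, and then as a single sed command. For simple substitutions like these, the two forms should give the same result.

```shell
# Two separate sed commands joined by a pipe.
echo "bananarama" | sed 's/a/o/g' | sed 's/n/m/g'

# The same two substitutions inside one sed command, separated by ';'.
# sed applies them to each line from left to right.
echo "bananarama" | sed 's/a/o/g; s/n/m/g'
```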
_______________________________________________________________________________
Try to add this new rule to the plural morphological rule from the previous section (tip: remember to use the "|")
_______________________________________________________________________________
Finally, we can combine the two phonological rules to run one after the other:
_______________________________________________________________________________
Try this out. Does it work OK? What happens if you apply the two phonological rules in the wrong order?
_______________________________________________________________________________
We still can't capture the fact that irregular nouns like 'goose' and 'sheep' don't accept regular plural suffixes. Can you add yet another step to the sequence to sort this out?
_______________________________________________________________________________
If you have already completed the task, or if you are really (really, really) stuck, have a look at the Solutions
Now look at another list of words
You should see a list of 15 verbs, represented in Arpabet. Can you figure out what each verb is?
Let's get a local copy of this file and give it a name (VERBS)
Recall from the lectures that regular past tense inflection can be captured by positing one morphological rule and two phonological rules:
Implement this system of rules using a sed command for each rule, as you did in the previous section.
alias pasttense: _______________________________________________________________________________
alias VBanaptyxis: _______________________________________________________________________________
alias VBdevoicing: _______________________________________________________________________________
Put it all together: _______________________________________________________________________________
If you have already completed the task, or if you are really (really, really) stuck, have a look at the Solutions
Read slide 1 from lecture 8.
Implement the morphological rule which adds an 'iy' phoneme to the end of a singular noun like "f ow t". Call it "oldplural".
We can now implement the "vowel harmony" phonological rule, but it's going to take a bit more work:
_______________________________________________________________________________
It might help to know that:
Can you apply the vowel harmony rule to the output of the old plural rule to turn the singular "f ow t" into the Old English plural "f ey t iy"?
_______________________________________________________________________________
If you have already completed the task, or if you are really (really, really) stuck, have a look at the Solutions
Consider the following pairs of words as they are pronounced in Modern English:
man - mane
Sam - same
rat - rate
shack - shake
bit - bite
snip - snipe
rid - ride
met - mete
pet - Pete
As any five-year-old will tell you, when you add "magic e" to the end of one of these words, the vowel sound changes: (a) from ae to ey; (b) from ih to ai; and (c) from eh to iy.
The magic e effect is the result of the Great Vowel Shift in Middle English. This resulted in lots of vowels being "raised" up in the mouth, depending on whether or not they were followed by an unstressed syllable.
So, "mane" used to be pronounced as "m ae n uh", but as a result of the GVS the final 'uh' was dropped, and the 'ae' raised to 'ey', giving the new pronunciation "m ey n".
Write a phonological rule, called 'gvs', which implements this part of the historical Great Vowel Shift. It should work as follows:
echo "m ae n uh" | gvs
m ey n
echo "m ae n" | gvs
m ae n
echo "s n ih p uh" | gvs
s n ai p
echo "s n ih p" | gvs
s n ih p
echo "m eh t uh" | gvs
m iy t
echo "m eh t" | gvs
m eh t
_______________________________________________________________________________
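One sed feature that may come in handy for a rule like this, where part of the matched text has to survive unchanged, is the capture group: whatever matches between \( and \) can be reused in the replacement as \1, \2, and so on. The sketch below is a toy example that swaps two words. It is not the gvs rule itself, and the name swap_words is made up.

```shell
# Toy example of capture groups (hypothetical name, not the gvs rule).
swap_words() {
    # \([a-z]*\) captures a run of lowercase letters.
    # \1 and \2 in the replacement stand for the first and second capture.
    sed 's/\([a-z]*\) \([a-z]*\)/\2 \1/'
}

echo "hello world" | swap_words
```

Whether you use this feature or find another route is up to you; there is more than one way to write the rule.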
If you have already completed the task, or if you are really (really, really) stuck, have a look at the Solutions