This week’s tutorial exercises focus on dependency syntax. We haven’t forgotten about logistic regression; it will come back next week.
In this tutorial, we will explore ideas regarding dependency grammars and parsing. (For questions about the notion of free word order in languages other than English, see the sheet of Discussion Questions for Week 8.)
Annotate the following sentences with dependency relations, using only the labels nsubj, dobj, iobj, det, case, nmod, and amod. You should be able to figure out most of the labels by looking at examples from the textbook. For prepositional phrases, use the nmod relation, as in these examples. (Note that by convention, all dependency parses have a root, whether or not it is the head of a full sentence.)
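Since such analyses are easier to discuss when written out, here is one way a parse might be encoded in plain Python. The sentence and its analysis are my own illustration, following the UD v1 convention that the preposition attaches to its noun via case and the noun attaches to the verb via nmod:

```python
# A dependency parse encoded as (head, dependent, label) triples.
# Illustrative sentence: "she sat on the mat" (not from the exercise).
arcs = [
    ("root", "sat", "root"),   # by convention, every parse has a root
    ("sat", "she", "nsubj"),
    ("sat", "mat", "nmod"),    # the PP's noun attaches to the verb (UD v1)
    ("mat", "on", "case"),     # the preposition attaches to its noun
    ("mat", "the", "det"),
]

for head, dep, label in arcs:
    print(f"{label}({head}, {dep})")
```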
Now draw the dependency parse of the sentence *I like green eggs and ham*. You will need to use the cc and conj labels (see examples in JM3, Fig. 15.3). Do you run into any problems? Is it clear what the dependency structure should be? Is the ambiguity in this sentence represented in the dependency structure (or multiple structures), and if so, how?

There are several points worth noticing and discussing, including the following:
- Conjunction is inherently a symmetric relationship, but dependency grammar requires asymmetric relations. So it isn’t a natural fit, and it requires arbitrarily choosing one of the two conjuncts as the head for both the conj and cc relations. In fact, UD v1 and v2 differ on what the head of the cc relation is!
- There is only a single dependency parse for this sentence, even though there are two different meanings. So in this case (unlike with constituency structure), the dependency structure does not reflect the semantic ambiguity.
- You might have considered (or discussed in your group) whether there are alternative guidelines for parsing conjunctions that would reflect the ambiguity, or that have a more symmetric relationship. For example, would it be reasonable to make and the head of the conjoined phrase, and would this solve any of the problems? What might a dependency-like structure look like that better captures the meaning where green modifies both eggs and ham? (Such a structure would have two arcs pointing to green, which isn’t a valid dependency tree; see the check sketched after this list. But some people have proposed that we should really be using dependency graphs rather than trees. These would permit this kind of structure, but they are much more difficult to deal with computationally, e.g. when designing efficient parsing algorithms.)
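To make the tree constraint concrete, here is a small check (a sketch; the function name and index encoding are my own) that a proposed set of arcs forms a valid dependency tree, in the sense that every word has exactly one head. The graph-style analysis in which green has two incoming arcs fails it:

```python
def is_valid_tree(n_words, arcs):
    """arcs: (head, dependent) pairs over word positions 1..n_words,
    with 0 standing for the artificial root."""
    heads = {}
    for head, dep in arcs:
        if dep in heads:   # a word with two heads: a graph, not a tree
            return False
        heads[dep] = head
    # every word needs exactly one head (a full check would also test acyclicity)
    return set(heads) == set(range(1, n_words + 1))

# "green eggs and ham" as positions 1..4, UD v1 style (first conjunct heads cc/conj):
tree = [(2, 1), (0, 2), (2, 3), (2, 4)]   # amod, root, cc, conj
graph = tree + [(4, 1)]                   # also make green a dependent of ham
print(is_valid_tree(4, tree))   # True
print(is_valid_tree(4, graph))  # False
```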
By the way, Green Eggs and Ham happens to be the title of a popular American children’s book by Dr. Seuss. But that doesn’t matter for understanding the issues in this question! I chose the phrase specifically because it’s not obvious what the “correct” interpretation of the ambiguous conjunction is (if there even is one). The main takeaway from this exercise is for you to understand some of the weaknesses of dependency structure.
Consider the following dependency-annotated sentence: *the cat chased a dog*, with arcs (root → chased), (chased → cat), (chased → dog), (cat → the), and (dog → a). (For simplicity, we leave out the relation labels in this exercise.)
By hand-simulating the algorithm for arc-standard transition-based parsing, show that there is more than one sequence of transitions that can lead to the correct parse of this sentence. How does this fact motivate the need for the procedure described in JM3 section 15.4.1 (generating the training oracle)? What is the sequence produced by the training oracle?
Here are two possible sequences (you might find others). The first is the training-oracle sequence, which chooses LeftArc as soon as possible in all cases; the second delays the (cat \(\leftarrow\) chased) arc until after the object has been attached.
Step | Stack | Word list | Action | Relation added |
---|---|---|---|---|
0 | [root] | [the, cat, chased, a, dog] | \(Shift\) | |
1 | [root, the] | [cat, chased, a, dog] | \(Shift\) | |
2 | [root, the, cat] | [chased, a, dog] | \(LeftArc\) | (the \(\leftarrow\) cat) |
3 | [root, cat] | [chased, a, dog] | \(Shift\) | |
4 | [root, cat, chased] | [a, dog] | \(LeftArc\) | (cat \(\leftarrow\) chased) |
5 | [root, chased] | [a, dog] | \(Shift\) | |
6 | [root, chased, a] | [dog] | \(Shift\) | |
7 | [root, chased, a, dog] | [] | \(LeftArc\) | (a \(\leftarrow\) dog) |
8 | [root, chased, dog] | [] | \(RightArc\) | (chased \(\rightarrow\) dog) |
9 | [root, chased] | [] | \(RightArc\) | (root \(\rightarrow\) chased) |
10 | [root] | [] | \(DONE\) | |
Step | Stack | Word list | Action | Relation added |
---|---|---|---|---|
0 | [root] | [the, cat, chased, a, dog] | \(Shift\) | |
1 | [root, the] | [cat, chased, a, dog] | \(Shift\) | |
2 | [root, the, cat] | [chased, a, dog] | \(LeftArc\) | (the \(\leftarrow\) cat) |
3 | [root, cat] | [chased, a, dog] | \(Shift\) | |
4 | [root, cat, chased] | [a, dog] | \(Shift\) | |
5 | [root, cat, chased, a] | [dog] | \(Shift\) | |
6 | [root, cat, chased, a, dog] | [] | \(LeftArc\) | (a \(\leftarrow\) dog) |
7 | [root, cat, chased, dog] | [] | \(RightArc\) | (chased \(\rightarrow\) dog) |
8 | [root, cat, chased] | [] | \(LeftArc\) | (cat \(\leftarrow\) chased) |
9 | [root, chased] | [] | \(RightArc\) | (root \(\rightarrow\) chased) |
10 | [root] | [] | \(DONE\) | |
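If you want to check sequences like these without hand-simulating, a few lines of Python will do it. This is only a sketch of the unlabeled arc-standard system (the function and variable names are my own); it replays a given action sequence and prints each configuration:

```python
def run_arc_standard(words, actions):
    """Replay Shift/LeftArc/RightArc transitions, printing each configuration."""
    stack, buffer, arcs = ["root"], list(words), []
    for step, action in enumerate(actions):
        print(step, stack, buffer, action)
        if action == "Shift":
            stack.append(buffer.pop(0))
        elif action == "LeftArc":    # second-from-top becomes dependent of top
            dependent = stack.pop(-2)
            arcs.append((stack[-1], dependent))
        elif action == "RightArc":   # top becomes dependent of second-from-top
            dependent = stack.pop()
            arcs.append((stack[-1], dependent))
    print(len(actions), stack, buffer, "DONE")
    return arcs

words = ["the", "cat", "chased", "a", "dog"]
oracle_sequence = ["Shift", "Shift", "LeftArc", "Shift", "LeftArc",
                   "Shift", "Shift", "LeftArc", "RightArc", "RightArc"]
print(run_arc_standard(words, oracle_sequence))
# [('cat', 'the'), ('chased', 'cat'), ('dog', 'a'), ('chased', 'dog'), ('root', 'chased')]
```

Replaying the second table’s action sequence yields the same set of arcs in a different order, confirming that distinct transition sequences can produce the same parse.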
The training oracle is needed to define a set of actions that lead to a correct parse and are also as consistent as possible. In other words, when we train a classifier to choose actions, we want the training data (sequences of configurations produced by the training oracle) to be as consistent as possible about which action is taken in a given configuration or partial configuration, because consistent patterns are easier to learn than arbitrary ones.
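The oracle’s decision rule (JM3 section 15.4.1) can be stated in a few lines. Here is a sketch, under the simplifying assumption that the words in the sentence are distinct (real implementations use word positions rather than strings):

```python
def oracle_action(stack, arcs_done, gold_arcs):
    """Choose the next arc-standard action given the gold (head, dependent) arcs.
    The caller stops once the stack is just [root] and the buffer is empty."""
    if len(stack) >= 2:
        top, second = stack[-1], stack[-2]
        if (top, second) in gold_arcs:          # LeftArc as soon as it's correct
            return "LeftArc"
        # RightArc only once every dependent of `top` is attached, because
        # `top` is popped and can never receive dependents afterwards.
        attached = {d for (_, d) in arcs_done}
        if (second, top) in gold_arcs and \
                {d for (h, d) in gold_arcs if h == top} <= attached:
            return "RightArc"
    return "Shift"
```

Driving the simulator above with this rule reproduces the first table exactly, which is why the oracle sequence always takes LeftArc at the earliest opportunity.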
The textbook and this tutorial more or less follow the UD v1 guidelines; UD guidelines have now been updated for v2, so if you look online you may find discrepancies.