Consider the following narrative:
Violent strikes rocked Happyland. A spokesman for the country's Department of Peace said they would meet with the strikers tomorrow. Another spokesman said that this was intended to demonstrate the country's commitment to resolving the dispute.
It contains something like 12 noun phrases, but talks about fewer distinct things
So some of the noun phrases must share referents
NPs in red above are definite referring expressions
Their use presupposes the existence of a unique (and uniquely identifiable) referent
Referring expressions can be embedded in other referring expressions
The country (twice), they, the strikers, another spokesman, this, the dispute are anaphoric expressions
Indefinite NPs usually introduce new referents to the discourse:
A spokesman for the Department of Peace said he would meet with the strikers.
Another spokesman said that this . . .
When an indefinite NP is in the scope of a propositional attitude verb, it need not introduce a particular referent:
Willard wants a sloop (a particular sloop, or any sloop at all?)
Referring expressions with the same referent are said to corefer.
Indefinite NPs can set up referents for subsequent coreferential anaphoric expressions:
There are fairies_i in my garden. The fairies_i/They_i are having a ball
(Here and below we use subscripts to denote distinct referents)
Not all fairies - just the ones in my garden
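One way to picture coreference is to attach a referent index to every mention and group the mentions that share an index. The sketch below is only illustrative: the index labels "m" and "b" for the garden and the ball, and the helper names, are assumptions made for this example, not part of the lecture.

```python
from collections import defaultdict

# Each mention is (surface text, referent index). The index plays the role
# of the subscripts in the example; "m" and "b" are labels invented here.
mentions = [
    ("fairies", "i"),
    ("my garden", "m"),
    ("They", "i"),
    ("a ball", "b"),
]

def coreference_chains(mentions):
    """Group mention texts by referent index, keeping discourse order."""
    chains = defaultdict(list)
    for text, referent in mentions:
        chains[referent].append(text)
    return dict(chains)

def corefer(a, b):
    """Two mentions corefer exactly when they share a referent index."""
    return a[1] == b[1]

chains = coreference_chains(mentions)
print(chains["i"])  # ['fairies', 'They']
```

Each list in `chains` is a coreference chain; a referent mentioned only once simply gets a chain of length one.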
Not all subsequent anaphoric expressions corefer with their antecedent:
There are fairies_i in my garden_m. Other fairies_j live elsewhere_k
"other fairies" ≡ fairies other than fairies_i
"elsewhere" ≡ places other than my garden_m
Pronouns serve as anaphoric expressions when they rely on the previous discourse for their interpretation
Some pronouns have other roles as well:
And some determiners have an anaphoric role:
The expression in the previous discourse that is used to interpret an anaphoric pronoun is called its antecedent.
A definite pronoun corefers with its antecedent.
The antecedent of an indefinite pronoun contributes in a more oblique way.
Reference resolution is the process of determining the referent of a referring expression
Context obviously plays a crucial role in reference resolution
For people, we assume a discourse model
To produce and interpret referring expressions, a system must have methods for
In other words, for each referring expression, it must be able to determine when to
Most approaches to implementing reference resolution distinguish two stages:
English pronouns agree with the number and/or gender of the referent of their antecedent.
- Robin has a new car. It/*She/*They is red
- Robin has a sister. *It/She/*They/*We is well-read
- Robin has three cars. *It/*She/They/*We are all red
As well as the person (but case is determined locally):
- Robin and I/*me were late. *Me/*They/We/I missed the show
- Robin and I/*me were late. The usher wouldn't let *we/*I/us/me in
French pronouns agree with the number and gender of the form of their antecedent
- Voici une pomme. Je me demande si elle/*il/*elles est mûre [feminine form]
- Here's an apple. I wonder if *she/it/*they is ripe [inanimate/neuter referent]
All anaphors, including pronouns, rely on the previous text for all or part of their interpretation.
When that text is in the same sentence, pronominal coreference is subject to binding conditions
And, sometimes, to selectional restrictions based on the verb that governs it
John parked his car in the garage. He had driven it around for hours
I picked up the book and sat in a chair. It broke
We will see how automated approaches to anaphora resolution exploit such constraints
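As a rough illustration of how such constraints can prune candidates, here is a minimal sketch that filters antecedents first by number and gender agreement and then, optionally, by a selectional restriction on the governing verb. The `Entity` fields, the tiny pronoun lexicon and the `OBJECT_CLASSES` table are all hypothetical, hand-written for this example; it also ignores singular "they" and other complications.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Entity:
    text: str
    number: str      # "sg" or "pl"
    gender: str      # "masc", "fem" or "neut"
    sem_class: str   # coarse semantic class, e.g. "person", "vehicle"

# Hypothetical pronoun lexicon: (number, gender); None means unconstrained.
PRONOUNS = {
    "he": ("sg", "masc"),
    "she": ("sg", "fem"),
    "it": ("sg", "neut"),
    "they": ("pl", None),
}

# Hypothetical selectional restrictions: classes a verb accepts as its object.
OBJECT_CLASSES = {"drive": {"vehicle"}}

def agrees(pronoun, entity):
    """True if the pronoun's number and gender match the entity's."""
    number, gender = PRONOUNS[pronoun.lower()]
    return number == entity.number and gender in (None, entity.gender)

def filter_candidates(pronoun, candidates, object_of=None):
    """Keep candidates that pass agreement and, if a verb is given and
    listed in OBJECT_CLASSES, that verb's selectional restriction."""
    kept = [c for c in candidates if agrees(pronoun, c)]
    if object_of in OBJECT_CLASSES:
        kept = [c for c in kept if c.sem_class in OBJECT_CLASSES[object_of]]
    return kept

candidates = [
    Entity("John", "sg", "masc", "person"),
    Entity("his car", "sg", "neut", "vehicle"),
    Entity("the garage", "sg", "neut", "place"),
]
print([c.text for c in filter_candidates("He", candidates)])
print([c.text for c in filter_candidates("it", candidates)])
print([c.text for c in filter_candidates("it", candidates, object_of="drive")])
```

For "He had driven it around for hours", agreement alone leaves two candidates for "it" (the car and the garage); the restriction on what can be driven removes the garage.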
The kinds of strong constraint we've just seen are not always enough to reduce the candidate set for resolution to a single entity
Many different features influence how a listener will resolve a definite pronoun (i.e., what they will take to be its antecedent):
John needed portable web access for his new job. He decided he wanted something classy. Bill went to the Apple store with him. He bought an iPad.
John criticised Bill after he broke his promise
John telephoned Bill after he broke his promise
Louise apologised to Sandra for her response
Louise praised Sandra for her response
The city council denied the demonstrators a permit because they feared violence
The city council denied the demonstrators a permit because they advocated violence
Anaphoric pronoun resolution is a specific instance of the more general problem of coreference resolution
Some of the heuristics enumerated above are relevant for the general case
But other relations also come into play
Some are semantic
And some more superficial
There is a rich history of automatic definite reference and pronoun resolution systems
One particular type of approach fits well with what we've already looked at in this course
The problem is viewed as a simple binary classification task: does a given pair of expressions corefer or not?
As indicated above, different features are appropriate, or at least will be differently weighted, for the general coreference case and the more specific pronominal case
Given a corpus annotated for coreference, to train a model we simply
- Tabulate the value of likely candidate features
- Use logistic regression (see Lecture 20) to train weights for the positive and negative cases.
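The training step might look something like the sketch below. It is only an illustration under stated assumptions: the three features (number agreement, gender agreement, sentence distance) and the six labelled pairs are made-up toy data, not drawn from any annotated corpus, and the weights are fitted with plain batch gradient descent rather than whatever optimiser a real toolkit would use.

```python
import math

def predict(weights, features):
    """P(pair corefers) under the model; weights[0] is the bias term."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, epochs=5000, rate=0.5):
    """Fit weights by batch gradient descent on the mean log loss."""
    n_features = len(examples[0][0])
    weights = [0.0] * (n_features + 1)
    for _ in range(epochs):
        grad = [0.0] * (n_features + 1)
        for features, label in examples:
            error = predict(weights, features) - label
            grad[0] += error
            for k, x in enumerate(features):
                grad[k + 1] += error * x
        weights = [w - rate * g / len(examples) for w, g in zip(weights, grad)]
    return weights

# Made-up toy data: one row per (anaphor, candidate antecedent) pair.
# Features: [number agrees, gender agrees, sentence distance]; label 1 = corefer.
examples = [
    ([1, 1, 0], 1),
    ([1, 1, 1], 1),
    ([1, 0, 0], 0),
    ([0, 1, 1], 0),
    ([1, 1, 3], 0),
    ([0, 0, 2], 0),
]

weights = train(examples)
print([round(w, 2) for w in weights])
print(round(predict(weights, [1, 1, 0]), 2))
```

On this toy data the agreement features end up with positive weights and distance with a negative one, which matches the intuition that agreeing, nearby candidates are the likelier antecedents.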
Note: Following J&M, the subscripts simply index the position of NPs
Compute feature vectors for all possible referring expressions
For pronominal anaphora, we can just choose the most positively scored (or largest positive vs. negative difference) antecedent
For definite referring expressions, choosing among available candidates versus not-in-the-discourse is a bit trickier
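A minimal sketch of this selection step follows. The weights are hypothetical numbers chosen by hand, and using a score threshold to decide that a definite expression has no antecedent in the discourse is just one simple option assumed here for illustration; the lecture does not commit to a particular method.

```python
import math

# Hypothetical weights: [bias, number agrees, gender agrees, sentence distance]
WEIGHTS = [-3.0, 2.0, 3.0, -1.0]

def score(weights, features):
    """Logistic score for one (anaphor, candidate) feature vector."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    return 1.0 / (1.0 + math.exp(-z))

def resolve(candidates, weights, threshold=None):
    """Return the text of the best-scoring candidate antecedent.

    candidates is a list of (text, feature vector) pairs. If a threshold
    is given and even the best score falls below it, return None, meaning
    "no antecedent in the discourse" (an assumption of this sketch).
    """
    if not candidates:
        return None
    best_text, best_features = max(candidates, key=lambda c: score(weights, c[1]))
    if threshold is not None and score(weights, best_features) < threshold:
        return None
    return best_text

# Candidate antecedents for "He" in "He had driven it around for hours".
candidates = [
    ("John", [1, 1, 1]),
    ("his car", [1, 0, 1]),
    ("the garage", [1, 0, 1]),
]
print(resolve(candidates, WEIGHTS))                      # John
print(resolve(candidates[1:], WEIGHTS, threshold=0.5))   # None
```

For pronouns the threshold is left unset, so some antecedent is always chosen; for definite expressions a threshold lets the resolver decline when no candidate scores well.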
There are usable off-the-shelf coreference resolvers for English
http://bllip.cs.brown.edu/papers/ec-eacl09.pdf
There is still room for improvement in both coreference and anaphor resolution methods.
Knowing what expressions corefer and how other expressions relate to their antecedents can improve performance of Language Technology systems.