ANLP 2016
Lecture 24: (Co-)reference

Henry S. Thompson
With input from Johanna Moore and, indirectly, Bonnie Webber and Adrian Brasoveanu
10 November 2016
Creative Commons Attribution-ShareAlike

1. Introducing coreference

Consider the following narrative:

Violent strikes rocked Happyland. A spokesman for the country's Department of Peace said they would meet with the strikers tomorrow. Another spokesman said that this was intended to demonstrate the country's commitment to resolving the dispute.

It contains something like 12 noun phrases

So there must be some shared referents

2. Some terminology

Referring expression
A part of an utterance used to identify or introduce an entity
  • in(to) a discourse (in real language use)
  • in(to) a discourse model (in a theory or implementation)
Referents
are such entities
  • in a model
  • or (imagined to be) in the world
Reference
is the relation between a referring expression and a referent
Coreference
When more than one referring expression is used to refer to the same entity
Anaphora
Reference to, or depending on, a previously introduced entity

3. Definite reference

Violent strikes rocked Happyland. A spokesman for the country's Department of Peace said they would meet with the strikers tomorrow. Another spokesman said that this was intended to demonstrate the country's commitment to resolving the dispute.

NPs in red above are definite referring expressions

Their use presupposes the existence of a unique (and uniquely identifiable) referent

Referring expressions can be embedded in other referring expressions

4. Anaphora

Violent strikes rocked Happyland. A spokesman for the country's Department of Peace said they would meet with the strikers tomorrow. Another spokesman said that this was intended to demonstrate the country's commitment to resolving the dispute.

The country (twice), they, the strikers, another spokesman, this, the dispute are anaphoric expressions

5. Indefinite Reference

Indefinite NPs usually introduce new referents to the discourse:

A spokesman for the Department of Peace said he would meet with the strikers.
Another spokesman said that this . . .

When an indefinite NP is in the scope of propositional attitude verbs

Willard wants a sloop

6. Coreference

Referring expressions with the same referent are said to corefer.

Indefinite NPs can set up referents for subsequent coreferential anaphoric expressions:

There are fairies_i in my garden. The fairies_i/They_i are having a ball

(Here and below we use subscripts, written _i, _j, etc., to denote distinct referents)

Not all fairies - just the ones in my garden

Not all subsequent anaphoric expressions corefer with their antecedent:

There are fairies_i in my garden_m. Other fairies_j live elsewhere_k

"other fairies" ≡ fairies other than fairies_i

"elsewhere" ≡ places other than my garden_m

7. Coreference and Pronouns

Pronouns serve as anaphoric expressions when they rely on the previous discourse for their interpretation

Definite pronouns
He, she, it, they etc.
Indefinite pronouns
One, some, elsewhere, others etc.
  • As in "Some survived the fall, but one broke"

Some pronouns have other roles as well:

And some determiners have an anaphoric role:

The expression from the previous discourse used in interpreting a pronoun used anaphorically is called its antecedent.

A definite pronoun corefers with its antecedent.

The antecedent of an indefinite pronoun contributes in a more oblique way.

8. Reference resolution

Reference resolution is the process of determining the referent of a referring expression

Context obviously plays a crucial role in reference resolution

Situational
The real-world surroundings (physical and temporal) for the discourse
Mental
The knowledge/beliefs of the participants
Discourse
What has been communicated so far

9. Discourse context—discourse model

For people we assume

a discourse model

To produce and interpret referring expressions, a system must have methods for

In other words, for each referring expression, it must be able to determine when to

10. Implementing reference resolution

Most approaches to implementing reference resolution distinguish two stages:

  1. Filter the set of possible referents by appeal to linguistic constraints
  2. Rank the resulting candidates based on some set of heuristics
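
The two stages above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Candidate` record, the `resolve` function, and the toy constraint and scoring functions are hypothetical names introduced here, not part of any particular system.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Candidate:
    text: str       # surface form of the candidate antecedent
    number: str     # "sg" or "pl" (illustrative feature only)
    position: int   # order of mention; larger means more recent


def resolve(candidates: List[Candidate],
            constraints: List[Callable[[Candidate], bool]],
            score: Callable[[Candidate], float]) -> Optional[Candidate]:
    # Stage 1: filter out candidates that violate any hard constraint.
    survivors = [c for c in candidates if all(ok(c) for ok in constraints)]
    if not survivors:
        return None  # nothing in the discourse model fits
    # Stage 2: rank the survivors with a heuristic score and keep the best.
    return max(survivors, key=score)


# Toy discourse (invented for illustration): "Robin has three cars in the garage."
discourse = [
    Candidate("Robin", "sg", 0),
    Candidate("three cars", "pl", 1),
    Candidate("the garage", "sg", 2),
]

# Resolving "they": the constraint is plural number, the heuristic is recency.
they = resolve(discourse, [lambda c: c.number == "pl"], lambda c: c.position)
print(they.text)
```

Keeping the constraints and the scoring function as separate arguments mirrors the split between hard linguistic constraints and softer preferences.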

11. Constraints on pronouns: Feature agreement

English pronouns agree with the number and/or gender of the referent of their antecedent.

  • Robin has a new car. It/*She/*They is red
  • Robin has a sister. *It/She/*They/*We is well-read
  • Robin has three cars. *It/*She/They/*We are all red

As well as the person (but case is determined locally):

  • Robin and I/*me were late. *Me/*They/We/I missed the show
  • Robin and I/*me were late. The usher wouldn't let *we/*I/us/me in

French pronouns agree with the number and gender of the form of their antecedent

  • Voici une pomme. Je me demande si elle/*il/*elles est mûre [feminine form]
  • Here's an apple. I wonder if *she/it/*they is ripe [inanimate/neuter referent]
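
The English agreement judgements above can be reproduced with a small feature check. The sketch below is a toy, assuming a hand-written pronoun lexicon (`PRONOUNS`) and hand-assigned referent features; both are hypothetical and cover only the examples on this slide.

```python
# Hypothetical pronoun lexicon: None means "this feature is not constrained".
PRONOUNS = {
    "it":   {"person": 3, "number": "sg", "gender": "neut"},
    "she":  {"person": 3, "number": "sg", "gender": "fem"},
    "they": {"person": 3, "number": "pl", "gender": None},
    "we":   {"person": 1, "number": "pl", "gender": None},
}


def agrees(pronoun: str, referent: dict) -> bool:
    """True if every constrained feature of the pronoun matches the referent."""
    features = PRONOUNS[pronoun]
    return all(value is None or referent.get(name) == value
               for name, value in features.items())


# Hand-assigned features for the referents in the examples above.
car = {"person": 3, "number": "sg", "gender": "neut"}
sister = {"person": 3, "number": "sg", "gender": "fem"}
three_cars = {"person": 3, "number": "pl", "gender": "neut"}
robin_and_i = {"person": 1, "number": "pl", "gender": None}

for pronoun in PRONOUNS:
    print(pronoun, agrees(pronoun, car))
```

For French, the same check would consult the grammatical gender of the antecedent's form rather than properties of the referent.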

12. Constraints on pronouns: Syntax

All anaphors, including pronouns, rely on the previous text for all or part of their interpretation.

When the text is in the same sentence, pronominal coreference is subject to binding conditions

And, sometimes, to selectional restrictions based on the verb that governs it

John parked his car in the garage. He had driven it around for hours

I picked up the book and sat in a chair. It broke

We will see how automated approaches to anaphora resolution exploit such constraints
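
As a rough illustration of selectional restrictions used as a filter, the sketch below assumes two tiny hand-built tables: `SELECTIONAL` (what a verb requires of an argument) and `NOUN_CLASSES` (semantic classes of nouns). Both tables and their class labels are invented for the two examples above.

```python
# Hypothetical: the semantic class a verb requires of one of its arguments.
SELECTIONAL = {
    ("drive", "object"): "vehicle",
    ("break", "subject"): "breakable",
}

# Hypothetical semantic classes for the nouns in the examples.
NOUN_CLASSES = {
    "car": {"vehicle", "breakable"},
    "garage": {"location"},
    "book": {"readable"},
    "chair": {"furniture", "breakable"},
}


def satisfies(verb: str, role: str, noun: str) -> bool:
    required = SELECTIONAL.get((verb, role))
    if required is None:
        return True  # no known restriction, so nothing is ruled out
    return required in NOUN_CLASSES.get(noun, set())


def filter_candidates(verb: str, role: str, nouns: list) -> list:
    """Keep only the candidate antecedents the verb can accept in this role."""
    return [n for n in nouns if satisfies(verb, role, n)]


# "He had driven it around for hours": "it" is the object of "drive".
print(filter_candidates("drive", "object", ["car", "garage"]))
# "It broke": "it" is the subject of "break".
print(filter_candidates("break", "subject", ["book", "chair"]))
```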

13. Constraints aren't enough

The kind of strong constraints we've just seen are not always enough to reduce the candidate set for resolution to a single entity

14. Heuristics for pronoun interpretation

Many different features influence how a listener will resolve a definite pronoun (i.e., what they will take to be its antecedent):

Recency
The most recently introduced entity is a better candidate
  • First Robin bought a phone, and then a tablet. Kim is always borrowing it
Grammatical role
Some grammatical roles (e.g., SUBJECT) are felt to be more salient than others (e.g., OBJECT)
  • Bill went to the pub with John. He bought the first round
  • "John" is more recent, but "Bill" is more salient.

15. More heuristics

Repeated mention
A repeatedly-mentioned entity is likely to be mentioned again
  • John needed portable web access for his new job. He decided he wanted something classy. Bill went to the Apple store with him. He bought an iPad.
  • "Bill" is the previous subject, but the repeated mentions of "John" tip the balance.
Parallelism
Parallel syntactic constructs can create an expectation of coreference in parallel positions
  • Susan went with Alice to the cinema. Carol went with her to the pub
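
One simple way to combine recency, grammatical role and repeated mention is a weighted sum. The sketch below is an assumption-laden toy: the weights, the role salience values and the `Entity` record are all invented so that the three examples above come out with the preferred reading, and parallelism is not modelled.

```python
from dataclasses import dataclass

# Hypothetical salience values and weights, chosen by hand for illustration.
ROLE_SALIENCE = {"subject": 1.0, "object": 0.5, "other": 0.25}
W_RECENCY, W_ROLE, W_MENTIONS = 1.0, 2.0, 0.5


@dataclass
class Entity:
    name: str
    distance: int   # 1 = most recently mentioned candidate, 2 = the one before
    role: str       # grammatical role of its most recent mention
    mentions: int   # number of mentions so far


def salience(e: Entity) -> float:
    return (W_RECENCY / e.distance
            + W_ROLE * ROLE_SALIENCE[e.role]
            + W_MENTIONS * e.mentions)


def preferred(entities: list) -> Entity:
    return max(entities, key=salience)


# "First Robin bought a phone, and then a tablet. Kim is always borrowing it"
print(preferred([Entity("phone", 2, "object", 1),
                 Entity("tablet", 1, "object", 1)]).name)
# "Bill went to the pub with John. He bought the first round"
print(preferred([Entity("Bill", 2, "subject", 1),
                 Entity("John", 1, "other", 1)]).name)
# "... Bill went to the Apple store with him. He bought an iPad."
print(preferred([Entity("Bill", 2, "subject", 1),
                 Entity("John", 1, "other", 5)]).name)
```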

16. Heuristics, concluded

Verb semantics
A verb may serve to foreground one of its argument positions for subsequent reference because of its semantics
  • John criticised Bill after he broke his promise
  • vs.
  • John telephoned Bill after he broke his promise
  • Louise apologised to Sandra for her response
  • vs.
  • Louise praised Sandra for her response
World knowledge
At the end of the day, sometimes only one reading makes sense:
  • The city council denied the demonstrators a permit because they feared violence
  • vs.
  • The city council denied the demonstrators a permit because they advocated violence

17. Coreference: A more general case

Anaphoric pronoun resolution is a specific instance of the more general problem of coreference resolution

Some of the heuristics enumerated above are relevant for the general case

But other relations also come into play

Some are semantic

Hyponymy
That is, subclassing
  • Megabuck PLC announces its 3rd quarter results today. The company is expected . . .
Meronymy
That is, part-whole relations
  • I had to take my car into the shop today. The brakes were squeaking really badly
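
The two semantic relations above can be sketched as lookups in a small knowledge base. The `HYPERNYM` and `PART_OF` tables below are hypothetical and hand-built for the two examples; a real system would draw on a much larger lexical resource.

```python
# Hypothetical toy knowledge base covering only the two examples above.
HYPERNYM = {"Megabuck PLC": "company", "company": "organisation"}
PART_OF = {"brakes": "car", "engine": "car"}


def is_a(term, category) -> bool:
    """Follow hypernym links upwards from term, looking for category."""
    while term is not None:
        if term == category:
            return True
        term = HYPERNYM.get(term)
    return False


def link(head: str, earlier: list):
    """Relate the head noun of a definite NP to an earlier entity, if possible.

    Returns (relation, antecedent), or None when no relation is known.
    """
    for entity in reversed(earlier):       # try the most recent entity first
        if is_a(entity, head):             # hyponymy: the entity is a kind of head
            return ("coreference", entity)
        if PART_OF.get(head) == entity:    # meronymy: head is a part of the entity
            return ("part-of", entity)
    return None


print(link("company", ["Megabuck PLC", "results"]))
print(link("brakes", ["car", "shop"]))
```

Note that in the meronymy case the definite NP does not corefer with the earlier entity; it is merely interpreted relative to it.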

And some more superficial

18. Automatic methods

There is a rich history of automatic definite reference and pronoun resolution systems

One particular type of approach fits well with what we've already looked at in this course

The problem is viewed as a simple binary classification task

As indicated above, different features are appropriate, or at least will be differently weighted, for the general coreference case and the more specific pronominal case

19. Using a gold standard

Given a corpus annotated for coreference, to train a model we simply

Tabulate the value of likely candidate features

Use logistic regression (see Lecture 20) to train weights for the positive and negative cases.

Note: Following J&M, the subscripts simply index the position of NPs
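
A minimal sketch of the training step, assuming each (anaphor, candidate antecedent) pair has already been turned into a numeric feature vector with a 0/1 label from the annotation. The feature set (bias, number agreement, gender agreement, sentence distance) and the five training rows are invented for illustration and do not come from any corpus; the training loop is plain stochastic gradient ascent on the log-likelihood, written out by hand rather than taken from a library.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights: list, features: list) -> float:
    """Estimated probability that the pair corefers."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)))


def train(examples: list, epochs: int = 500, rate: float = 0.5) -> list:
    weights = [0.0] * len(examples[0][0])
    for _ in range(epochs):
        for features, label in examples:
            error = label - predict(weights, features)
            # Move each weight in the direction that reduces the error.
            weights = [w + rate * error * x for w, x in zip(weights, features)]
    return weights


# Invented toy data: [bias, number agrees, gender agrees, sentence distance]
examples = [
    ([1, 1, 1, 0], 1),
    ([1, 1, 1, 1], 1),
    ([1, 0, 1, 0], 0),
    ([1, 1, 0, 1], 0),
    ([1, 0, 0, 2], 0),
]

weights = train(examples)
print([round(w, 2) for w in weights])
```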

20. Using the model

Compute feature vectors for all possible referring expressions

For pronominal anaphora, we can just choose the most-positively scored (or largest positive vs. negative difference) antecedent

For definite referring expressions, choosing among available candidates versus not-in-the-discourse is a bit trickier
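
The selection step might look like the sketch below. The function name `choose_antecedent` and the scores are hypothetical, and the optional threshold is just one simple way to allow a "not in the discourse" outcome for definite referring expressions; it is an assumption of this sketch, not the method prescribed here.

```python
def choose_antecedent(scored: list, threshold: float = None):
    """Pick the best-scoring candidate from (candidate, probability) pairs.

    With no threshold (the pronoun case) the top candidate is always returned.
    With a threshold, None is returned when even the best candidate scores
    below it, i.e. the expression is treated as introducing a new referent.
    """
    if not scored:
        return None
    best, probability = max(scored, key=lambda pair: pair[1])
    if threshold is not None and probability < threshold:
        return None
    return best


# Pronoun: always take the most positively scored candidate.
print(choose_antecedent([("Bill", 0.3), ("John", 0.7)]))
# Definite NP: no candidate is convincing enough, so no antecedent is chosen.
print(choose_antecedent([("the strikes", 0.2)], threshold=0.5))
```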

21. Conclusion

There are usable off-the-shelf coreference resolvers for English

There is still room for improvement in both coreference and anaphor resolution methods.

Knowing what expressions corefer and how other expressions relate to their antecedents can improve performance of Language Technology systems.