Inf1 CogSci 2012: Lecture 19: Artificial neural networks, child development and the 'U-shaped curve'
Mark McConville
Henry S. Thompson
6 March 2012
1. Parallel distributed processing
Starting in the 1980s, Rumelhart and McClelland promoted the multilayer
feed-forward perceptron
- as the basis of an architecture for cognitive modelling
- under the name Parallel Distributed Processing
- or PDP for short
One of their early targets was...
- Regular and irregular past tenses in English!
And they didn't even use hidden units
2. McClelland and Rumelhart and English past tense verbs
Their model was pretty radical
- no lexicon of words
- no rules
- just a two-level fully-connected feed-forward perceptron network
The input to the network is the verb's base form, e.g. [dans],
[sInk]
The output from the network should be the past tense form, i.e.
[danst], [sank]
![[no description, sorry]](../17/mNr.jpg)
3. McClelland and Rumelhart: the details
The design of the input and output was crucial
M&R followed Chomsky and Halle, and used features
- Triples of sets of features, in fact
- Known as Wickelfeatures
- See Words and Rules for a discussion of where these
came from
- So, covering three phonemes in a row
- For example
St-Hv-St
- or
N-St-]
- For nasal+stop+word-final
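The idea of covering three phonemes in a row can be sketched in a few lines of Python. This is only an illustration: it builds triples of phoneme symbols rather than the triples of feature sets M&R actually used, the `#` word-boundary marker is an assumption, and the function name `phoneme_triples` is hypothetical.

```python
def phoneme_triples(phonemes, boundary="#"):
    """Return every run of three adjacent symbols, padded with a boundary marker."""
    padded = [boundary] + list(phonemes) + [boundary]
    return [tuple(padded[i:i + 3]) for i in range(len(padded) - 2)]


# The base form [sInk], written with one character per phoneme.
for triple in phoneme_triples("sInk"):
    print("-".join(triple))
```

Each phoneme ends up in the middle of exactly one triple, so the word is described by an unordered set of local contexts rather than by a left-to-right string.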
Both input and output were labelled with 460 different such triples
- 211600 connections (and weights)
![[no description, sorry]](../17/mNr2.jpg)
They trained it with 420 input/output pairs
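As a rough sketch of what such a network looks like, the Python below wires 460 input units directly to 460 output units and trains them on a single pair. Several things here are assumptions made for illustration, not details from M&R: the units are deterministic threshold units, the update is a plain perceptron-style rule, and the two patterns are random stand-ins for real encodings of a stem and its past tense form.

```python
import random

N_UNITS = 460  # number of triples labelling the input and the output


def predict(weights, thresholds, x):
    """Switch an output unit on when its weighted input exceeds its threshold."""
    return [
        int(sum(w * xi for w, xi in zip(row, x)) > theta)
        for row, theta in zip(weights, thresholds)
    ]


def train_step(weights, thresholds, x, target, rate=1.0):
    """Perceptron-style update: change weights only where the output is wrong."""
    output = predict(weights, thresholds, x)
    for j, (t, o) in enumerate(zip(target, output)):
        error = t - o  # +1, 0 or -1 for this output unit
        if error:
            thresholds[j] -= rate * error
            for i, xi in enumerate(x):
                weights[j][i] += rate * error * xi


weights = [[0.0] * N_UNITS for _ in range(N_UNITS)]
thresholds = [0.0] * N_UNITS

# Made-up sparse binary patterns standing in for one stem/past-tense pair.
rng = random.Random(0)
stem = [int(rng.random() < 0.05) for _ in range(N_UNITS)]
past = [int(rng.random() < 0.05) for _ in range(N_UNITS)]

for _ in range(5):
    train_step(weights, thresholds, stem, past)

n_connections = sum(len(row) for row in weights)
print(n_connections)                               # 211600
print(predict(weights, thresholds, stem) == past)  # True
```

There is one weight for every input/output pairing and nothing in between, which is where the 460 x 460 = 211600 figure comes from.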
4. And the result was...
After the 84,000 training
iterations, the settled network
worked well for almost all the 420
verbs in the training set
It performed adequately for a
separate test set of 86 other
verbs
- 3/4 of regular verb stems were
assigned the correct past
tense form
- Most irregular verb stems
were assigned
overgeneralised regular past
tense forms (e.g. "digged",
"catched")
During training, the network
behaved in an impressively childlike
manner
- After a period of outputting
"gave" correctly, it shifted to
the incorrect "gived"
- It learned to be reluctant to
stick -ed on a verb stem
ending in [t] or [d]
- It made lots of characteristic
childlike errors, e.g.
cling/clang, sip/sept
- In other words, M&R's model appears to
capture stem-stem similarity
More after we recapitulate the human data
5. Overregularisations
Young children make errors with
past tense forms.
In particular, they overregularise
Children appear not only to learn rules, but also
to overapply them.
However, the patterns of error that children make with past
tense forms are actually highly complex
- the appearance of simplicity is deceptive
- any model that attempts to give an explanation needs to
focus on the details.
We now shift from static language description to development:
- how do children
learn regular and irregular past tense verb forms?
6. Stages of language acquisition
Around 18 months, children start to produce
two-word microsentences
- e.g. "See baby!", "More cereal!"
Some of these are original productions, rather
than telegraphic renditions of their parents' speech, e.g.
- "Allgone sticky!" (i.e. "My hands are clean")
- "Circle toast!" (i.e. "I want a bagel")
Around 2 years, children start producing longer, more
complicated sentences.
Simultaneously, they start to use grammatical morphemes
- inflectional suffixes, e.g. -ed, -s, -ing
- auxiliary verbs, e.g. have, be, do, will
7. Stages of language acquisition (ctd.)
Around 3 years, children start to make errors, by attaching
the past tense suffix -ed to irregular verb stems (e.g. "singed",
"bleeded").
Around the same time, they start to pass the "wug" test, in
experimental settings:
- e.g. rick - ricked, bing - binged
Children don't just make errors adding -ed to irregular verb
stems
- they attach it to irregular past tense forms, e.g. "broked", "ated"
- they attach it to their own neologisms, e.g. "pooked",
"lightninged", "spidered"
- they attach it to past tense forms that already have a
suffix, e.g. "pukeded", "presseded"
8. Overzealous grammarians
Children don't just overgeneralise from regular past tense
forms
- they overuse the plural suffix -s, e.g. "mans", "foots",
"tooths", "mouses"
- they overuse the regular third person singular suffix -s,
e.g. "haves", "do's", "be's"
- they overuse the comparative and superlative suffixes -er
and -est, e.g. "specialer", "powerfullest", "gooder"
- they overuse the regular ordinal suffix -th on numerals,
e.g. "oneth", "twoth", "threeth"
Children are overzealous grammarians, finding
regularity in the oddest places
- Parent: No booze in the house!
- Child: What's a "boo"?
- Kid: "I don't care who the Yankees are going to verse!"
- [From "the Yankees versus the Red Sox"]
Note that adults occasionally do the same thing
- e.g. misanalysing "kudos" as a plural noun.
9. U-shaped development
With almost everything, children's performance gets better as
they get older.
However, inflectional morphology appears to be different
- children appear to get worse before getting better.
This is what child psychologists call U-shaped
development.
Stage 1: children produce both regular and irregular past
tense forms with very few errors.
Stage 2: after a certain amount of time, the error rate
appears to increase significantly
- children start to make more and more errors, adding the
regular past tense suffix -ed to irregular verb stems
- even with verbs whose past tense forms they had
previously mastered.
Stage 3: the error rate slowly decreases, as the child gets
older, until almost no errors are made.
U-shaped development fascinates child psychologists
- the sudden deterioration in performance appears to be
evidence for mental reorganisation
- the child has inferred a new generalisation involving
previously unrelated concepts
- e.g. that the -ed suffix on a verb signifies
"pastness".
10. Children versus adults
Stage 2 children and adults both have internalised the rule
which says "add -ed to form the past tense"
- but only children make overregularisation errors like
"bleeded" and "singed".
Therefore:
- Adults must have something that stage 2 children lack.
- Children must gradually attain whatever that thing is, as
they grow older.
First guess:
- Adults are able to communicate their thoughts more
clearly than children.
- Children improve their past tense verb performance by slowly
learning to communicate more clearly.
This theory is clearly wrong:
- There is nothing intrinsically unclear about
overregularised forms like "bleeded" and "singed"
- Many correct past tense forms are identical
to the verb stem, and hence are less clear than the
overregularised form, e.g. "hit", "cut", "put", "set"
Second guess:
- Adults don't say "bleeded" and "singed" because they
don't hear other adults saying these words.
But: adults are not just parrots
- they are capable of producing regular past tense forms
they have never heard before (e.g. "moshed", "Borked",
"faxed")
- so there is no reason why they can't do the same with
"bleeded" or "singed".
Third guess: Adults have learned the blocking principle
- The memorised "sang" blocks the past-tense suffixation
rule from applying to "sing"
The challenge for any cognitive science model of language processing is to actually demonstrate that it in some
way predicts blocking
11. Blocking
Why don't adults overgeneralise the -ed suffix to irregular
verb stems?
- not because they haven't heard the
overregularised forms (i.e. "bleeded" or "singed")
- but because they have heard the irregular forms
(i.e. "bled" and "sang").
Some component of adult psychology means that the experience
of having heard the irregular forms "bled" and "sang"
inhibits the application of the -ed rule to create
"bleeded" and "singed".
Previously, we called this "blocking"
- the existence of "bled" and "sang" in the mental lexicon blocks the
production of "bleeded" and "singed".
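Blocking as described here amounts to "look up first, apply the rule only if nothing is found". A minimal Python sketch, assuming a toy lexicon and a spelling-only version of the -ed rule (the names `IRREGULAR_PAST`, `regular_past` and `past_tense` are all hypothetical):

```python
# Hypothetical toy mental lexicon of memorised irregular past tense forms.
IRREGULAR_PAST = {"bleed": "bled", "sing": "sang", "hit": "hit"}


def regular_past(stem):
    """Apply the 'add -ed' rule in spelling; stems ending in 'e' just take 'd'."""
    return stem + "d" if stem.endswith("e") else stem + "ed"


def past_tense(stem, lexicon=IRREGULAR_PAST):
    """A stored irregular form blocks the rule; otherwise the rule applies."""
    if stem in lexicon:
        return lexicon[stem]
    return regular_past(stem)


print(past_tense("sing"))              # sang
print(past_tense("mosh"))              # moshed
print(past_tense("sing", lexicon={}))  # singed
```

The last call shows what happens when the stored form is unavailable: the rule applies unopposed and an overregularised form comes out.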
Two competing hypotheses:
- Children lack the blocking mechanism and
have to learn it.
- Children have the blocking principle but lack the
relevant experience needed to use it
effectively.
12. Learning how to block
How could a child learn the blocking principle from scratch?
They would need to learn explicitly that
overregularised forms like "bleeded" and "singed" are
ungrammatical.
- it is not enough that they haven't heard these forms
- they need to have negative evidence to solve the problem
- i.e. information about what is not in the language.
In other words, children would need to receive negative
feedback from somewhere whenever they used an incorrect past
tense form
- an explicit correction
- or some indirect signal of disapproval (a frown, a
puzzled look, a slap)
- A failure to achieve some non-linguistic goal
However, there is no evidence that negative feedback has any
effect on children's language acquisition.
The child psychologist Karin Stromswold has a dramatic example
of this
- a boy who has never talked (for some unknown neurological reason)
- but who has nonetheless learned the blocking principle
for past tense verb forms
- this child cannot have learned this based on
parental correction of his own errors.
13. Blocking as innate knowledge
Maybe the blocking principle is part of innate linguistic
knowledge
- so children do not need to learn it from scratch.
Children don't learn the blocking principle from
evidence that "singed" is not in English
- rather they deduce that "singed" is not in English from
the blocking principle.
Why do adults use blocking more effectively than children are
able to?
- Because adults have more experience than children.
- In particular, they have heard irregular past tense verb
forms being used more often than children have.
- And memory retrieval improves through repetition.
Stage 2 children and adults are actually very similar
- they both have access to the innate blocking principle
- they both have irregular verb forms stored in the mental
lexicon
The difference is that adults can retrieve the
irregular verb forms from memory more quickly, and hence blocking
is more likely to happen.
In other words, children are "little adults with bad memories".
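The "bad memory" account can be sketched as a simulation in which blocking only happens when the irregular form is retrieved in time. Everything quantitative below is an assumption made for illustration: the retrieval curve `exposures / (exposures + half_point)`, the value of `half_point`, and the exposure counts are not taken from any data.

```python
import random


def retrieval_probability(exposures, half_point=5):
    """Assumed curve: the chance of retrieving a stored form rises with exposure."""
    return exposures / (exposures + half_point)


def produce_past(stem, irregular, exposures, rng):
    """Use the stored irregular form if it is retrieved in time, else add -ed."""
    if rng.random() < retrieval_probability(exposures):
        return irregular   # retrieval succeeded, so the rule is blocked
    return stem + "ed"     # retrieval failed, so the rule applies


def error_rate(exposures, trials=10_000, seed=0):
    """Proportion of simulated productions of 'sing' that come out as 'singed'."""
    rng = random.Random(seed)
    errors = sum(
        produce_past("sing", "sang", exposures, rng) == "singed"
        for _ in range(trials)
    )
    return errors / trials


for n in (5, 50, 500):
    print(n, round(error_rate(n), 3))
```

The speaker with few exposures and the speaker with many run exactly the same procedure; only the retrieval probability differs, which is the sense in which children are "little adults".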
14. Little adults with bad memories?
Analysis of spontaneous conversations between children and
their parents supports the "little adults with bad memories"
theory.
Stage 2 children's average error rate for past tense verb
forms is actually surprisingly low
- only 4% of past tense forms produced by a child are incorrect
- adults tend to overestimate the error rate, since errors
are more noteworthy than correct forms.
No irregular verb is immune to overregularisation errors
- even verbs whose past tense form the child has already
produced correctly
Similarly, no verb is consistently erred on.
This suggests that children are not ignorant of
the correct irregular past tense forms
- they are just fallible when it comes to retrieving them.
The key factor appears to be parental input
- the more the parents use an irregular past tense verb
form, the less likely the child is to overregularise the
verb.
Children appear to know that their errors are errors
- they will correct an adult who uses an incorrect form,
even if they have just used that form themselves.
So, children appear to know the correct irregular past tense forms
- errors are runtime slip-ups caused by the fact that the
correct form cannot always be retrieved in real time.
15. Rule learning
What triggers the mental transition from stage 1 to stage 2?
- i.e. from making no over-regularisation errors to making some.
Simplest theory - this is the point at which the child has
learned the relevant rule.
Evidence shows that stage 1 children often use the base form
of a verb for both the present and past tense
- e.g. "Yesterday we walk"
- the transition from using the stem to using the regular
past tense form (i.e. "Yesterday we walked") coincides with the
start of over-regularisation.
Also, the data suggests that, during the transition to stage
2, it is not necessarily the case that over-regularisations
(e.g. "singed") are driving out correct forms (i.e. "sang")
- rather errors of commission (e.g. "Yesterday we singed")
are, as often as not, driving out errors of omission
(i.e. "Yesterday we sing").
- so focusing on the rate of over-regularisations by itself
may actually be misleading (rather than say the combined rate
of errors of omission and commission).
Finally, evidence from language development in identical and
fraternal twins suggests that the timing of the transition from
stage 1 to stage 2 is simply a matter of chance
- when one twin starts to over-regularise, an identical
twin is no more likely to follow suit than a fraternal
twin.
16. Pattern association memory
Maybe a rule is not necessary to explain over-regularisations
in children's acquisition of inflectional morphology
- maybe they simply analogise from verbs they already
know
- e.g. from correct forms like "folded", "molded", "scolded"
to over-regularisations like "holded".
This is the basis of Rumelhart and McClelland's pattern
association memory model which we discussed last week
- the network learned hundreds of regular and irregular
past tense verb forms, using a simple two-layer perceptron
network
- it seemed to follow the U-shaped curve during
training.
How did R&M get their neural network to learn in such an
authentically child-like manner?
17. Discontinuous training
R&M made use of a couple of empirical observations:
- children learn common verbs first, and rarer verbs later
- i.e. they tend to learn irregular verbs before regular ones, since the most common English verbs are mostly irregular
- children's vocabulary grows very quickly all of a sudden,
a few months after they start learning words, i.e. at some
point they get a huge spurt of regular verbs.
With this in mind, R&M trained their network in a slightly extreme way:
- they first trained it on just 10 verbs, all at once, 8 of
which were irregular
- they then trained it on a further 410 verbs, again all at
once, 80% of which were regular.
Since pattern association memories are highly sensitive to
changes in the statistics of their input, it is not surprising
that error rates increase dramatically at the start of the
second training phase, before recovering gradually.
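The size of the change in input statistics can be worked out from the figures above. The arithmetic below uses only those figures; treating the second phase as also containing the original 10 verbs is an assumption made for the combined figure.

```python
phase1_total, phase1_irregular = 10, 8
phase2_total, phase2_regular_share = 410, 0.80

phase1_regular = phase1_total - phase1_irregular
phase2_regular = round(phase2_total * phase2_regular_share)

# Share of regular verbs the network sees in each phase.
phase1_share = phase1_regular / phase1_total
phase2_share = phase2_regular / phase2_total

# Assumption: all 420 verbs are presented together in the second phase.
combined_share = (phase1_regular + phase2_regular) / (phase1_total + phase2_total)

print(f"phase 1: {phase1_share:.0%} regular")
print(f"new verbs in phase 2: {phase2_share:.0%} regular")
print(f"all 420 verbs: {combined_share:.1%} regular")
```

The proportion of regular verbs jumps from a fifth to roughly four fifths in a single step, so the pull towards -ed suddenly dominates the weights.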
18. Vocabulary expansion as trigger
There is one key question that we need to pose in order to
evaluate the R&M approach to learning:
- Do children begin to over-generalise in response to a
sudden influx of regular verbs?
Data from spontaneous conversations involving children shows
no evidence for this:
- a child's vocabulary appears to spurt about a year too
soon to account for the onset of over-regularisation
- i.e. mid-to-late ones versus mid-to-late twos
- when children start to make over-regularisation errors,
new regular verbs are actually coming in more
slowly than they were previously.
R&M's training regime is extremely fragile
- it requires a very particular combination of input
factors to mimic the child-like learning curve.
In reality, language learning is a fairly robust process
- children exhibit very similar learning curves, even with
vastly different patterns of data to learn from
- for example, children appear to learn English plural
inflection in a very similar way to past tense inflection,
despite the relative proportions of irregular to regular forms
being vastly different in the two cases.