Inf1 CogSci 2012: Lecture 19: Artificial neural networks, child development and the 'U-shaped curve'

Mark McConville
Henry S. Thompson
6 March 2012
Creative Commons Attribution-ShareAlike

1. Parallel distributed processing

Starting in the 1980s, Rumelhart and McClelland promoted the multilayer feed-forward perceptron

One of their early targets was. . .

And they didn't even use hidden units

2. McClelland and Rumelhart and English past tense verbs

Their model was pretty radical

The input to the network is the verb's base form, e.g. [dans], [sInk]

The output from the network should be the past tense form, i.e. [danst], [sank]

3. McClelland and Rumelhart: the details

The design of the input and output was crucial

M&R followed Chomsky and Halle, and used features

Both input and output were labelled with 460 different such triples

They trained it with 420 input/output pairs
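
The idea of a network with no hidden units that maps an input pattern directly to an output pattern can be sketched as follows. This is only an illustration: the bit vectors are made-up toy patterns (not the feature triples used in the real model), the names `PatternAssociator` and `pairs` are hypothetical, and simple threshold units with the perceptron learning rule stand in for whatever units and learning procedure the original model used.

```python
class PatternAssociator:
    """One layer of weights from input bits to output bits (no hidden units)."""

    def __init__(self, n_in, n_out):
        self.weights = [[0] * n_in for _ in range(n_out)]
        self.biases = [0] * n_out

    def predict(self, x):
        # Each output unit fires (1) if its weighted input sum exceeds zero.
        return [
            1 if sum(w * xi for w, xi in zip(row, x)) + b > 0 else 0
            for row, b in zip(self.weights, self.biases)
        ]

    def train_pair(self, x, target):
        # Perceptron rule: adjust weights only for output units that are wrong.
        out = self.predict(x)
        for j, (o, t) in enumerate(zip(out, target)):
            err = t - o
            if err != 0:
                for i, xi in enumerate(x):
                    self.weights[j][i] += err * xi
                self.biases[j] += err

    def train(self, pairs, epochs):
        for _ in range(epochs):
            for x, target in pairs:
                self.train_pair(x, target)


# Toy input/output pairs (assumption: arbitrary bit patterns, chosen so that
# each output bit simply copies one input bit and is therefore learnable).
pairs = [
    ([1, 0, 0, 1], [1, 0, 1]),
    ([0, 1, 0, 1], [0, 1, 1]),
    ([0, 0, 1, 0], [0, 0, 0]),
    ([1, 1, 0, 0], [1, 1, 0]),
]

net = PatternAssociator(n_in=4, n_out=3)
net.train(pairs, epochs=50)
```

After training, `net.predict(x)` reproduces the target pattern for each toy pair. The real task is far larger, but the architecture is the same in outline: input features, one layer of trainable weights, output features.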

4. And the result was. . .

After the 84,000 training iterations, the settled network worked well for almost all the 420 verbs in the training set

It performed adequately for a separate test set of 86 other verbs

During training, the network behaved in an impressively childlike manner

More after we recapitulate the human data

5. Overregularisations

Young children make errors with past tense forms.

In particular, they overregularise

Children appear not only to learn rules, but also to overapply them.

However, the pattern of errors that children make with past tense forms is actually highly complex

We now shift from static language description to development:

6. Stages of language acquisition

Around 18 months, children start to produce two-word microsentences

Some of these are original productions, rather than telegraphic renditions of their parents' speech, e.g.

Around 2 years, children start producing longer, more complicated sentences.

Simultaneously, they start to use grammatical morphemes

7. Stages of language acquisition (ctd.)

Around 3 years, children start to make errors, by attaching the past tense suffix -ed to irregular verb stems (e.g. "singed", "bleeded").

Around the same time, they start to pass the "wug" test, in experimental settings:

Children don't just make errors adding -ed to irregular verb stems

8. Overzealous grammarians

Children don't just overgeneralise from regular past tense forms

Children are overzealous grammarians, finding regularity in the oddest places

Note that adults occasionally do the same thing

9. U-shaped development

With almost everything, children's performance gets better as they get older.

However, inflectional morphology appears to be different

This is what child psychologists call U-shaped development.

Stage 1: children produce both regular and irregular past tense forms with very few errors.

Stage 2: after a certain amount of time, the error rate appears to increase significantly

Stage 3: the error rate slowly decreases, as the child gets older, until almost no errors are made.

U-shaped development fascinates child psychologists

10. Children versus adults

Stage 2 children and adults both have internalised the rule which says "add -ed to form the past tense"


First guess:

This theory is clearly wrong:

Second guess:

But: adults are not just parrots

Third guess: Adults have learned the blocking principle

The challenge for any cognitive science model of language processing is to actually demonstrate that it in some way predicts blocking

11. Blocking

Why don't adults overgeneralise the -ed suffix to irregular verb stems?

Some component of adult psychology means that the experience of having heard the irregular forms "bled" and "sang" inhibits the application of the -ed rule to create "bleeded" and "singed".

Previously, we called this "blocking": the existence of "bled" and "sang" in the mental lexicon blocks the production of "bleeded" and "singed".

Two competing hypotheses:

12. Learning how to block

How could a child learn the blocking principle from scratch?

They would need to learn explicitly that overregularised forms like "bleeded" and "singed" are ungrammatical.

In other words, children would need to receive negative feedback from somewhere whenever they used an incorrect past tense form

However, there is no evidence that negative feedback has any effect on children's language acquisition.

The child psychologist Karin Stromswold has a dramatic example of this

13. Blocking as innate knowledge

Maybe the blocking principle is part of innate linguistic knowledge

Children don't learn the blocking principle from evidence that "singed" is not in English

Why do adults use blocking more effectively than children are able to?

Stage 2 children and adults are actually very similar

The difference is that adults can retrieve the irregular verb forms from memory more quickly, and hence blocking is more likely to happen.

In other words, children are "little adults with bad memories".
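
The "little adults with bad memories" idea can be sketched in a few lines of code. This is a toy illustration, not a model from the lecture: the lexicon `IRREGULAR_PAST`, the parameter `retrieval_prob` and both function names are hypothetical, and the rule simply appends "ed" to the spelling of the stem.

```python
import random

# Toy mental lexicon of stored irregular past tense forms (illustrative only).
IRREGULAR_PAST = {"sing": "sang", "bleed": "bled"}


def past_tense(stem, retrieval_prob, rng):
    """Return a past tense form, with blocking that depends on memory retrieval.

    retrieval_prob is an assumed probability that a stored irregular form is
    retrieved in time to block the regular rule.
    """
    stored = IRREGULAR_PAST.get(stem)
    if stored is not None and rng.random() < retrieval_prob:
        return stored  # retrieval succeeded: the stored form blocks the rule
    return stem + "ed"  # otherwise the default "add -ed" rule applies


def overregularisation_rate(stem, retrieval_prob, trials=1000, seed=0):
    """Fraction of trials on which the rule output (e.g. "singed") is produced."""
    rng = random.Random(seed)
    errors = sum(
        past_tense(stem, retrieval_prob, rng) == stem + "ed" for _ in range(trials)
    )
    return errors / trials
```

With `retrieval_prob` at 1.0 the sketch always produces "sang"; lowering it produces occasional "singed" even though "sang" is stored. On this view the only difference between a child and an adult is the value of that one parameter, not the knowledge in the lexicon or the rule.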

14. Little adults with bad memories?

Analysis of spontaneous conversations between children and their parents supports the "little adults with bad memories" theory.

Stage 2 children's average error rate for past tense verb forms is actually surprisingly low

No irregular verb is immune to overregularisation errors

Similarly, no verb is consistently erred on.

This suggests that children are not ignorant of the correct irregular past tense forms

The key factor appears to be parental input

Children appear to know that their errors are errors

So, children appear to know the correct irregular past tense forms

15. Rule learning

What triggers the mental transition from stage 1 to stage 2?

Simplest theory: this is the point at which the child has learned the relevant rule.

Evidence shows that stage 1 children often use the base form of a verb for both the present and past tense

Also, the data suggests that, during the transition to stage 2, it is not necessarily the case that over-regularisations (e.g. "singed") are driving out correct forms (e.g. "sang")

Finally, evidence from language development in identical and fraternal twins suggests that the timing of the transition from stage 1 to stage 2 is simply a matter of chance

16. Pattern association memory

Maybe a rule is not necessary to explain over-regularisations in children's acquisition of inflectional morphology

This is the basis of Rumelhart and McClelland's pattern association memory model which we discussed last week

How did R&M get their neural network to learn in such an authentically child-like manner?

17. Discontinuous training

R&M made use of a couple of empirical observations:

With this in mind, R&M trained their network in a slightly extreme way:

Since pattern association memories are highly sensitive to changes in the statistics of their input, it is not surprising that error rates increase dramatically at the start of the second training phase, before recovering gradually.
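
The sensitivity of a pattern associator to a sudden change in its training set can be illustrated with a toy two-phase run. Everything here is an assumption made for illustration: the network is a tiny single-layer perceptron, phase 1 contains two "irregular" items, phase 2 adds four "regular" items, and the bit patterns, epoch counts and function names are hypothetical rather than R&M's actual data or procedure.

```python
def predict(weights, biases, x):
    # Threshold units: output 1 if the weighted sum exceeds zero.
    return [
        1 if sum(w * xi for w, xi in zip(row, x)) + b > 0 else 0
        for row, b in zip(weights, biases)
    ]


def train_epoch(weights, biases, pairs):
    # One pass of the perceptron rule over all pairs, updating in place.
    for x, target in pairs:
        out = predict(weights, biases, x)
        for j, (o, t) in enumerate(zip(out, target)):
            err = t - o
            if err != 0:
                for i, xi in enumerate(x):
                    weights[j][i] += err * xi
                biases[j] += err


def wrong_bits(weights, biases, pairs):
    # Number of output bits that differ from the target, summed over pairs.
    return sum(
        o != t
        for x, target in pairs
        for o, t in zip(predict(weights, biases, x), target)
    )


def make_input(index, n_items=6):
    # Toy encoding: one identity bit per verb plus two shared feature bits.
    identity = [1 if i == index else 0 for i in range(n_items)]
    shared = [1 if index % 2 == 0 else 0, 1 if index % 2 == 1 else 0]
    return identity + shared


# Output bits: [takes the -ed suffix, undergoes a vowel change] (toy coding).
irregulars = [(make_input(i), [0, 1]) for i in (0, 1)]
regulars = [(make_input(i), [1, 0]) for i in (2, 3, 4, 5)]


def run(phase1_epochs=20, phase2_epochs=60):
    weights = [[0] * 8 for _ in range(2)]
    biases = [0, 0]
    history = []  # errors on the irregular items after each epoch
    for _ in range(phase1_epochs):  # phase 1: irregulars only
        train_epoch(weights, biases, irregulars)
        history.append(wrong_bits(weights, biases, irregulars))
    for _ in range(phase2_epochs):  # phase 2: regulars suddenly added
        train_epoch(weights, biases, irregulars + regulars)
        history.append(wrong_bits(weights, biases, irregulars))
    return history


history = run()
```

In this toy run the irregular items are error-free by the end of phase 1, errors on them reappear once the regular items are added, and they are error-free again by the end of phase 2. The dip is produced entirely by the abrupt change in the training set, which is the point at issue when evaluating the training regime.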

18. Vocabulary expansion as trigger

There is one key question that we need to pose in order to evaluate the R&M approach to learning:

Data from spontaneous conversations involving children shows no evidence for this:

R&M's training regime is extremely fragile

In reality, language learning is a fairly robust process