Some notes on levels of representation of language*
An utterance is somehow constructed from what the speaker wants to
convey, and is interpreted by the hearer, who reconstructs the
speaker's intended message. The processes involved in language
production and comprehension can be described in terms of levels of
structure: sounds, words, phrases, sentences. Complex processes such as
those involved in speech can be decomposed into simpler ones.
Comprehension and production may be thought of as inverse processes
operating in opposite directions (this is an oversimplification: there
is evidence that comprehension may be simpler than production, e.g. in
spelling and in learning to speak a second language).
Knowledge of language may be thought to be made up of rules for
manipulating different levels of structure. In comprehension,
- a sentence is heard (or read)
- it is analysed into phonemes (units of sound), e.g. /f/ /ow/ /n/;
- the phoneme sequence is analysed into morphemes (units of meaning),
e.g. 'phon-' '-ing' '-ed';
- a dictionary (lexicon) is used to relate these to words;
- syntactic rules are used to analyse phrases and sentences;
- deductive and inferential rules are used to draw conclusions and
inferences from other knowledge.
Whilst this is a simplistic model it serves to suggest the components
needed in designing computer systems and in developing psychological
models. More formally we can structure and analyse language at a number
of different levels.
- Phonetics/Phonology: The level of speech sounds.
- Morphology: The formation of words from their parts.
- Syntax: The combination of words---grammar.
- Semantics: The meaning of words, sentences and utterances.
- Discourse/Pragmatics/Speech Acts: The structure of collections of
sentences, the use of language.
Phonetics and Phonology
Phonetics is concerned with the sounds themselves, from three
perspectives:
- Articulatory: how we produce them;
- Acoustic: what they're actually like as sound waves;
- Auditory: how we perceive them.
Phonology is about the relation between words and sounds. Consider the
relationships between the following:
innumerate immoderate
intolerant impossible incredible
unnecessary unmasked
untoward unbelievable uncouth
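One way to read the first two rows is that the negative prefix `in-'
changes to `im-' before a sound made with the lips (/m/, /p/, /b/),
while `un-' keeps its shape whatever follows. The sketch below is a
rough illustration only: the function names are hypothetical, and it
works on spelling as a stand-in for sound, which is an assumption that
happens to hold for these words.

```python
# Hypothetical helpers illustrating the in-/im- alternation.
# Assumption: the first letter of the stem stands in for its first sound.
LABIALS = ("m", "p", "b")  # consonants made with the lips


def negate_with_in(stem: str) -> str:
    """Attach "in-", changing it to "im-" before a labial consonant."""
    prefix = "im" if stem.startswith(LABIALS) else "in"
    return prefix + stem


def negate_with_un(stem: str) -> str:
    """Attach "un-", which does not change with the following sound."""
    return "un" + stem


for stem in ["numerate", "moderate", "tolerant", "possible", "credible"]:
    print(negate_with_in(stem))
for stem in ["necessary", "masked", "toward", "believable", "couth"]:
    print(negate_with_un(stem))
```

The sketch covers only the pattern visible in the word lists above.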
Or consider English plurals: boat---boats (/s/), bag---bags (/z/),
wish---wishes (/ez/).
This gets even trickier: roof---roofs (/roovz/), wife---wives
(/waivz/), wife---wife's (/waifs/).
Other suffixes do funny things too: electric (-/ik/)---electricity
(-/isitee/) (note the stress moves too), create (-/eit/)---creation
(-/aishun/).
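The regular plural pattern above (boats, bags, wishes) can be stated as
a rule on the last sound of the stem. The following is a minimal sketch,
assuming the informal sound labels used in these notes; the function
name and the two sound sets are illustrative choices, not a complete
inventory of English sounds.

```python
# Hypothetical rule for the regular English plural ending.
# Assumption: the caller supplies the final sound of the stem using
# informal labels such as "t", "g" or "sh".
SIBILANTS = {"s", "z", "sh", "zh", "ch", "j"}  # hissing/hushing sounds
VOICELESS = {"p", "t", "k", "f", "th"}         # no vocal-cord vibration


def plural_suffix(final_sound: str) -> str:
    """Return the plural ending (/s/, /z/ or /ez/) for a stem-final sound."""
    if final_sound in SIBILANTS:
        return "ez"  # wish -> wishes
    if final_sound in VOICELESS:
        return "s"   # boat -> boats
    return "z"       # bag -> bags (all other voiced sounds)


for word, final in [("boat", "t"), ("bag", "g"), ("wish", "sh")]:
    print(word, "takes /" + plural_suffix(final) + "/")
```

Cases such as wife---wives, where the stem itself changes, fall outside
this rule and would need to be listed separately.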
Morphology
Morphology addresses the level of structure internal to the word. Not
only are there restrictions on the patterns of sound which make up a
word (almost all languages compose words from syllables, and with a few
exceptions all languages require a one-to-one correspondence between
syllables and vowels), but we can identify meaning-bearing units
smaller than words.
Inflectional morphology: different forms of the same underlying word:
- singular/plural: usually `-s', sometimes `-ices' (`vertex', `index')
or `-i' (`focus') or nothing (`deer');
- past/non-past/3rd singular present/present participle: usually
`-ed', nothing, `-s', `-ing', but there are lots of more or less
irregular cases. Compare `quack', `eat', `do' and `is'.
English is very modest in this area: verbs in Spanish, for example,
have about 50 inflected forms, Ancient Greek 350, and many Amerindian
languages have tens of thousands of forms for verbs.
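The English verb pattern described above (regular endings plus a list
of exceptions) can be sketched as a default rule overridden by a small
table. This is only an illustration: the table `IRREGULAR`, the
function `inflect` and the form labels are hypothetical names, `is' is
filed under the base form `be', and spelling adjustments (such as
doubling consonants) are ignored.

```python
# Hypothetical sketch: regular verb endings with irregular overrides.
# Assumption: only the forms that differ from the regular pattern are listed.
IRREGULAR = {
    "eat": {"past": "ate"},
    "do": {"past": "did", "3sg": "does"},
    # "be" also has "were" and "am"; one form per slot is kept for brevity.
    "be": {"past": "was", "non-past": "are", "3sg": "is"},
}


def inflect(verb: str) -> dict[str, str]:
    """Return the four forms discussed in the notes for an English verb."""
    forms = {
        "past": verb + "ed",         # quack -> quacked
        "non-past": verb,            # quack
        "3sg": verb + "s",           # quacks
        "participle": verb + "ing",  # quacking
    }
    forms.update(IRREGULAR.get(verb, {}))  # exceptions replace defaults
    return forms


for verb in ["quack", "eat", "do", "be"]:
    print(verb, inflect(verb))
```

Keeping the exceptions in a table separate from the rule mirrors the
split between regular and irregular cases in the text.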
Derivational morphology: new words from old.
- affixation: `un-', `re-', `multi-', `-ise', `-able', . . .
Note that the first two don't change word class (new verbs from old)
but the others do (`part' is a noun, `multipart' is an adjective).
Some but not all affixes combine and even iterate---`reunionisation'.
Other languages also use infixation and reduplication.
- simple juxtaposition: without spaces (`toothbrush'), with hyphens
(`toothbrush-holder') or with spaces (``toothbrush-holder box label
loss enquiry'' or even ``repair manual binder delivery van repair
manual . . .''). Different languages have different
preferences---German rarely needs spaces, just an occasional `s',
where French doesn't like the process much at all (e.g. ``sac à
main'' for `handbag').
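Returning to affixation, the way affixes stack up and sometimes change
word class can be sketched as repeated application of one small
function. This is an illustration under stated assumptions: the
function `attach` and both tables are hypothetical, the suffix
`-ation' is added here to build `reunionisation' and is not in the list
above, and the only spelling rule modelled is dropping a final `e'
before a suffix that starts with a vowel.

```python
# Hypothetical affix tables: affix -> resulting word class.
# None means the affix leaves the word class unchanged.
PREFIXES = {"un": None, "re": None, "multi": "adjective"}
SUFFIXES = {"ise": "verb", "able": "adjective", "ation": "noun"}


def attach(word: str, word_class: str, affix: str) -> tuple[str, str]:
    """Attach an affix written like "re-" (prefix) or "-ise" (suffix)."""
    if affix.endswith("-"):
        name = affix[:-1]
        return name + word, PREFIXES[name] or word_class
    name = affix[1:]
    # Simplified spelling rule: drop a final "e" before a vowel.
    if word.endswith("e") and name[0] in "aeiou":
        word = word[:-1]
    return word + name, SUFFIXES[name]


print(attach("part", "noun", "multi-"))

form, word_class = "union", "noun"
for affix in ["-ise", "-ation", "re-"]:
    form, word_class = attach(form, word_class, affix)
    print(form, word_class)
```

Each step takes the output of the previous one, which is what lets
affixes combine and iterate.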
Syntax
Syntax is the scientific name for grammar, the structuring of words
into sentences in a given language. Just producing words in any old way
isn't good enough, in general, to be seen/heard to be using a language
correctly. Different languages do things differently, but all are
trying to organise things by a convention of use so that
hearers/readers can tell what speakers/writers meant:
- Who did what to whom;
- What goes with what.
Some languages use the order in which words appear to manage things:
English: Robin kissed Kim
I gave the children cold sandwiches
French: Robin a embrassé Kim
Italian: Ho dato ai bambini congelati panini
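For the simplest English case, reading roles off word order can be
sketched in a few lines. This is a toy illustration only: the function
name and the role labels are hypothetical, and it assumes a three-word
sentence of the form subject, verb, object.

```python
# Hypothetical sketch: roles assigned purely by position.
# Assumption: the sentence has exactly three words (subject verb object).
def roles_from_order(sentence: str) -> dict[str, str]:
    """Read "who did what to whom" off the word order."""
    subject, verb, obj = sentence.split()
    return {"who": subject, "did": verb, "to_whom": obj}


print(roles_from_order("Robin kissed Kim"))
print(roles_from_order("Kim kissed Robin"))  # same words, roles swapped
```

Swapping the two names changes who did what to whom, even though the
words themselves are unchanged.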
Some languages use adpositions to sort things out:
English prepositions: The funeral took place today in Leicester of the
two victims . . .
Japanese postpositions: Watashi no kodomo wa hon o yomimasu
Or languages may use inflection to do the job:
Latin: Puell-am bon-am naut-a amat
Russian: devochk-u horosh-uyu matros liubil
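In the Latin example, the endings rather than the positions say who
loves whom: `-am' marks the object (`the good girl') and `-a' the
subject (`the sailor'). The sketch below illustrates this with three
toy rules that fit only this sentence; the function name and the role
labels are hypothetical, and real Latin needs far more than ending
matching.

```python
# Hypothetical sketch: roles assigned by word endings, not by position.
# Assumption: three toy rules that fit only the example sentence.
def roles_from_endings(sentence: str) -> dict[str, list[str]]:
    """Group words into roles according to their endings."""
    roles: dict[str, list[str]] = {"subject": [], "object": [], "verb": []}
    for word in sentence.lower().split():
        if word.endswith("am"):    # accusative ending: object
            roles["object"].append(word)
        elif word.endswith("at"):  # third person singular verb
            roles["verb"].append(word)
        elif word.endswith("a"):   # nominative ending: subject
            roles["subject"].append(word)
    # Sort so that the result does not depend on word order.
    return {role: sorted(words) for role, words in roles.items()}


print(roles_from_endings("Puellam bonam nauta amat"))
print(roles_from_endings("Nauta amat puellam bonam"))  # same roles
```

Both word orders give the same role assignment, which is the point of
marking roles by inflection.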
Semantics
What do words mean, how do they mean, and how is this related to what
sentences mean and how utterances are interpreted? Before A.I., most
discussion of this focussed on the relation between meanings, both
within and across levels, using one form or another of logic.
Speaking carefully:
- Sentences are abstract, names for types of utterances.
- Utterances are concrete specific examples (tokens) of the actual
use, spoken or written, of sentences.
We say that sentences have meaning in the abstract, while utterances
have concrete interpretations.
Consider the sentence:
Last week I arrived on Tuesday before leaving on Monday.
This is always false, regardless of when, where and by whom it is
uttered. On the other hand the truth or falsity of:
Last week I went to the Picnic Basket three times
can only be determined on an utterance-by-utterance basis.
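One way to picture the difference is to treat a sentence's meaning as a
function from an utterance context (who speaks, and when) to a truth
value. The sketch below is a rough illustration: the visit log, the
speakers' dates and all function names are invented for the example,
`three times' is read as `at least three', and a week is assumed to run
from Monday to Sunday.

```python
from datetime import date, timedelta

# Hypothetical visit log: (person, date of a visit to the Picnic Basket).
VISITS = [
    ("Robin", date(2024, 3, 4)),
    ("Robin", date(2024, 3, 6)),
    ("Robin", date(2024, 3, 8)),
    ("Kim", date(2024, 3, 5)),
]


def last_week(uttered_on: date) -> tuple[date, date]:
    """Return (Monday, Sunday) of the week before the utterance."""
    this_monday = uttered_on - timedelta(days=uttered_on.weekday())
    start = this_monday - timedelta(days=7)
    return start, start + timedelta(days=6)


def went_three_times(speaker: str, uttered_on: date) -> bool:
    """Truth depends on the context: who says it, and when."""
    start, end = last_week(uttered_on)
    count = sum(1 for who, day in VISITS
                if who == speaker and start <= day <= end)
    return count >= 3  # assumption: "three times" = at least three


def arrived_tuesday_before_leaving_monday(speaker: str,
                                          uttered_on: date) -> bool:
    """False in every context: last week's Tuesday follows its Monday."""
    monday, _ = last_week(uttered_on)
    tuesday = monday + timedelta(days=1)
    return tuesday < monday


print(went_three_times("Robin", date(2024, 3, 12)))
print(went_three_times("Kim", date(2024, 3, 12)))
print(arrived_tuesday_before_leaving_monday("Robin", date(2024, 3, 12)))
```

The first function gives different answers for different speakers and
dates, while the second returns the same answer whatever context it is
given.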
Pragmatics
Discourse and Dialogue
How can utterances be used in a discourse or dialogue? Some obviously
important aspects of language operate above the single utterance
level. Reference in general and pronouns in particular are the most
obvious examples:
Robin and Kim went to see 2001 last week. They thought it was great,
but the cinema was nearly empty.
Not just any sequence of question and answer is acceptable or useful
in a dialogue. Where you are in a discourse affects how you should say
things:
What did you give to Robin? It was Robin I gave the sandwiches to.
Dialogue is in any case much more than just question and answer---just
how we manage to orchestrate our talk so that we make the most of the
rather narrow channel we share is a major open question.
Speech Acts and Planning
We do not talk or write just for the sake of filling time---we use
language to do things. Speaking is an action, and like other actions it
is usually instrumental, performed in service of achieving some goal.
If I say
Please open the window
or
When does the next shuttle bus leave for Kings Buildings?
I am using language to get things done, or to get the information I
need to get things done.
We call the different kinds of things we can do with utterances speech
acts: Requests, Statements, Questions, Commands and Declarations are
the main types. The only non-obvious one is the Declaration---that's
when by right of some specific authority you can actually make
something happen by speaking:
I hereby christen this ship the S.S. Rustbucket.
or
You're out!
* This note comes from material produced by various authors who taught
Introduction to Natural Language Processing in the Artificial
Intelligence Department, University of Edinburgh, in the past.