Google Translate instantly
translates between any pair of over eighty human languages
like French and English. How does it do that? Why does it
make the errors that it does? And how can you build something
better? Modern translation systems like Google Translate
and SDL FreeTranslation
learn to translate by reading millions of words of already
translated text. This course will show you how they work.
We cover fundamental building blocks from linguistics,
machine learning, algorithms, data structures, and formal
language theory, showing how they apply to a real and difficult
problem in artificial intelligence.
Spring 2016 information
- Monday & Thursday, 16:10 to 17:00
- 7 George Square, room S.1
- Instructor: Adam Lopez
- Office hours: immediately after class meetings or by appointment
- Discussion Forum
- Statistical Machine Translation by Philipp Koehn, available on loan from the University library.
- Course Descriptor
- INFR11062 | MT
- Three practical assignments (10% each)
  - Alignment, due 28 Jan 16:00
  - Decoding, due 11 Feb 16:00
  - Reranking, due 7 Mar 16:00
- Final exam (70%)
The course will follow the school-wide late coursework policy
and academic conduct policy.
- The course assumes you have taken ANLP or equivalent. Machine translation applies concepts from computer science, statistics, and linguistics. You needn’t be an expert in all three of these fields (few people are), but if you are allergic to any of them you should not take this course. Concretely, you will be expected to already understand the following topics before taking the course, or be prepared to learn them independently.
  - Discrete mathematics: analysis of algorithms, dynamic programming, basic graph algorithms, finite and pushdown automata.
  - Other essential maths: basic probability theory; basic calculus and linear algebra; ability to read and manipulate mathematical notation including sums, products, log, and exp.
  - Programming: ability to read and modify Python programs; ability to design and implement a function based on a high-level description such as pseudocode or a precise mathematical statement of what the function computes.
  - Linguistics: basic elements of linguistic description.