The main textbook for this course is Speech and Language Processing by Jurafsky and Martin. You can purchase a copy of the 2nd International edition, which is relatively inexpensive, and we will need to refer to some chapters from that edition. A few copies are also on reserve in the library. However, where possible we will be using the chapters from the online draft 3rd edition, which contains more up-to-date content.
You are responsible for all material covered by the assigned reading, and many students find it useful to do the reading before the lecture. Past students have asked us to prioritize the readings, so high-priority readings are marked with a (*). If you are really short on time, focus on these and return to the others later. But you are still expected to read everything eventually, and keeping up is still the best strategy!
In the schedule below, we use the following key to required readings:
In section references, section 0 refers to whatever introductory material comes before section 1 of a chapter.
In previous years some Informatics students have asked for more background reading on linguistics. A good place to start might be this text, which is available online through the University library:
Linguistic Fundamentals for Natural Language Processing: 100 Essentials from Morphology and Syntax by Emily M. Bender. Synthesis Lectures on Human Language Technologies, June 2013, Vol. 6, No. 3, pp. 1-184.
Other optional readings are provided below for students who want more detail on a topic or pointers to recent research related to it. Some of these papers may also give you ideas for your IRR review. Many of the optional readings assume mathematical or machine learning background beyond what is covered in this course, but you may still be able to understand the general idea of a paper by reading the introduction and skimming the rest, even if you cannot follow all of the details.
In the schedule below, we use the following key to optional readings:
This schedule is subject to change! Check back frequently.
Videos of the lectures will be available through Learn, normally shortly after each lecture finishes.
Slides from last year are provided in advance where available. This year's slides will normally be posted the day before the lecture.
|SG||1. Course overview||slides, 2x2||JM2 Ch 1|
|SG||2. Morphology: Inflection, derivation, FSAs||slides, 2x2||JM2 2.0-2.1, 2.2 (*), 2.3-2.4, 3.0-3.1 (*)|
|SG||3. Morphology: Finite State Transducers, edit distance||slides, 2x2||JM2 3.2-3.7 (*), JM3 2.2-2.5 (except 2.4.3), worked edit distance example|
|SG||4. Probability estimation and probabilistic models||slides, 2x2||Basic Probability Theory tutorial|
|SG||5. Language modelling: N-gram models, entropy||slides, 2x2||JM3 3.0-3.3 (*)|
|SG||6. Language modelling: smoothing||slides, 2x2||JM3 3.4 (*), 3.5|
Probability and FSMs: Exercises.
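For students who like to see ideas in running code, here is a minimal sketch of the bigram language model with add-one (Laplace) smoothing covered in lectures 5 and 6. It is illustrative only, not part of the assigned reading; the toy corpus and all names are invented for the example.

```python
from collections import Counter

# Toy corpus; <s> and </s> mark sentence boundaries.
corpus = ["<s> the cat sat </s>", "<s> the dog sat </s>"]
tokens = [w for sent in corpus for w in sent.split()]
V = len(set(tokens))  # vocabulary size

unigram = Counter(tokens)
# Note: this simple pairing also counts the bigram (</s>, <s>) across
# the sentence boundary; a careful implementation would exclude it.
bigram = Counter(zip(tokens, tokens[1:]))

def p_add_one(prev, w):
    """P(w | prev) with add-one smoothing: (C(prev, w) + 1) / (C(prev) + V)."""
    return (bigram[(prev, w)] + 1) / (unigram[prev] + V)

print(p_add_one("the", "cat"))  # 0.25 rather than the MLE of 0.5
print(p_add_one("the", "sat"))  # an unseen bigram still gets probability mass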
|SG||7. Text Categorization: Naive Bayes models and evaluation||slides, 2x2||JM3 4.0-4.3 (*), 4.4-4.6, 4.7 (*)|
|SG||8. Part-of-speech Tagging and HMMs||slides, 2x2||JM3 8.0-8.4.4 (*), 8.7|
|SG||9. Algorithms for HMMs||slides, 2x2||JM3 8.4.5-8.4.6 (*), JM2 6.3-6.5|
Working with probability distributions.
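Lectures 8 and 9 introduce HMM taggers and the Viterbi algorithm. If it helps to see the dynamic program in code, here is a minimal sketch; the tiny tag set and probabilities below are invented for the example, not taken from the textbook.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable state sequence for the observations."""
    # v[t][s] = (best score of any path ending in state s at time t, back-pointer)
    v = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for t in range(1, len(obs)):
        v.append({})
        for s in states:
            best_prev, best_score = max(
                ((p, v[t - 1][p][0] * trans_p[p][s]) for p in states),
                key=lambda x: x[1])
            v[t][s] = (best_score * emit_p[s][obs[t]], best_prev)
    # Follow back-pointers from the best final state.
    last = max(states, key=lambda s: v[-1][s][0])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(v[t][path[-1]][1])
    return list(reversed(path))

states = ["DET", "NOUN"]
start_p = {"DET": 0.8, "NOUN": 0.2}
trans_p = {"DET": {"DET": 0.1, "NOUN": 0.9}, "NOUN": {"DET": 0.4, "NOUN": 0.6}}
emit_p = {"DET": {"the": 0.9, "dog": 0.1}, "NOUN": {"the": 0.1, "dog": 0.9}}
print(viterbi(["the", "dog"], states, start_p, trans_p, emit_p))  # ['DET', 'NOUN']
```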
Assignment 1 issued this week. See Learn (Assessment and Exams) for materials and due date.
|HT||10. Context-free grammar||slides||JM3 10.0-10.2 (*), 10.5, 11.0-11.1 (*)|
|HT||11. English syntax||slides||JM3 10.3 (*)|
|HT||12. Parsing algorithms||slides, 2x2||JM2 13.0-13.1 (*), JM3 11.0-11.2 (*)|
Tutorial: HMMs and tagging.
|HT||13. Probabilistic grammars and parsing||slides, 2x2||JM3 10.4 (*), 12.0-12.3 (*), 12.3-12.6.0, 12.8|
|HT||14. Catch-up slot or TBD||slides, 2x2|
|SG||15. Dependency grammar and parsing||slides, 2x2||JM3 14.0-14.4.0 (*)|
Lab: Recursive Descent Parser.
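If you want to experiment beyond the lab, here is a bare-bones CKY recognizer for a grammar in Chomsky Normal Form, the chart-parsing algorithm covered in lectures 12 and 13. This is a sketch of our own for illustration; the grammar and sentence are toy examples.

```python
grammar = {  # CNF rules, written as RHS -> set of possible LHS symbols
    ("the",): {"DET"}, ("dog",): {"N"}, ("barks",): {"V", "VP"},
    ("DET", "N"): {"NP"}, ("NP", "VP"): {"S"},
}

def cky_recognize(words):
    n = len(words)
    # chart[i][j] holds the nonterminals that can span words[i:j]
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        chart[i][i + 1] = set(grammar.get((w,), set()))
    for span in range(2, n + 1):          # widths, shortest first
        for i in range(n - span + 1):     # start positions
            j = i + span
            for k in range(i + 1, j):     # split points
                for b in chart[i][k]:
                    for c in chart[k][j]:
                        chart[i][j] |= grammar.get((b, c), set())
    return "S" in chart[0][n]

print(cky_recognize("the dog barks".split()))  # True
```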
|SG||16. Dependency parsing (2), logistic regression and discriminative models (1)||slides, 2x2||JM3 13.4.1-2 (*), 13.6, 5.0-5.1 (*), 5.6|
|SG||17. Exam preparation||slides, 2x2|
|SG||18. Logistic regression (2)||slides, 2x2||JM3 5.2-5.5|
Tutorial: Syntax and parsing.
Assignment 2 issued this week. See Learn (Assessment and Exams) for materials and due date.
|HT||20. Lexical semantics 1: Word senses, relations, and semantic roles||slides, 2x2||JM3 Ch 18-19 sections TBD|
|HT||21. Lexical semantics 2: semantic roles (cont), co-occurrence and PMI||slides, 2x2||JM3 6.0-6.4 (*), JM2 20.7.2 (*), 6.5 (?)|
Lab: Text classification and feature selection.
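The pointwise mutual information (PMI) computation from lecture 21 is easy to try by hand. Here is an illustrative sketch; all counts below are invented for the example.

```python
import math

# Invented co-occurrence counts for a tiny "corpus".
count = {("cat", "purr"): 20, ("cat", "the"): 50}
word_count = {"cat": 100, "purr": 40, "the": 1000}
total = 10000  # total number of co-occurrence pairs observed

def pmi(x, y):
    """PMI(x, y) = log2( P(x, y) / (P(x) * P(y)) )."""
    p_xy = count[(x, y)] / total
    p_x = word_count[x] / total
    p_y = word_count[y] / total
    return math.log2(p_xy / (p_x * p_y))

print(pmi("cat", "purr"))  # high: 'cat' and 'purr' co-occur unusually often
print(pmi("cat", "the"))   # much lower: 'the' co-occurs with everything
```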
|HT||22. Lexical semantics 3: neural word embeddings||JM3 6.7-6.11 (?)|
|HT||23. Topic Models (?)||slides||Probabilistic topic models by David Blei|
|HT||24. Data, evaluation, and ethics 1|
Tutorial: Logistic regression and lexical semantics.
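As a companion to the tutorial, here is a minimal sketch of the core of binary logistic regression from lectures 16 and 18: the sigmoid, a prediction, and one stochastic gradient step on the cross-entropy loss. The weights and features are made up for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, features):
    """P(y=1 | x) = sigmoid(w . x + b)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)

def sgd_step(weights, bias, features, y, lr=0.1):
    """One stochastic gradient step on the cross-entropy loss:
    w_j += lr * (y - p) * x_j, and likewise for the bias."""
    p = predict(weights, bias, features)
    for j in range(len(weights)):
        weights[j] += lr * (y - p) * features[j]
    return weights, bias + lr * (y - p)

# e.g. features = [count of positive words, count of negative words]
weights, bias = [2.5, -3.0], -0.5
print(predict(weights, bias, [3.0, 1.0]))  # close to 1: predicted positive
```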
Assignment 3 issued this week. See Learn (Assessment and Exams) for materials and due date.
|HT||25. Data, evaluation, and ethics 2|
|SG||26. Sentence semantics: Meaning Representation||slides, 2x2||JM2 17.0-17.3.4 (*), 17.3.5, 17.4.0|
|SG||27. Sentence semantics: Syntax/semantics interface||slides, 2x2||JM2 18.0-18.3.0 (*)|
Lab: Sentiment analysis on Twitter.
|HT||28. Data, evaluation, and ethics 3||SS2 2.1-2.3|
|29. TBD||(Optional: see below)|