ANLP 2019 Schedule and materials

The lecture schedule (especially later in the semester) is subject to change. Please check frequently for updates.

Required reading

The main textbook for this course is Speech and Language Processing by Jurafsky and Martin. You can purchase a copy of the 2nd International edition, which is relatively inexpensive, and we will need to refer to some chapters from that edition. A few copies are also on reserve in the library. However, where possible we will be using the chapters from the online draft 3rd edition, which contains more up-to-date content.

You are responsible for all material covered by the assigned reading, and many students find that it is useful to do the reading before lecture. Past students have requested that we prioritize readings, so high priority readings are marked with a (*). If you are really short on time you should focus on these and return to others later. But you are still expected to read everything eventually, and keeping up is still the best strategy!

In the schedule below, we use the following key to required readings: JM2 is the 2nd edition of Jurafsky and Martin, and JM3 is the online draft 3rd edition.

In section references, section 0 refers to whatever introductory material comes before section 1.

Optional reading

Linguistics background

In previous years some Informatics students have asked for more background reading on linguistics. A good place to start might be this text, which is available online through the University library:

Linguistic Fundamentals for Natural Language Processing: 100 Essentials from Morphology and Syntax by Emily M Bender. Synthesis Lectures on Human Language Technologies, June 2013, Vol. 6, No. 3, Pages 1-184.

Further mathematical details

Some students may want a more rigorous treatment of the models and machine learning methods we discuss. In that case I suggest the following textbook. It covers many of the same topics we do, but assumes somewhat more background and comfort with formal methods.

Introduction to Natural Language Processing by Jacob Eisenstein. MIT Press, 2019. (A draft version is available for free from the author's GitHub page.)

Weekly optional readings

Other optional readings related to each week's topics are provided below for students who wish to learn more details, especially about recent research in the area. Some of these papers may also give you ideas for your IRR review. Many of the optional readings assume additional mathematical or machine learning background beyond what is covered in this course, but you may be able to understand the general idea of these papers by reading the introduction and skimming the rest, even if you cannot understand all of the details.

In the schedule below, we use the following key to optional readings:

Full Schedule

Videos of the lectures will be available through Learn, normally shortly after each lecture finishes.

Slides from last year are provided in advance where available. This year's slides will normally be posted the day before the lecture.

Week 1, starting 16 Sep

Who? | Lecture topics | Slides | Reading
SG | 1. Course overview | slides, 2x2 | JM2 Ch 1
SG | 2. Morphology: Inflection, derivation, FSAs | slides, 2x2 | JM2/3 2.0-2.1 (*), JM2 2.2 (*); JM2 2.3-2.4, 3.0-3.1 (*)
SG | 3. Morphology: Finite State Transducers, edit distance | slides, 2x2 | JM2 3.2-3.7 (*); JM3 2.2-2.5 (except 2.4.3, 2.4.5); worked edit distance example

Lab:

UNIX tools for text processing. html, pdf. Solutions

Optional reading:

Week 2, starting 23 Sep

Who? | Lecture topics | Slides | Reading
SG | 4. Probability estimation and probabilistic models | slides, 2x2 | Basic Probability Theory tutorial
SG | 5. Language modelling: N-gram models, entropy | slides, 2x2 | JM3 3.0-3.3 (*)
SG | 6. Language modelling: smoothing | slides, 2x2 | JM3 3.4 (*), 3.5

Tutorial:

Probability and FSMs. Exercises. Solutions.

Optional reading:

Week 3, starting 30 Sep

Who? | Lecture topics | Slides | Reading
SG | 7. Text Categorization: Naive Bayes models and evaluation | slides, 2x2 | JM3 4.0-4.3 (*), 4.4-4.6, 4.7 (*)
SG | 8. Part-of-speech Tagging and HMMs | slides, 2x2 | JM3 8.0-8.4.4 (*), 8.7
SG | 9. Algorithms for HMMs | slides, 2x2 | JM3 8.4.5-8.4.6 (*), JM3 Appendix A.2-A.5

Lab:

Working with probability distributions. html, pdf. Solutions

Homework assignment:

Assignment 1: Language modelling.

Optional reading:

Week 4, starting 7 Oct

Who? | Lecture topics | Slides | Reading
SG | 10. Data, Evaluation, Implications (1): dialect and discrimination | slides, 2x2 | Blodgett and O'Connor (2017)
SC | 11. Syntax and Context-free grammar, ambiguity | slides, 2x2 | JM3 12.0-12.2 (*), 12.5
SC | 12. English syntax, agreement, parsing | slides, 2x2 | JM3 12.3 (*), glossary of categories

Tutorial:

HMMs and tagging. Exercises, Solutions

Optional reading (dialect and discrimination):

Optional reading (syntax):

Week 5, starting 14 Oct

Who? | Lecture topics | Slides | Reading
SC | 13. Parsing as search: recursive descent and CKY | slides, 2x2 | JM3 13.0-13.2 (*)
SC | 14. Treebanks and statistical parsing | slides, 2x2 | JM3 12.4, 14.0-14.3 (*), 14.4-14.6.0, 14.8
SC | 15. Dependency syntax and parsing | slides, 2x2 | JM3 15.0-15.3 (*)

Lab:

Recursive Descent Parser. html, pdf. Solutions

Optional reading:

Week 6, starting 21 Oct

Who? | Lecture topics | Slides | Reading
SC | 16. Dependency parsing and logistic regression | slides, 2x2 | JM3 15.4 (*), 15.6, 5.0-5.1 (*)
SC | 17. Exam preparation, mid-semester feedback | slides, 2x2 |
SC | 18. Grammar writing exercise | | Instructions on how to prepare for class

Tutorial:

Syntax and parsing. Exercises. Solutions

Optional reading:

Week 7, starting 28 Oct

Who? | Lecture topics | Slides | Reading
SC | 19. Logistic regression (cont) | slides, 2x2 | JM3 5.2-5.5, JM2 6.7
SC | 20. Lexical semantics 1: Word senses, relations, disambiguation | slides, 2x2 | JM3 6.0-1 (*), Appendix C.0-2 (*). JM3 C.4-5, C.8-9.
SC | 21. Lexical semantics 2: vector models, co-occurrence and PMI | slides, 2x2 | JM3 6.2-4 (*), JM3 6.7 (*), JM2 20.7.2

Lab:

Text classification and feature selection. html, pdf. Solutions

Optional reading:

Week 8, starting 4 Nov

Who? | Lecture topics | Slides | Reading
SC | 22. Lexical semantics 3: tf-idf, dense vectors for word embeddings | slides, 2x2 | JM3 6.5, 6.8, 6.10 (*), 6.6
SG | 23. Data, evaluation, implications (2): use and collection of human data, including social media and assignment 2 | slides, 2x2 | School research ethics procedure
SG | 24. Data, evaluation, implications (3): evaluation, claims, and evidence | slides, 2x2; correlation, 2x2 |

Tutorial:

Logistic regression and lexical semantics. Exercises. Solutions

Homework assignment:

Distributional similarity

Optional reading:

Week 9, starting 11 Nov

Who? | Lecture topics | Slides | Reading
JW | Guest lecture on ethical issues in NLP | |
SG | 26. Sentence semantics: Meaning Representation | slides, 2x2 | JM3 16.0-16.3.4 (*), 16.3.5, 16.4.0
SG | 27. Sentence semantics: Syntax/semantics interface | slides, 2x2 | JM2 18.0-18.3.0 (*)

Lab:

Sentiment analysis on Twitter. html, pdf, Solutions

Optional reading:

Week 10, starting 18 Nov

Who? | Lecture topics | Slides | Reading
SG | 28. Coreference resolution | slides, 2x2 | JM3 22.0-22.2 (*), 22.9
SG | 29. Gender bias (esp in coreference) | slides, 2x2 | JM3 22.10 (*), Zhao et al (2019)
SG | no lecture | |

Tutorial:

Semantics. Exercises. Solutions

Optional reading:


Unless explicitly stated otherwise, all material is copyright © The University of Edinburgh