Accelerated Natural Language Processing
Reading List
Much of the course assumes you are familiar with the mathematical notation and concepts covered by the tutorials on the following page. If you are not, you should plan to work through them in the first two weeks of class.
The course will use the following textbook:
- Daniel Jurafsky and James H. Martin (2009). Speech and Language Processing (2nd Edition). Prentice Hall.
We will also use some sections of the following textbook, which goes into more depth but does not cover all the topics we want. The link to the online version works only from inside the University network, or through VPN.
The following papers serve as background and/or supplementary reading for specific topics:
- Eugene Charniak and Mark Johnson (2005). Coarse-to-Fine n-Best Parsing and MaxEnt Discriminative Reranking. Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics, 173-180, Ann Arbor, Michigan. Association for Computational Linguistics.
- Vincent Ng (2010). Supervised Noun Phrase Coreference Research: The First Fifteen Years. Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, 1396-1411. Uppsala, Sweden.
- Joakim Nivre (2005). Dependency Grammar and Dependency Parsing. MSI report 05133. Vaxjo University, School of Mathematics and Systems Engineering.
- Karthik Raghunathan, Heeyoung Lee, Sudarshan Rangarajan, Nate Chambers, Mihai Surdeanu, Dan Jurafsky and Christopher Manning (2010). A Multi-Pass Sieve for Coreference Resolution. Proc. EMNLP, pp. 491-501, Cambridge MA.
- Bonnie Webber, Markus Egg and Valia Kordoni (2012). Discourse Structure and Language Technology. Journal of Natural Language Engineering 18(4), 437-490.
Supplementary materials for tagging and parsing
Edinburgh's Informatics 2a course covers some of the same material as this course, but from a different perspective. In particular, lectures 15-27 cover HMMs, grammars and parsing algorithms, and semantic parsing. Be aware that Informatics 2a is aimed at computer scientists and its first part is about formal language theory, so some topics are explained in that context.
Supplementary materials for Python programming
For more programming practice, consider working through an online Python tutorial, such as the one from Codecademy.
If you are already fluent in one or more programming languages but not in Python, the best way to pick up what you need is probably to work through the official Python tutorial.
Supplementary materials for Linguistics
For those wanting additional background reading on basic linguistics topics, there are several textbooks available from the library, including the following:
- Yule (2014). The Study of Language (5th Edition).
- Fromkin (2000). Linguistics: an introduction to linguistic theory.