Advanced Natural Language Processing 2014
General Topics/Concepts
These are concepts and issues that are relevant to more than one part
of the course. You should be able to briefly discuss each one and say
which parts of the course it is relevant to.
- Zipf's Law
- Ambiguity
- Open-class Words, Closed-class Words
- Sparse Data
- Bayes' Rule
- Prior
- Likelihood
- Maximum Likelihood Estimation
- Training, development, and test sets
- Evaluation measures and significance
- Noisy channel model
- Dynamic programming
- Logistic Regression/MaxEnt models (high-level ideas)
Lexical Processing
1. Concepts
You should be able to explain each of these concepts, give
one or two examples where appropriate, and discuss its pros and cons
(again where appropriate).
- Tokenization
- Morphology: Stems, Affixes, Root, Lemma
- Inflectional Morphology
- Derivational Morphology
- Lexical compound
- Finite State Machine
- Regular Language and Regular Expression
- N-Gram Language Model
- Perplexity
- Add-One / Add-Alpha Smoothing
- Good-Turing Smoothing (for Good-Turing, Kneser-Ney, and backoff, you do not need to memorize the formulas, but you should understand the conceptual differences and be able to *use* the formulas if they are given to you.)
- Kneser-Ney Smoothing
- Backoff
- Interpolation
- Spelling Correction
- Part-of-Speech and PoS-tagging
- Markov Model
- Hidden Markov Model
- Text categorization
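
As an illustration of how the N-gram, add-alpha smoothing, and perplexity concepts above fit together, here is a minimal sketch of a bigram model. The function names, the `<s>`/`</s>` padding symbols, and the choice to count `</s>` as part of the vocabulary are assumptions made for this example, not something the course prescribes.

```python
import math
from collections import Counter


def train_bigram_counts(sentences):
    """Count histories and bigrams; each sentence is a list of tokens."""
    history_counts = Counter()
    bigram_counts = Counter()
    vocab = set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence + ["</s>"]
        vocab.update(tokens[1:])  # symbols that can be predicted: words and </s>
        for prev, word in zip(tokens, tokens[1:]):
            history_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return history_counts, bigram_counts, vocab


def add_alpha_prob(prev, word, history_counts, bigram_counts, vocab_size, alpha=1.0):
    """P(word | prev) with add-alpha smoothing (alpha=1 gives add-one)."""
    numerator = bigram_counts[(prev, word)] + alpha
    denominator = history_counts[prev] + alpha * vocab_size
    return numerator / denominator


def perplexity(sentences, history_counts, bigram_counts, vocab_size, alpha=1.0):
    """Perplexity = 2 ** (negative average log2 probability per predicted token)."""
    log_prob = 0.0
    n_tokens = 0
    for sentence in sentences:
        tokens = ["<s>"] + sentence + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            p = add_alpha_prob(prev, word, history_counts, bigram_counts,
                               vocab_size, alpha)
            log_prob += math.log2(p)
            n_tokens += 1
    return 2 ** (-log_prob / n_tokens)


# Toy training data (hypothetical).
TRAIN = [["a", "b"], ["a", "a"]]
HIST, BIGRAMS, VOCAB = train_bigram_counts(TRAIN)
```

With this toy data the vocabulary has three symbols (`a`, `b`, `</s>`), so `add_alpha_prob("<s>", "a", HIST, BIGRAMS, len(VOCAB))` is (2 + 1) / (2 + 3) = 0.6. Lower perplexity on held-out text indicates a better model, which is why smoothing parameters such as alpha are tuned on a development set.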
2. Types of methods
For types of methods, you should be able to describe their essential elements.
- Expectation-Maximization algorithm
3. Specific methods
For specific methods, you should know what each algorithm is used for and be able to hand simulate it (e.g., if given a toy example or tiny data set).
- Finding minimum string edit distance
- Viterbi Algorithm for Hidden Markov Models
- Forward Algorithm for Hidden Markov Models (to compute likelihood)
- Naive Bayes classifier for document categorization
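
To check a hand simulation of the dynamic-programming methods listed above, a short sketch like the following may help. It covers minimum edit distance and the Forward and Viterbi algorithms for an HMM. The toy HMM (two tags, three words, and all of its probabilities) is invented for illustration, and the HMM has no explicit end state, which is a simplifying assumption.

```python
def min_edit_distance(source, target, sub_cost=2):
    """Minimum edit distance with insertion/deletion cost 1 and a given substitution cost."""
    n, m = len(source), len(target)
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i  # delete all of source[:i]
    for j in range(1, m + 1):
        dist[0][j] = j  # insert all of target[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = source[i - 1] == target[j - 1]
            dist[i][j] = min(
                dist[i - 1][j] + 1,                              # deletion
                dist[i][j - 1] + 1,                              # insertion
                dist[i - 1][j - 1] + (0 if same else sub_cost),  # match or substitution
            )
    return dist[n][m]


def forward(obs, states, start_p, trans_p, emit_p):
    """Likelihood of the observation sequence, summing over all state paths."""
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {s: sum(alpha[r] * trans_p[r][s] for r in states) * emit_p[s][o]
                 for s in states}
    return sum(alpha.values())


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return (probability, path) of the single most probable state sequence."""
    best = {s: (start_p[s] * emit_p[s][obs[0]], [s]) for s in states}
    for o in obs[1:]:
        new_best = {}
        for s in states:
            # Best predecessor: same recurrence as forward(), with max replacing sum.
            prob, path = max(((best[r][0] * trans_p[r][s], best[r][1]) for r in states),
                             key=lambda item: item[0])
            new_best[s] = (prob * emit_p[s][o], path + [s])
        best = new_best
    return max(best.values(), key=lambda item: item[0])


# Hypothetical toy HMM: tags N and V, with made-up probabilities.
STATES = ("N", "V")
START_P = {"N": 0.6, "V": 0.4}
TRANS_P = {"N": {"N": 0.7, "V": 0.3}, "V": {"N": 0.4, "V": 0.6}}
EMIT_P = {"N": {"fish": 0.5, "can": 0.4, "swim": 0.1},
          "V": {"fish": 0.1, "can": 0.3, "swim": 0.6}}
OBS = ["fish", "can", "swim"]
```

For the toy HMM, `viterbi(OBS, STATES, START_P, TRANS_P, EMIT_P)` gives the path N, N, V with probability of about 0.01512, and `forward(...)` gives a likelihood of about 0.03628. With a substitution cost of 2, `min_edit_distance("intention", "execution")` is 8.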
Grammars and Parsing
1. Concepts
You should be able to explain each concept, give
one or two examples where appropriate, and discuss its pros and cons
(again where appropriate).
- Context-Free Grammar
- Bounded and unbounded dependencies
- Dependency grammar
- Search space: What is one searching for in parsing and what is
one searching through?
- Breadth-first search, depth-first search, and differences
between them
- Parsing as dynamic programming: What problems is one solving and
how does this differ from breadth-first and depth-first search?
- Prediction in parsing (as in chart parser with Dotted Rules)
- Difference between recognition and parsing
- Probabilistic Context-Free Grammar
- Head Words (in Syntax)
- PARSEVAL: precision, recall
- Inside cost
- Discriminative reranking
2. Methods
For (types of) methods, you should be able to describe their essential elements.
- Top-Down parsing
- Bottom-Up parsing
- Well-formed Substring Tables
3. Specific Methods
For specific methods, you should be able to work through examples
in some detail.
- Recursive descent parsing
- CKY parsing
- Active chart parsing
- Best-first search
- Beam search
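
For working through CKY examples, the following sketch of a CKY recognizer may be useful for checking a chart filled in by hand. It assumes a grammar in Chomsky Normal Form, and the toy grammar, the dictionary-based rule encoding, and the function name are all made up for this example. It only recognizes (answers yes or no); a parser would also store backpointers to recover the trees.

```python
from collections import defaultdict


def cky_recognize(words, lexical_rules, binary_rules, start="S"):
    """Return True if the CNF grammar derives the word sequence from `start`.

    lexical_rules maps a word to the set of nonterminals A with A -> word.
    binary_rules maps a pair (B, C) to the set of nonterminals A with A -> B C.
    """
    n = len(words)
    chart = defaultdict(set)  # chart[(i, j)] holds nonterminals spanning words[i:j]
    for i, word in enumerate(words):
        chart[(i, i + 1)] = set(lexical_rules.get(word, ()))
    for span in range(2, n + 1):          # shorter spans are filled before longer ones
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):     # every split point of words[i:j]
                for left in chart[(i, k)]:
                    for right in chart[(k, j)]:
                        chart[(i, j)] |= binary_rules.get((left, right), set())
    return start in chart[(0, n)]


# Hypothetical toy grammar in Chomsky Normal Form.
LEXICAL_RULES = {"the": {"Det"}, "dog": {"N"}, "cat": {"N"}, "chased": {"V"}}
BINARY_RULES = {("NP", "VP"): {"S"}, ("V", "NP"): {"VP"}, ("Det", "N"): {"NP"}}
```

Here `cky_recognize("the dog chased the cat".split(), LEXICAL_RULES, BINARY_RULES)` returns `True`, while a scrambled string such as "dog the chased" is rejected. Each chart cell is computed once and reused, which is the dynamic-programming advantage over plain depth-first or breadth-first search.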
Semantic Processing
1. Concepts
You should be able to explain each of these concepts, give
one or two examples where appropriate, and discuss its pros and cons
(again where appropriate).
- Word Senses and Word Sense Disambiguation
- Relations between word senses (synonym, hypernym, hyponym, similarity)
- WordNet
- Thematic Roles
- Question Answering
- Sentiment Analysis
- Distributional hypothesis
- Context vector
- Pointwise mutual information
- Collocations
- Similarity Metric
- Meaning representations (MR)
- First Order Logic
- Canonical Form
- Verifiability
- Compositionality
- Expressivity
- Quantifiers and quantifier scoping
- Lambda expression
- Reification of events
- Semantic attachments
2. Types of methods
For types of methods, you should be able to describe their essential elements.
- Syntax Driven Semantic Analysis
3. Specific methods
For specific methods, you should be able to hand simulate each one.
- Naive Bayes classifier for WSD
- Lambda reduction
- Compositional Semantics: parsing with semantic attachments
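
A hand simulation of the Naive Bayes classifier for WSD can be checked against a sketch like the one below. The training examples for the ambiguous word "bass", the sense labels, the use of add-alpha smoothing, and the decision to ignore context words never seen in training are all assumptions made for this illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples, alpha=1.0):
    """examples is a list of (sense, context_words) pairs."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sense, words in examples:
        sense_counts[sense] += 1
        word_counts[sense].update(words)
        vocab.update(words)
    return sense_counts, word_counts, vocab, alpha


def classify(words, model):
    """Return (best_sense, log_scores) for a bag of context words."""
    sense_counts, word_counts, vocab, alpha = model
    total = sum(sense_counts.values())
    scores = {}
    for sense, count in sense_counts.items():
        score = math.log(count / total)  # log prior P(sense)
        denom = sum(word_counts[sense].values()) + alpha * len(vocab)
        for w in words:
            if w in vocab:  # assumption: skip words unseen in training
                score += math.log((word_counts[sense][w] + alpha) / denom)
        scores[sense] = score
    return max(scores, key=scores.get), scores


# Hypothetical training contexts for the ambiguous word "bass".
EXAMPLES = [
    ("fish", ["caught", "river", "fishing"]),
    ("fish", ["lake", "fishing", "boat"]),
    ("music", ["guitar", "played", "band"]),
    ("music", ["band", "played", "loud"]),
]
MODEL = train_naive_bayes(EXAMPLES)
```

For the context `["played", "river", "band"]`, both senses have the same prior, and the smoothed likelihoods are (1/15)(2/15)(1/15) for the fish sense against (3/15)(1/15)(3/15) for the music sense, so `classify` picks "music". Working in log space avoids underflow when contexts are long.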
Discourse Processing
1. Concepts
You should be able to explain each of these concepts, give
one or two examples where appropriate, and discuss its pros and cons
(again where appropriate).
- Reference
- Coreference
- Anaphora
- Constraints on anaphor binding
- Preferences in anaphor resolution
- Cohesion in Discourse
- Lexical chains
- Coherence
- Coherence relations
- Discourse segments
- Rhetorical Structure Theory
- Discourse connectives
2. Specific methods
For specific methods, you should be able to describe their main characteristics.
- Grosz & Sidner discourse segmentation, Centering
- Hobbs algorithm
- TextTiling
Machine Translation
1. Concepts
You should be able to explain each of these concepts, give
one or two examples where appropriate, and discuss its pros and cons
(again where appropriate).
- Transfer
- Interlingua
- Vauquois Triangle
2. Type of method