TOPICS IN NATURAL LANGUAGE PROCESSING
Lecturer: Shay Cohen
Time and Location: Tuesday and Friday, 17:10-18:00, H.R.B Lecture Theatre, Robson Building (semester 2)
Office Hours: Tuesday, 14:30-15:30. Appointments can also be arranged.
Potential Topics for Fulfilling Course Requirements:
(See also advice about fulfilling the requirements here.)
Unsupervised learning:
- Combining Distributional and Morphological Information for Part of Speech Induction, EACL (2003). Link
- Natural Language Grammar Induction using a Constituent-Context Model, NIPS (2001). Link
- Unsupervised Induction of Semantic Roles, NAACL (2010). Link
Bayesian learning:
- Combining Multiple Information Types in Bayesian Word Segmentation, NAACL (2013). Link
- Bayesian Inference for PCFGs via Markov Chain Monte Carlo, NAACL (2007). Link
- Discovering Morphological Paradigms from Plain Text Using a Dirichlet Process Mixture Model, EMNLP (2011). Link
Topic modelling:
- Probabilistic Latent Semantic Analysis, UAI (1999). Link
- Probabilistic Topic Models (2007). Link
- Reading Tea Leaves: How Humans Interpret Topic Models, NIPS (2009). Link
Semantics:
- Automatic Labeling of Semantic Roles, CL (2002). Link
- A Generative Model for Parsing Natural Language to Meaning Representations, EMNLP (2008). Link
- Dependency-based Semantic Role Labeling of PropBank, EMNLP (2008). Link
Word embeddings:
- Word Representations: A Simple and General Method for Semi-supervised Learning, ACL (2010). Link
- Linguistic Regularities in Continuous Space Word Representations, NAACL (2013). Link
- Learning word embeddings efficiently with noise-contrastive estimation, NIPS (2013). Link
Neural networks:
- Natural Language Processing (Almost) from Scratch, JMLR (2011). Link
- Learning Continuous Phrase Representations and Syntactic Parsing with Recursive Neural Networks, NIPS (2010). Link
- Better Word Representations with Recursive Neural Networks for Morphology, CoNLL (2013). Link
Language modelling:
- KenLM: Faster and Smaller Language Model Queries, WSMT (2011). Link
- A Neural Probabilistic Language Model, JMLR (2003). Link
- Recurrent Neural Network Based Language Model, INTERSPEECH (2010). Link
Dependency parsing:
- Non-projective Dependency Parsing Using Spanning Tree Algorithms, EMNLP (2005). Link
- Dependency Parsing of Turkish, CL (2008). Link
- An Efficient Algorithm for Projective Dependency Parsing, IWPT (2003). Link
There is also a book, "Dependency Parsing", that gives more information about the state of the art in dependency parsing. See the note below about the Morgan and Claypool books.
Word sense disambiguation:
- Unsupervised Word Sense Disambiguation Rivaling Supervised Methods, ACL (1995). Link
- An Experimental Study of Graph Connectivity for Unsupervised Word Sense Disambiguation, PAMI (2010). Link
- Word-Sense Disambiguation for Machine Translation, EMNLP (2005). Link
Discourse:
- The Penn Discourse Treebank 2.0, LREC (2008). Link
- Automatic Identification of General and Specific Sentences by Leveraging Discourse Annotations, IJCAI (2011). Link
- Minimally Supervised Event Causality Identification, EMNLP (2011). Link
Linear-algebraic and spectral learning:
- Spectral Learning for Non-deterministic Dependency Parsing, EACL (2012). Link
- Spectral Learning for Latent-Variable PCFGs, JMLR (2014). Link
- Two Step CCA: A New Spectral Method for Estimating Vector Models of Words, ICML (2012). Link
Morphology:
- Applying Conditional Random Fields to Japanese Morphological Analysis, EMNLP (2004). Link
- Computational Approaches to Morphology and Syntax (2008). Link to review (This is a book.)
- Unsupervised Learning of the Morphology of Natural Language, CL (2001). Link
Meaning representations:
- Abstract Meaning Representation for Sembanking, LAWID (2013). Link
- Not an Interlingua, but Close: Comparison of English AMRs to Chinese and Czech, LREC (2014). Link
- Frame-Semantic Parsing, CL (2012). Link (long paper, look inside to find shorter conference versions)
- Universal Conceptual Cognitive Annotation, ACL (2013). Link
Learning of formal languages:
- Learning Regular Sets from Queries and Counterexamples, JIC (1987). Link
- Inductive Inference, DFAs and Computational Complexity, AII LNCS (1989). Link
- Planar Languages and Learnability, ICGI (2006). Link
Linguistics / corpus linguistics: Read this first. If you would rather present papers that approach language from the angle of corpus linguistics, we can find appropriate papers together.
In addition, take a look at the Morgan and Claypool Synthesis Lectures on Human Language Technologies. They can be a valuable source of papers to survey or of topics to begin with. Here is the current collection: here. If you have trouble accessing a particular book, please let me know.
Other potential topics: finite state transducers and automata, machine translation (sub-topic), question answering, language-specific papers (such as papers covering NLP for Russian), constituency parsing, language acquisition, phonology, Bayesian inference, grammar formalisms, summarisation, text compression, natural language generation. Search for these topics online to see whether any of them interest you and which papers are available for them.