Papers for Presentation
Here is a list of suggested papers for the paper presentation.
You may choose a paper not on the list, but you must consult with
the lecturer or the TA
over email about it.
The papers labeled below as background reading are considered
unsuitable for presentation, but could give you some more insight
into the problem domain of the other papers.
Automated Recommender Systems
- Roth, Maayan, et al,
Suggesting friends using the implicit social graph, in Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
(New York, NY, USA: ACM, 2010), 233-242.
- Yehuda Koren,
Factorization meets the neighborhood: a multifaceted collaborative
filtering model, in Proceedings of the 14th ACM SIGKDD
international conference on Knowledge discovery and data mining
(Las Vegas, Nevada, USA: ACM, 2008), 426-434.
- Robert Bell, Yehuda Koren, and Chris Volinsky,
Modeling relationships at multiple scales to improve accuracy of
large recommender systems, in Proceedings of the 13th ACM
SIGKDD international conference on Knowledge discovery and data
mining (San Jose, California, USA: ACM, 2007), 95-104. Bell and
Koren are two of the authors of the prize-winning Netflix
system.
- Ruslan Salakhutdinov, Andriy Mnih, and Geoffrey Hinton,
Restricted Boltzmann machines for collaborative filtering, in
International Conference on Machine Learning (ICML), 2007, 791-798.
A bit hard, but this is another one of the key technologies behind the Netflix prize submissions.
-
Platt et al., Learning a Gaussian Process Prior for Automatically Generating Music Playlists. Advances in Neural Information
Processing Systems (NIPS). 2002.
-
Kai Yu, Anton Schwaighofer, Volker Tresp, Wei-Ying Ma and HongJiang Zhang, Collaborative Ensemble Learning: Combining Collaborative and Content-Based Information Filtering via Hierarchical Bayes in UAI 2003.
-
Minqing Hu and Bing Liu, Mining and Summarizing Customer Reviews
in KDD 2004
Also, anyone interested in a presentation on Recommender Systems from a historical perspective could check these:
Document Clustering, Classification and Analysis
- Rosen-Zvi, Griffiths, Steyvers, and Smyth, The Author-Topic Model for Authors and Documents. Conference on Uncertainty in Artificial Intelligence (UAI). 2004. An early and instructive example
of how latent Dirichlet allocation can be customized to incorporate extra information.
- David M. Blei and John D. Lafferty,
Dynamic topic models, in Proceedings of the 23rd International
Conference on Machine learning (Pittsburgh, Pennsylvania: ACM,
2006), 113-120.
- Dafna Shahaf and Carlos Guestrin, Connecting the Dots Between News Articles. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) 2010.
- Xiaohua Hu et al.,
Exploiting Wikipedia as external knowledge for document
clustering, in Proceedings of the 15th ACM SIGKDD international
conference on Knowledge discovery and data mining (Paris, France:
ACM, 2009), 389-396.
- Jun Zhu et al.,
Simultaneous record detection and attribute labeling in web data
extraction, in Proceedings of the 12th ACM SIGKDD international
conference on Knowledge discovery and data mining (Philadelphia,
PA, USA: ACM, 2006), 494-503.
Making and combining classifiers
- Rajat Raina et al.,
Self-taught learning: transfer learning from unlabeled data, in
Proceedings of the 24th International Conference on Machine
learning (Corvallis, Oregon: ACM, 2007), 759-766.
- Yoshua Bengio et al.,
Curriculum learning, in Proceedings of the 26th Annual
International Conference on Machine Learning (Montreal, Quebec,
Canada: ACM, 2009), 41-48.
- Nicolò Cesa-Bianchi, Claudio Gentile, and Luca
Zaniboni,
Hierarchical classification: combining Bayes with SVM, in
Proceedings of the 23rd international conference on Machine
learning (Pittsburgh, Pennsylvania: ACM, 2006), 177-184.
Pattern Discovery
- Fosca Giannotti et al.,
Trajectory pattern mining, in Proceedings of the 13th ACM
SIGKDD international conference on Knowledge discovery and data
mining (San Jose, California, USA: ACM, 2007), 330-339.
Social networks
- Bee-Chung Chen et al.,
User reputation in a comment rating environment, in Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining (New York, NY, USA: ACM, 2011), 159-167.
- Lars Backstrom et al.,
Group formation in large social networks: membership, growth, and
evolution, in Proceedings of the 12th ACM SIGKDD international
conference on Knowledge discovery and data mining (Philadelphia, PA, USA: ACM, 2006), 44-54.
- Jure Leskovec et al.,
Microscopic evolution of social networks, in Proceedings of the
14th ACM SIGKDD international conference on Knowledge discovery and
data mining (Las Vegas, Nevada, USA: ACM, 2008), 462-470.
- Shen et al. Latent Friend Mining from Blog Data. International Conference on Data Mining. 2006.
- Tan, C., Lee, L., Tang, J., Jiang, L., Zhou, M. and Li, P. User-Level Sentiment Analysis
Including Social Networks. KDD 2011.
Web mining
- El-Arini, Khalid and Guestrin, Carlos, Beyond keyword search: discovering relevant scientific literature, in Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining (New York, NY, USA: ACM, 2011), 439-447.
- Leskovec, Backstrom, and Kleinberg, Meme-tracking and the dynamics of the news cycle, KDD 2009. (A data mining approach to tracking memes in blogs and mainstream media.)
- Huanhuan Cao et al.,
Context-aware query suggestion by mining click-through and session
data, in Proceedings of the 14th ACM SIGKDD international
conference on Knowledge discovery and data mining (Las Vegas,
Nevada, USA: ACM, 2008), 875-883.
- Ricardo Baeza-Yates and Alessandro Tiberi,
Extracting semantic relations from query logs, in Proceedings
of the 13th ACM SIGKDD international conference on Knowledge
discovery and data mining (San Jose, California, USA: ACM, 2007),
76-85.
- Filip Radlinski, Robert Kleinberg, and Thorsten Joachims,
Learning diverse rankings with multi-armed bandits, in
Proceedings of the 25th international conference on Machine
learning (Helsinki, Finland: ACM, 2008), 784-791.
Background Reading:
Natural Language Processing
- Frustratingly Easy Domain Adaptation by Hal Daumé III. ACL 2007.
- Shallow Parsing
with Conditional Random Fields. Fei Sha and Fernando Pereira.
Proceedings of Human Language Technology-NAACL 2003
- A
Probabilistic Framework for Semi-Supervised Clustering by
Sugato Basu, Mikhail Bilenko and Raymond J. Mooney. In KDD 2004.
Web Mining (and other text analysis)
- Snowsill, Tristan Mark, et al.,
Refining causality: who copied from whom? in Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining (San Diego, California, USA: ACM, 2011), 466-474
- D. Sculley,
Combined regression and ranking, in Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
(New York, NY, USA: ACM, 2010), 979-988.
- Tracking
evolving communities in large linked networks by John Hopcroft
and Omar Khan and Brian Kulis and Bart Selman. In PNAS 101 suppl. 1, 2004.
- Justin Ma et al., Beyond
blacklists: learning to detect malicious web sites from suspicious
URLs, in Proceedings of the 15th ACM SIGKDD international
conference on Knowledge discovery and data mining (Paris, France:
ACM, 2009).
Bioinformatics
- A
Hierarchical Bayesian Markovian Model for Motifs in Biopolymer
Sequences by E.P. Xing, M.I. Jordan, R.M. Karp and S. Russell.
In Advances in Neural Information Processing Systems 15 (NIPS 2002). A longer version
is available from Eric's
home page.
Computer Vision
- Semi-Supervised
Learning in Gigantic Image Collections (2009) Rob Fergus, Yair Weiss,
Antonio Torralba
- Segmenting
Scenes by Matching Image Composites (2009) Bryan Russell, Alyosha
Efros, Josef Sivic, Bill Freeman, Andrew Zisserman
- Beyond
Categories: The Visual Memex Model for Reasoning About Object
Relationships (2009) Tomasz Malisiewicz, Alyosha Efros
Image Retrieval by Content, Image Analysis
-
Blobworld: Image segmentation using Expectation-Maximization and its
application to image querying by Chad Carson, Serge Belongie,
Hayit Greenspan, and Jitendra Malik. In: IEEE Transactions on
Pattern Analysis and Machine Intelligence, 24(8):1026-1038, August 2002.
-
Object Recognition with Informative Features and Linear Classification
by Michel Vidal-Naquet and Shimon Ullman.
In ICCV 2003.
-
Matching Words and Pictures
by Kobus Barnard, Pinar Duygulu, Nando de Freitas, David Forsyth, David Blei, and Michael I. Jordan.
In Journal of Machine Learning Research, 3, 2003.
-
Robust Real-Time Face Detection
by Paul Viola and Michael J. Jones.
In International Journal of Computer Vision 57(2), 2004.
Background Reading:
- Query
by Image and Video Content: The QBIC System (1995) by
M. Flickner, H. S. Sawhney, J. Ashley, Q. Huang, B. Dom, M. Gorkani,
J. Hafner, D. Lee, D. Petkovic, D. Steele and Peter Yanker. In: IEEE Computer,
Vol. 28, No. 9, pp. 23-32.
Other Applications of Probabilistic Models
- D. Sculley,
Combined regression and ranking, in Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
(New York, NY, USA: ACM, 2010), 979-988.
- Ralf Herbrich, Tom Minka, and Thore Graepel, TrueSkill: A Bayesian Skill Rating System, NIPS 2006.
- Localizing
Bugs in Program Executions with Graphical Models. Laura Dietz,
Valentin Dallmeier, Andreas Zeller, Tobias Scheffer NIPS 2009
- Probabilistic analysis of a large-scale urban traffic data set.
J. Hutchins, A. Ihler, and P. Smyth.
Second International Workshop on Knowledge Discovery from Sensor Data (ACM SIGKDD Conference, KDD-08), August 2008.
- HiLighter:
Automatically Building Robust Signatures of Performance Behavior
for Small- and Large-Scale Systems, Peter Bodik, Moises
Goldszmidt, Armando Fox. Third Workshop on Tackling Computer
Systems Problems with Machine Learning (SysML '08), San Diego,
December 2008
- Thore Graepel, Joaquin Quinonero Candela, Thomas Borchert, and Ralf Herbrich. Web-scale Bayesian click-through rate prediction for sponsored search advertising in Microsoft's Bing search engine. ICML 2010
-
Learning to detect events with Markov-modulated Poisson
processes. Alexander Ihler, Jon Hutchins, Padhraic Smyth; ACM
Transactions on Knowledge Discovery from Data, Vol 1 Issue 3, Dec.
2007.
Other Applications
- Mitchell et al. Learning
to Decode Cognitive States from Brain Images. Machine Learning Journal, 2004.
- Schmidt, E.M. and Kim, Y.E. Prediction of Time-Varying Musical
Mood Distributions Using Kalman Filtering, International Conference on Machine Learning and Applications,
2010
Ideas for Self-Proposed Papers
Inevitably, there are important application areas of machine learning that are not
as well represented on this list as they could be. If you are interested in these
areas, feel free to propose your own paper. To find a list of papers from a conference,
do a Google search like "CVPR 2010" and look for a link that says "Program", "Proceedings",
"Accepted papers", or something similar. Some ideas are:
- Computer Vision: Check recent proceedings of CVPR, ICCV
- Natural Language Processing: Main conferences include ACL, NAACL, and EMNLP
- Bioinformatics
This page was written by Frederick Ducatelle and has been updated and
maintained by Charles Sutton, Amos
Storkey and Stefanos Angelidis