Papers for Presentation

Below is a list of suggested papers for the paper presentation. You may choose a paper that is not on the list, but you must first consult with me about it by email. If you would like to know more about a topic that has no relevant paper here, email me as well, and I'll try to help you find an appropriate one.


Automated Recommender Systems

  1. Yehuda Koren, Factorization meets the neighborhood: a multifaceted collaborative filtering model, in Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining (Las Vegas, Nevada, USA: ACM, 2008), 426-434.
  2. Robert Bell, Yehuda Koren, and Chris Volinsky, Modeling relationships at multiple scales to improve accuracy of large recommender systems, in Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining (San Jose, California, USA: ACM, 2007), 95-104. Bell and Koren are two of the authors of the prize-winning Netflix system.
  3. Ruslan Salakhutdinov, Andriy Mnih, and Geoffrey Hinton, Restricted Boltzmann machines for collaborative filtering, in Proceedings of the 24th international conference on Machine learning (Corvallis, Oregon: ACM, 2007), 791-798.
  4. Nikolay Archak, Anindya Ghose, and Panagiotis G. Ipeirotis, Show me the money!: deriving the pricing power of product features by mining consumer reviews, in Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining (San Jose, California, USA: ACM, 2007), 56-65.

Document Clustering, Classification and Analysis

  1. David M. Blei and John D. Lafferty, Dynamic topic models, in Proceedings of the 23rd International Conference on Machine learning (Pittsburgh, Pennsylvania: ACM, 2006), 113-120.
  2. Xiaohua Hu et al., Exploiting Wikipedia as external knowledge for document clustering, in Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining (Paris, France: ACM, 2009), 389-396.
  3. Prem Melville, Wojciech Gryc, and Richard D. Lawrence, Sentiment analysis of blogs by combining lexical knowledge with text classification, in Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining (Paris, France: ACM, 2009), 1275-1284.
  4. Jun Zhu et al., Simultaneous record detection and attribute labeling in web data extraction, in Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining (Philadelphia, PA, USA: ACM, 2006), 494-503.
  5. Xuerui Wang and Andrew McCallum, Topics over time: a non-Markov continuous-time model of topical trends, in Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining (Philadelphia, PA, USA: ACM, 2006), 424-433.
  6. Quanquan Gu and Jie Zhou, Co-clustering on manifolds, in Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining (Paris, France: ACM, 2009), 359-368.

Making and combining classifiers

  1. Mark Dredze, Koby Crammer, and Fernando Pereira, Confidence-weighted linear classification, in Proceedings of the 25th international conference on Machine learning (Helsinki, Finland: ACM, 2008), 264-271.
  2. Xiaopeng Xi et al., Fast time series classification using numerosity reduction, in Proceedings of the 23rd international conference on Machine learning (Pittsburgh, Pennsylvania: ACM, 2006), 1033-1040.
  3. Nicolò Cesa-Bianchi, Claudio Gentile, and Luca Zaniboni, Hierarchical classification: combining Bayes with SVM, in Proceedings of the 23rd international conference on Machine learning (Pittsburgh, Pennsylvania: ACM, 2006), 177-184.
  4. Rajat Raina et al., Self-taught learning: transfer learning from unlabeled data, in Proceedings of the 24th international conference on Machine learning (Corvallis, Oregon: ACM, 2007), 759-766.
  5. Xiao Ling et al., Spectral domain-transfer learning, in Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining (Las Vegas, Nevada, USA: ACM, 2008), 488-496.
  6. Thorsten Joachims, Training linear SVMs in linear time, in Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining (Philadelphia, PA, USA: ACM, 2006), 217-226.
  7. Yonatan Amit et al., Uncovering shared structures in multiclass classification, in Proceedings of the 24th international conference on Machine learning (Corvallis, Oregon: ACM, 2007), 17-24.

Miscellaneous

  1. Yoshua Bengio et al., Curriculum learning, in Proceedings of the 26th Annual International Conference on Machine Learning (Montreal, Quebec, Canada: ACM, 2009), 41-48.
  2. Charu C. Aggarwal et al., Frequent pattern mining with uncertain data, in Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining (Paris, France: ACM, 2009), 29-38.
  3. Christos Boutsidis, Michael W. Mahoney, and Petros Drineas, Unsupervised feature selection for principal components analysis, in Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining (Las Vegas, Nevada, USA: ACM, 2008), 61-69.
  4. Justin Ma et al., Beyond blacklists: learning to detect malicious web sites from suspicious URLs, in Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining (Paris, France: ACM, 2009).

Pattern Discovery

  1. Fosca Giannotti et al., Trajectory pattern mining, in Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining (San Jose, California, USA: ACM, 2007), 330-339.
  2. Detecting Group Differences: Mining Contrast Sets (2001) by S. D. Bay and M. J. Pazzani. In: Data Mining and Knowledge Discovery.

Scalable algorithms

  1. Rajat Raina, Anand Madhavan, and Andrew Y. Ng, Large-scale deep unsupervised learning using graphics processors, in Proceedings of the 26th Annual International Conference on Machine Learning (Montreal, Quebec, Canada: ACM, 2009), 873-880, http://portal.acm.org/citation.cfm?id=1553374.1553486&coll=GUIDE&dl=GUIDE&CFID=68267393&CFTOKEN=24515362.

Social networks

  1. Lars Backstrom et al., Group formation in large social networks: membership, growth, and evolution, in Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining (Philadelphia, PA, USA: ACM, 2006), 44-54.
  2. Jure Leskovec et al., Microscopic evolution of social networks, in Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining (Las Vegas, Nevada, USA: ACM, 2008), 462-470.
  3. Yun Chi et al., Structural and temporal analysis of the blogosphere through community factorization, in Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining (San Jose, California, USA: ACM, 2007), 163-172.
  4. Mary McGlohon, Leman Akoglu, and Christos Faloutsos, Weighted graphs and disconnected components: patterns and a generator, in Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining (Las Vegas, Nevada, USA: ACM, 2008), 524-532.

Web mining

  1. Huanhuan Cao et al., Context-aware query suggestion by mining click-through and session data, in Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining (Las Vegas, Nevada, USA: ACM, 2008), 875-883.
  2. Ricardo Baeza-Yates and Alessandro Tiberi, Extracting semantic relations from query logs, in Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining (San Jose, California, USA: ACM, 2007), 76-85.
  3. Fei Wu, Raphael Hoffmann, and Daniel S. Weld, Information extraction from Wikipedia: moving down the long tail, in Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining (Las Vegas, Nevada, USA: ACM, 2008), 731-739.
  4. Filip Radlinski, Robert Kleinberg, and Thorsten Joachims, Learning diverse rankings with multi-armed bandits, in Proceedings of the 25th international conference on Machine learning (Helsinki, Finland: ACM, 2008), 784-791.

Document Clustering, Classification and Analysis

  1. Text Classification from Labeled and Unlabeled Documents using EM (1999) by K. Nigam, A. K. McCallum, S. Thrun and T. Mitchell. In: Machine Learning, Volume 39, Issue 2/3, pp. 103-134.
  2. Shallow Parsing with Conditional Random Fields by Fei Sha and Fernando Pereira. In Proceedings of Human Language Technology-NAACL 2003.
  3. A Probabilistic Framework for Semi-Supervised Clustering by Sugato Basu, Mikhail Bilenko and Raymond J. Mooney. In KDD 2004.

Web Mining (and other text analysis)

  1. Authoritative sources in a Hyperlinked Environment (1998) by J. Kleinberg. In: Journal of the ACM, Volume 46. This is a classic paper. Additional analysis of its stability can be found in Link analysis, eigenvectors, and stability (2001) by A. Y. Ng, A. X. Zheng, and M. I. Jordan. In International Joint Conference on Artificial Intelligence (IJCAI).
  2. Tracking evolving communities in large linked networks by John Hopcroft and Omar Khan and Brian Kulis and Bart Selman. In PNAS 101 suppl. 1, 2004.
  3. The PageRank Citation Ranking: Bringing Order to the Web (1998) by L. Page, S. Brin, R. Motwani and T. Winograd. Technical Report, Stanford University.

Marketing Applications

  1. Mining the Network Value of Customers (2001) by P. Domingos and M. Richardson. In: Proceedings of KDD-2001: the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 57-66.
  2. Maximizing the Spread of Influence through a Social Network by David Kempe and Jon Kleinberg and Eva Tardos. In KDD 2003

Bioinformatics

  1. A Hierarchical Bayesian Markovian Model for Motifs in Biopolymer Sequences by E.P. Xing, M.I. Jordan, R.M. Karp and S. Russell. In Advances in Neural Information Processing Systems 15 (NIPS 2002). A longer version is available from Eric's home page.

Machine Vision

  1. Semi-Supervised Learning in Gigantic Image Collections. Rob Fergus, Yair Weiss, Antonio Torralba.
  2. Segmenting Scenes by Matching Image Composites. Bryan Russell, Alyosha Efros, Josef Sivic, Bill Freeman, Andrew Zisserman.
  3. Beyond Categories: The Visual Memex Model for Reasoning About Object Relationships. Tomasz Malisiewicz, Alyosha Efros.

Other Applications

  1. Localizing Bugs in Program Executions with Graphical Models. Laura Dietz, Valentin Dallmeier, Andreas Zeller, Tobias Scheffer. NIPS 2009.
  2. HiLighter: Automatically Building Robust Signatures of Performance Behavior for Small- and Large-Scale Systems. Peter Bodik, Moises Goldszmidt, Armando Fox. Third Workshop on Tackling Computer Systems Problems with Machine Learning (SysML '08), San Diego, December 2008.
  3. Learning to detect events with Markov-modulated Poisson processes. Alexander Ihler, Jon Hutchins, Padhraic Smyth; ACM Transactions on Knowledge Discovery from Data, Vol 1 Issue 3, Dec. 2007.
  4. Adaptive Fraud Detection by Tom Fawcett, Foster Provost. In Data Mining and Knowledge Discovery, 1(3) 1-28 (1997).
  5. Multiple regimes in northern hemisphere height fields via mixture model clustering by P. Smyth, K. Ide and M. Ghil. In Journal of the Atmospheric Sciences, 56(21), 3704-3723.
  6. Discovery of Climate Indices using Clustering by Michael Steinbach and Steven Klooster and Christopher Potter. In KDD 2003.

This page was written by Frederick Ducatelle and is updated and maintained by Charles Sutton

