DME – Student Presentations
We will have paper presentations in the second half of the course. You
will need to do two things (see detailed instructions below):
- Give a presentation on a paper (2/3 of the presentation grade)
- Write a short summary for two presentations per session (1/3 of the presentation grade)
Timetable
| Date  | Slot | UUN               | Student             | Paper |
|-------|------|-------------------|---------------------|-------|
| 02/03 | 1    | s1671956 s1679937 | Angelos, Eirini     | Data Mining for Internet of Things: A Survey |
| 02/03 | 2    | s1686429 s1352385 | Richard, George M.  | Independent Component Analysis: Algorithms and Applications |
| 02/03 | 3    | s1046993 s1676828 | Josh, Martynas      | Random Search for Hyper-Parameter Optimization |
| 02/03 | 4    | s1110577 s1670546 | Bartek, Santiago    | Private traits and attributes are predictable from digital records of human behavior |
| 09/03 | 1    | s1667278 s1675946 | Chris, Elias        | Reducing the Dimensionality of Data with Neural Networks |
| 09/03 | 2    | s1667813 s1669167 | Andreas, Stavros    | Isolation Forest (for anomaly detection) |
| 09/03 | 3    | s1212549 s1672197 | Patrick, Giannis    | Meme-tracking and the dynamics of the news cycle |
| 16/03 | 1    | s1676895 s1687131 | Cristina, Alex      | Visualizing Data using t-SNE |
| 16/03 | 2    | s1652217 s1604115 | Abhinav, Ankush     | Probabilistic Principal Component Analysis |
| 16/03 | 3    | s1606815 s1311707 | Peter, Manni        | On a Connection between Kernel PCA and Metric Multidimensional Scaling |
| 16/03 | 4    | s1142146          | Aleksander          | Theory-guided Data Science: A New Paradigm for Scientific Discovery |
| 23/03 | 1    | s1680879 s1671145 | Manolis, Lefteris   | "Why Should I Trust You?" Explaining the Predictions of Any Classifier |
| 23/03 | 2    | s1672054 s1685264 | Dimitris, Angeliki  | BLEU: a method for automatic evaluation of machine translation |
| 30/03 | 1    | s1687053 s1687568 | Christos, George P. | Matrix Factorization Techniques for Recommender Systems |
| 30/03 | 2    | s1631755 s1666415 | Christine, Aisling  | Recommender Systems: Missing Data and Statistical Model Estimation |
| 30/03 | 3    | s1670236 s1682581 | Wei Guang, Siyuan   | Practical Bayesian Optimization of Machine Learning Algorithms |
| 30/03 | 4    | s1687620 s1686308 | Lasse, Antonio      | The Self-Organizing Map |
Instructions
Presentations
The papers are presented by groups of two students. Each group should
email the TA the following information by Friday 10 February:
- Names and student numbers
- 3 papers in decreasing order of preference
- Preferred date of the presentation
Please note that we cannot guarantee to accommodate everyone's preferences. You can use Piazza to find teammates. If you would like to present alone rather than in a group, please check with the TA; individual presentations are only possible if time slots are available.
- If the presentation is given by two students, both have to contribute equally to the presentation. Both presenters will receive the same grade.
- The presentations last 20 minutes. This will be strictly enforced.
- After each presentation we will have 5 minutes of questions.
- We will typically have 2 presentations per 50 min lecture slot.
- You should prepare slides and send them as a PDF to the lecturer on the day before your presentation. If you prefer another file format, please verify after one of the earlier sessions that your slides project correctly.
Research papers typically make a scientific contribution, which means that they propose or claim something that holds and that matters. The overall goal of the presentation is to convey the contribution made in the paper. For that purpose, the presentations should cover:
- Very briefly, what is the paper generally about?
- Background and/or brief recap of the relevant material from the lecture
- What is proposed or claimed in the paper?
- What supporting evidence is provided?
- Why does the proposal/claim matter?
- Only include as much mathematics as is needed to convey the key message of the paper.
- Feel free to use diagrams and equations from the paper in your slides (with proper acknowledgement).
Summary
- The summary should be structured according to the five highlighted points above.
- The summary of each paper should be at most half a page in total. This means only one to two sentences per point. Good diagrams will be helpful.
- Handwritten and scanned documents are ok if legible.
- Please email the summaries of two papers per session to the lecturer by noon on the Tuesday of the week following the presentation.
Papers
Please feel free to propose papers yourself. Check with the lecturer about suitability.
PCA and its extensions
- Probabilistic Principal Component Analysis
  M. Tipping and C. Bishop
  Journal of the Royal Statistical Society, Series B, 1999
- Independent Component Analysis: Algorithms and Applications
  A. Hyvarinen
  Neural Networks, 2000
- Robust Principal Component Analysis?
  E. Candes, X. Li, Y. Ma, and J. Wright
  Journal of the ACM, 2009
Dimensionality reduction and data visualisation
- Reducing the Dimensionality of Data with Neural Networks
  G. Hinton and R. Salakhutdinov
  Science, 2006
- On a Connection between Kernel PCA and Metric Multidimensional Scaling
  C. Williams
  NIPS, 2001
- Hubs in Space: Popular Nearest Neighbors in High-Dimensional Data
  M. Radovanovic et al.
  Journal of Machine Learning Research, 2010
- Nonlinear Dimensionality Reduction by Locally Linear Embedding (longer version)
  L. Saul and S. Roweis
  Science, 2000
- Visualizing Data using t-SNE
  L. van der Maaten and G. Hinton
  Journal of Machine Learning Research, 2008
- The Self-Organizing Map
  T. Kohonen
  Neurocomputing, 1998
Performance evaluation, hyperparameter selection
- Image Quality Assessment: From Error Visibility to Structural Similarity
  Z. Wang, A. Bovik, et al.
  IEEE Transactions on Image Processing, 2004
- BLEU: a Method for Automatic Evaluation of Machine Translation
  K. Papineni, S. Roukos, et al.
  Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), 2002
- Random Search for Hyper-Parameter Optimization
  J. Bergstra and Y. Bengio
  Journal of Machine Learning Research, 2012
- Practical Bayesian Optimization of Machine Learning Algorithms
  J. Snoek, H. Larochelle, and R. Adams
  NIPS, 2012
- "Why Should I Trust You?" Explaining the Predictions of Any Classifier
  M.T. Ribeiro, S. Singh, and C. Guestrin
  Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016
Missing data, outliers, and anomaly detection
- Isolation Forest (for anomaly detection)
  Liu et al.
  Eighth IEEE International Conference on Data Mining, 2008
- Removing Electroencephalographic Artifacts: Comparison between ICA and PCA
  T.P. Jung et al.
  Proceedings of the 1998 IEEE Signal Processing Society Workshop, 1998
- Recommender Systems: Missing Data and Statistical Model Estimation
  B. Marlin et al.
  Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence (IJCAI), 2011
- LOF: Identifying Density-Based Local Outliers
  M. Breunig et al.
  Proceedings of the ACM SIGMOD International Conference on Management of Data, 2000
- Support Vector Data Description
  D. Tax and R. Duin
  Machine Learning, 2004
- Anomaly Detection: A Survey
  V. Chandola, A. Banerjee, and V. Kumar
  ACM Computing Surveys (CSUR), 2009
Miscellaneous
- Private traits and attributes are predictable from digital records of human behavior
  M. Kosinski et al.
  Proceedings of the National Academy of Sciences, 2013
- Learning Fair Representations
  R. Zemel et al.
  Proceedings of the 30th International Conference on Machine Learning, 2013
- Data Mining for Internet of Things: A Survey
  C.-W. Tsai, C.-F. Lai, and M.-C. Chiang
  IEEE Communications Surveys & Tutorials, 16(1), 2014
- Hidden Technical Debt in Machine Learning Systems
  D. Sculley et al.
  Advances in Neural Information Processing Systems 28, 2015
- Meme-tracking and the dynamics of the news cycle
  J. Leskovec, L. Backstrom, and J. Kleinberg
  Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2009
- Theory-guided Data Science: A New Paradigm for Scientific Discovery
  A. Karpatne et al.
  (paper under review)
- Matrix Factorization Techniques for Recommender Systems
  Y. Koren, R. Bell, and C. Volinsky
  Computer, 42(8), 2009