Project submission instructions

Final submission deadline: November 25, 4pm.

The submission should include:

  1. A report, 4 pages max, optionally with an appendix that will not be marked.
  2. The code you have implemented, along with scripts that run the code and plot the results.
  3. Examples of running the different experiments to get the plots. This should be a pdf that shows the commands and the corresponding plots for a small dataset that can be processed easily. If you are using an ipython notebook, simply printing the notebook after creating all the plots should do the trick.
  4. (Optionally) The ipython notebook if you are using ipython.
  5. The dataset. The actual data may be too large to submit, in which case put your data in a publicly readable folder and make sure your program reads from there.

More details on these elements are given below.

How to submit.

On the DICE system, put your files in a folder called

stn_assignment_<your uun>

The report, the pdf showing examples, etc. should be in the top-level folder, while the program files, scripts, etc. can be in a subfolder.

The submission command is:

submit stn 1 stn_assignment_<your uun>

This folder should not contain large files like large datasets.
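
One way to check this before running the submission command is a short script like the sketch below. The 5 MB limit is only an assumption for illustration, not an official threshold, and the function name large_files is made up.

```python
import os


def large_files(root, limit_bytes=5 * 1024 * 1024):
    """Return (path, size) for every file under root larger than limit_bytes.

    The default limit of 5 MB is an assumed value, not an official rule.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symbolic links, which may point outside the folder
            size = os.path.getsize(path)
            if size > limit_bytes:
                found.append((path, size))
    # Largest files first, so the worst offenders are easy to spot.
    return sorted(found, key=lambda item: -item[1])
```

Calling large_files on your submission folder and getting an empty list back suggests that nothing in it exceeds the chosen limit.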

Parts of the submission

Report

The report should be a 3-4 page pdf document. Shorter and more concise is better! Make sure to describe everything in your own words. Remember to include the title of your project and to mention your team. The report should include:

  1. Problem statement. Clearly define the problem you were studying. Explain the problem and its importance. Include a mathematical formulation if possible. Keep in mind that the description should be readable by anyone without specialized knowledge.

  2. Related work. Mention any relevant papers and state what problem they are solving and their general approach. This should be really short for each paper, just a couple of sentences. You do not need to explain their work in detail. Explain how your work relates to these. Is it better in some way? Does it solve a different problem from all of them? Why is your work important compared to these?

  3. Your solution. Describe in detail your idea for solving the problem. Show why it is different from previous work where possible. Explain why the problem is challenging, explain what your ideas are and justify them. Time may not permit you to do everything, in which case explain what you are presenting here and what can be done later.

  4. Results. Describe your results: algorithms, proofs, general arguments and analysis. Give plots and tables that show your results, and discuss them. You need to explain the results and their importance to us. We were not there during the work, so we cannot understand things that you do not explain. Discuss what else can be done on this topic.

Appendix. You can attach an appendix of up to 3 pages with any additional plots, results and proofs that might be important. This can include results from your teammate that may be relevant to your discussions. You will not be marked on the appendix; it is reference material that may help you explain your points better. You can cite material in the appendix from your main report, but keep in mind that we may not read it, so the main report has to be understandable on its own.

The code

The code should be readable and well commented. A generally good way to structure it is to write code for the important elements (classes, functions, etc.) and then to call them from different scripts, passing suitable arguments such as the data file and parameter values. An ipython notebook may be good for this.
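
As an illustration of this structure, here is a minimal sketch in Python. Everything in it is hypothetical and not part of the course materials: the edge-list format, the function names (load_edges, degree_counts, top_nodes), the built-in sample and the --data and --top-k options are made up for the example. The point is only that the logic lives in functions, and the script passes the data file and parameter values to them.

```python
import argparse
import csv

# Tiny made-up sample so the script also runs without a data file.
SAMPLE_EDGES = [("a", "b"), ("a", "c"), ("b", "c"), ("a", "d")]


def load_edges(path):
    """Read an edge list with one 'source,target' pair per line (assumed format)."""
    with open(path, newline="") as f:
        return [(row[0], row[1]) for row in csv.reader(f) if len(row) >= 2]


def degree_counts(edges):
    """Count how many edges touch each node."""
    counts = {}
    for u, v in edges:
        counts[u] = counts.get(u, 0) + 1
        counts[v] = counts.get(v, 0) + 1
    return counts


def top_nodes(counts, k):
    """Return the k nodes with the highest degree, ties broken by name."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


def parse_args(argv=None):
    """Collect the data file and parameter values from the command line."""
    parser = argparse.ArgumentParser(description="Example experiment script")
    parser.add_argument("--data", default=None, help="path to an edge-list file")
    parser.add_argument("--top-k", type=int, default=5, help="number of nodes to report")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # Fall back to the built-in sample when no data file is given.
    edges = load_edges(args.data) if args.data else SAMPLE_EDGES
    for node, degree in top_nodes(degree_counts(edges), args.top_k):
        print(node, degree)


if __name__ == "__main__":
    main()
```

Saved under a name of your choice, say run_experiment.py (hypothetical), it could be run as python run_experiment.py --data <path to your file> --top-k 3, and the same functions can be imported and called from a notebook.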

Examples

We would like to sometimes run your algorithm to see that it works. So if possible, provide a guideline pdf. This can simply list which command to run to produce which plot, along with any comments you would like to include. These examples should use small datasets, for example those you use for testing while writing code.

Including an ipython notebook can be the easiest way to do this if you are using ipython.

This should also contain clear instructions on how to run your code on a different dataset if needed.

Ideally, your code should run on DICE. (You don’t have to develop it on DICE, just try to make sure the code runs there.) If it is not possible to run it on DICE, please provide clear instructions on how we may be able to run it on other computers.

Dataset

Submitting the dataset may be impractical if it is larger than a few megabytes. In that case, put it in a folder and make the folder readable by all on DICE. This can be done by running the following command from the parent directory (the capital X also gives access to the directories themselves, so that the files inside can be reached):

chmod -R a+rX <folder_name>

In your submitted code and the example code above, use the full absolute path to the folder on DICE. You can find the path to any directory by running

pwd

from inside the directory. This way, we will be able to run your code on the data without you having to submit the data. Test that this is working before submitting.
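
As a rough aid for that test, the following Python sketch lists anything under a folder that other users could not read. It is only an approximation under stated assumptions: it looks at the classic Unix permission bits, it does not check the parent directories above the folder, and it knows nothing about any file-system-specific access control lists. The function name unreadable_paths is made up. The most reliable test is still to ask a teammate to read the data from their own account.

```python
import os
import stat


def unreadable_paths(root):
    """Return paths under root that other users cannot read or enter.

    Files need the world-read bit. Directories need both world-read and
    world-execute so that they can be listed and traversed.
    """
    problems = []
    for dirpath, _dirnames, filenames in os.walk(root):
        mode = os.stat(dirpath).st_mode
        if not (mode & stat.S_IROTH and mode & stat.S_IXOTH):
            problems.append(dirpath)
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.stat(path).st_mode & stat.S_IROTH:
                problems.append(path)
    return problems
```

An empty list means that no problems were found by this particular check.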

The dataset folder can also contain the small sample datasets for the examples above, and any cleaned, modified or reformatted versions of the data you are using.

If your main dataset is too large to store in your account, please provide the link to the original dataset, or put it on a cloud service and provide that link so that we can download it.

Additional tips: project, writing, group work etc

You will be marked on trying new ideas, justifying them and explaining clearly. Do not worry about the idea not working well, but make sure to discuss it properly.

Group work: Everyone should submit their own report. Some parts, such as the problem description and related work, can be similar, but must be written in your own words and with reference to the parts you are submitting. The rest is expected to be your own. This can be achieved by different team members being responsible for different parts, e.g. different members working on slightly different ideas, or on different aspects of the problem. If the project has significant theoretical and experimental sides, it may be reasonable for one person to do each. Your submission is your responsibility; make sure from the start that you have a secure plan and will be able to submit something for your part.

You are free to cite your teammate’s work in your discussion to explain your work and point out interesting comparisons. You can also include some of their plots, results etc in your appendix if it makes it easier to write, but you must state that these are your teammate’s work.

Writing: The most important thing is that your writing is clear and easily understandable. Remember that the marker does not know what you are thinking and does not know all the techniques/concepts you may be using. So you need to explain anything you use. But this explanation needs to be concise – say only what is needed to understand your work.

After you write, read it and make sure it is readable and understandable to others. If we cannot follow your report, we cannot give you marks.

Do not leave the writing for the last day. Start on it early.

Project management. In doing the project you may find that things do not always work out the way you expected. Some things may take more time, some things may not work at all. You may find yourself short of time. Do not panic! A common mistake is to stick too rigidly to the plan. Think about what you can do instead of the original plan.

Maybe you can run your program on a smaller dataset. Maybe you can simplify the algorithm so that it does not do exactly the same thing or does not work on all types of data, but does work on some types of data. Maybe it is solving a slightly different problem from the original one. Maybe the reason it does not work is an interesting observation in itself.

Instead of worrying that you may not be able to do it, think about what you can do. Often, interesting things crop up if you give it a bit of thought. If you keep notes of ideas and possibilities from the start, some of them can be useful in a crunch. Your final goal is to submit an interesting report.

Comparisons. A useful way to make your report interesting is to include comparisons. A plot or result probably cannot show your idea to be good or bad on its own. Good and bad are relative: you need something to compare with. This “something” may be an existing method, a simple, naive way of solving the problem (brute force, random, results known from other sources etc), a comparison between two different ideas you have, or a comparison between results on two different types of networks or data.

The point is that a comparison gives you things to discuss and explain (what the difference is, why it exists etc), and thus makes your work interesting. So think about what comparisons you can show.
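
As a small illustration of comparing against a naive baseline, here is a sketch in Python. The task, the toy graph and all names (coverage, pick_by_degree, pick_random, compare) are made up for the example; the degree-based method merely stands in for whatever idea you are proposing. Both approaches are scored with the same measure on the same data, and the random baseline is averaged over several trials.

```python
import random


def coverage(graph, chosen):
    """Number of nodes that are chosen or adjacent to a chosen node."""
    covered = set(chosen)
    for node in chosen:
        covered.update(graph[node])
    return len(covered)


def pick_by_degree(graph, k):
    """Hypothetical 'proposed' method: take the k highest-degree nodes."""
    return sorted(graph, key=lambda n: (-len(graph[n]), n))[:k]


def pick_random(graph, k, rng):
    """Naive baseline: take k nodes uniformly at random."""
    return rng.sample(sorted(graph), k)


def compare(graph, k, trials=100, seed=0):
    """Return (method score, mean baseline score) for the same graph and k."""
    rng = random.Random(seed)  # fixed seed so the comparison is repeatable
    method_score = coverage(graph, pick_by_degree(graph, k))
    baseline_scores = [coverage(graph, pick_random(graph, k, rng)) for _ in range(trials)]
    return method_score, sum(baseline_scores) / trials


# Toy graph (made up): a star around "hub" plus one separate edge.
GRAPH = {
    "hub": ["a", "b", "c", "d"],
    "a": ["hub"], "b": ["hub"], "c": ["hub"], "d": ["hub"],
    "x": ["y"], "y": ["x"],
}

if __name__ == "__main__":
    method, baseline = compare(GRAPH, k=1)
    print(f"degree-based: {method}, random baseline (mean): {baseline:.2f}")
```

The two numbers give you something concrete to discuss: how large the gap is, and why it exists on this kind of data.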

Last updated on Nov 15.