Practical info

This page contains information about the course groups, the mandatory assignments, and the exam project.

Groups

On the first day we will assign you to a group. The assignment to groups is strictly random. The groups are meant to help you meet new people, as this course attracts a broad range of students. The groups will be used for handing in the mandatory assignments and the final exam project, see below. If your group has too few members, we will reassign you to another group.

Mandatory assignments

There will be two mandatory assignments, to be handed in as groups.

  • Assignment 1: collecting and structuring data.
    • The exercise will be posted Thursday, August 16, in the afternoon.
    • The exercise must be handed in no later than 23:59, Sunday, August 19, 2018.
  • Assignment 2: machine learning and text data.
    • The exercise will be posted Wednesday, August 22, in the afternoon.
    • The exercise must be handed in no later than 23:59, Friday, August 24, 2018.

The assignments will be graded using peer grading software. More information will follow.

Exam project

At the end of the course your group must hand in an independent exam project. The official exam period runs from August 25, 2018 at 10 a.m. to September 1 at 10 a.m.

You choose the content of the exam project yourselves. You and your group must find a subject and data, choose methods, etc.

Grading

The grade for this course is determined exclusively by the project handed in. The project will be judged on a number of dimensions, including:

  • how the data was obtained (setting up new data collection);
  • how the data was processed;
  • how machine learning methods are applied and which methods are used;
  • how results are explained (writing, figures, tables with model output etc.);
  • the research question and its originality and how it is answered.

Some advice about the grading: it is essential that you spend time on motivating your project and conveying your results. In addition, it is important that you spend time on calibrating and validating the models you work with rather than using as many models as possible.

Requirements for project

The exam projects have a number of requirements that must be met:

  • Research question (you should discuss with lecturers and TAs)
  • Groups with three to four members
  • Project formalia
    • The project must consist of a report (.pdf file) and documentation in the form of a Jupyter Notebook (.ipynb file).
    • The report should be written like a brief research article (short literature review, references to methods, results etc.). The report is limited to the following maximum number of standard pages (normalsider):
      • 2 members, 16 pages;
      • 3 members, 20 pages;
      • 4 members, 24 pages.
    • Note that 1 standard page (normalside) corresponds to 2,400 characters. Figures, abstract, list of references, front page, and appendix do not count towards the limit.
    • The report should contain your exam numbers and, optionally, your names. The exam numbers (or names) MUST show who contributed by writing which parts of the report. At most 20 pct. of the report can be jointly written. If you fail to provide this, your submission may be rejected!
    • Grading will be based on the report, but the process (data collection, computations, etc.) should be well documented in the supporting Jupyter Notebook.
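
The page limits above can be turned into character limits with a little arithmetic. The following is a minimal Python sketch, not an official checker: it assumes that the figure 2,400 refers to characters per standard page and that spaces count towards the total. The names `max_characters` and `within_limit` are hypothetical helpers introduced only for illustration.

```python
# Characters per standard page (normalside); assumed to include spaces.
CHARS_PER_PAGE = 2400

# Maximum number of standard pages by group size, as listed above.
MAX_PAGES = {2: 16, 3: 20, 4: 24}


def max_characters(group_size: int) -> int:
    """Return the character limit for a report by a group of this size."""
    if group_size not in MAX_PAGES:
        raise ValueError(f"No page limit listed for a group of {group_size}")
    return MAX_PAGES[group_size] * CHARS_PER_PAGE


def within_limit(body_text: str, group_size: int) -> bool:
    """Check the main text only; exclude figures, abstract, references,
    front page, and appendix before passing the text in."""
    return len(body_text) <= max_characters(group_size)


if __name__ == "__main__":
    for size in sorted(MAX_PAGES):
        print(f"{size} members: at most {max_characters(size)} characters")
```

For example, a group of three gets 20 pages × 2,400 characters = 48,000 characters of main text under these assumptions.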

Possible data sources

Students in previous years of Social Data Science have used a large variety of data sources including:

  • news on DR (Danish Broadcasting Company) and the Danish newspaper Information
  • prices of cars for sale on bilbasen
  • linguistic content on Twitter
  • Airbnb pricing in Copenhagen
  • Reddit data for predicting bitcoin prices.

If you are interested in working with one or more of these datasets, or in seeing the projects by the students who used them, please contact us and we will put you in touch.