Scientific Programming Lab

Data Science Master @University of Trento - AA 2019/20


Teaching assistant: David Leoni david.leoni@unitn.it website: davidleoni.it

This work is licensed under a Creative Commons Attribution 4.0 License CC-BY


News

17 June 2020 - Published 2020-06-16 exam results

4 March 2020 - Published 2020-02-10 exam results

31 January 2020 - Published 2020-01-23 exam results

7 January 2020 Extra tutoring:

(Beware rooms are not always the same)

  • Tue 14 January 10.00 - 12.00 A216

  • Wed 15 January 10.00 - 12.00 A216

  • Thu 16 January 10.00 - 12.00 A214

  • Fri 17 January 10.00 - 12.00 A221

  • Tue 21 January 10.00 - 12.00 A216

23 December 2019 - Published Midterm B grades:

07 December 2019 - Set midterm Part B date:

  • Friday 20th December, lab A202, from 11.45 to 13.45

  • Admission: students who got grade >= 16 at the first midterm

06 December 2019: Published midterm results:

28 November 2019: Set exams dates:

  • 23 January 8:30-13:30 A201

  • 10 February 8:30-13:30 A202

7 November 2019: published Midterm Part A solution

Old news

Slides

See Slides page

Office hours

To schedule a meeting, see here

Labs timetable

For the regular labs timetable please see:

Tutoring

A tutoring service for Scientific Programming - Data science labs has been set up and will be held by Gabriele Masina - email: gabriele.masina (guess what) studenti.unitn.it

Please take advantage of it as much as possible so you don’t end up writing random code at the exam!

  • Mondays: room A215 from 11.30-13.30 (note: it will be until 13:30 and not 14:30 as previously said in class)

  • Wednesday: 9:00-11:00, Rooms: A219 until Wednesday 13 November included, A218 afterwards

Complete tutoring schedule:

November 2019:

  • 4 monday 11.30-13.30 A218

  • 6 wednesday 9.00-11:00

  • 11 monday 11.30-13.30 A218

  • 13 wednesday 9.00-11:00

  • 18 monday 11.30-13.30 A218

  • 20 wednesday 9.00-11:00 A218

  • 25 monday 11.30-13.30 A218

  • 27 wednesday 9.00-11:00 A218

December 2019:

  • 2 monday 11.30-13.30 A218

  • 4 wednesday 9.00-11:00 A218

  • 9 monday 11.30-13.30 A218

  • 11 wednesday 9.00-11:00 A218

  • 16 monday 11.30-13.30 A218

  • 18 wednesday 9.00-11:00 A218

January 2020:

(Beware rooms are not always the same)

  • Tue 14 January 10.00 - 12.00 A216

  • Wed 15 January 10.00 - 12.00 A216

  • Thu 16 January 10.00 - 12.00 A214

  • Fri 17 January 10.00 - 12.00 A221

  • Tue 21 January 10.00 - 12.00 A216

Exams

Schedule

Exams dates:

  • 23 January 8:30-11:30 A201

  • 10 February 8:30-11:30 A202

Exam modalities

Sciprog exams are open book. You can bring a printed version of the material listed below.

The exam will take place in the lab with no internet access. You will only be able to access this documentation:

So if you need to look up some Python function, please start learning today how to search the documentation on the Python website.
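As a complement to browsing the documentation, here is a minimal sketch of looking up a function from inside Python itself. It assumes the standard pydoc module and the built-in help() are usable on the exam machines, which this page does not state, so check on a lab computer beforehand. The choice of str.split is only an illustration.

```python
import pydoc

# Render the documentation of str.split as plain text.
# pydoc is part of the Python standard library, so no internet is needed.
doc_text = pydoc.render_doc(str.split, renderer=pydoc.plaintext)
print(doc_text)

# In an interactive session, help(str.split) shows the same text in a pager.
```

The same approach works for modules and classes, for example pydoc.render_doc(list).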

Practice on the lab computers!!

The exam will be in a Linux Ubuntu environment - so learn how to browse folders there and get used to typing on the noisy lab keyboards :-)

Expectations

This is a data science master, so you must learn to be a proficient programmer - no matter the background you have.

Exercises proposed during labs are an example of what you will get during the exam, BUT there is no way you can learn the required level of programming by only doing the exercises on this website. Fortunately, since Python is so trendy nowadays, there are a zillion good resources to hone your skills - you can find some in Resources.

To successfully pass the exam, you should be able to quickly solve exercises proposed during labs with difficulty ranging from ✪ to ✪✪✪ stars. By quickly I mean that you should be able to solve a three star exercise ✪✪✪ in half an hour. Typically, an exercise will be divided into two parts: the first, easy one ✪✪ introduces you to the concept, and the second, more difficult one ✪✪✪ checks whether you really grasped the idea.

Before getting scared, keep in mind I’m most interested in your capability to understand the problem and find your way to the solution. In real life, junior programmers are often given functions to implement by senior colleagues, based on specifications and possibly with tests to make sure what they are implementing meets the specifications. Also, programmers copy code all of the time. This is why during the exam I give you tests for the functions to implement, so you can quickly spot errors, and also let you use the course material (see exam modalities).
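To give an idea of the format, here is a minimal sketch of a function specification accompanied by tests. The exercise and the name count_vowels are hypothetical and not taken from a real exam; actual exam tests may be structured differently.

```python
def count_vowels(text):
    """Return the number of vowels (a, e, i, o, u) in text, ignoring case.

    Hypothetical exercise, used here only to illustrate the format.
    """
    return sum(1 for ch in text.lower() if ch in "aeiou")


# Tests stay silent when the implementation meets the specification
# and raise AssertionError as soon as one check fails.
assert count_vowels("") == 0
assert count_vowels("Python") == 1
assert count_vowels("AEIOU") == 5
assert count_vowels("rhythm") == 0
```

Running the tests after every change is a quick way to spot errors, but passing tests do not prove the code is correct in every case.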

Part A expectations: performance does not matter: if you are able to run the required algorithm on your computer and the tests pass, it should be fine. Just be careful when given a 100 MB file: in that case bad code may sometimes lead to very slow execution and/or clog the memory.

In particular, in lab computers the whole system can even hang, so watch out for errors such as:

  • an infinite while loop which keeps adding new elements to lists - whenever possible, prefer for loops

  • scanning a big pandas dataframe using a for in loop instead of pandas native transformations
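The first pitfall above can be sketched as follows. The function names are made up for illustration; the buggy version is only defined, never called, because it would not terminate.

```python
def squares_while_buggy(n):
    """Meant to return [0, 1, 4, ...], but never terminates when n > 0."""
    result = []
    i = 0
    while i < n:
        result.append(i * i)
        # BUG: the increment i += 1 is missing, so the condition stays true
        # and the list grows until memory is exhausted.
    return result


def squares_for(n):
    """Same intent with a for loop: the number of iterations is fixed by range."""
    result = []
    for i in range(n):
        result.append(i * i)
    return result


print(squares_for(4))  # [0, 1, 4, 9]
```

With a for loop over range(n) there is no counter to forget, which is why it is the safer default whenever the number of iterations is known in advance.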

Part B expectations: performance does matter (e.g. finding the diagonal of an \(n \times n\) matrix should take a time linearly proportional to \(n\), not \(n^2\)). Also, in this part we will deal with more complex data structures. Here we generally follow the Do It Yourself method, reimplementing things from scratch. So please, use your brain:

  • if the exercise is about sorting, do not call Python’s .sort() method!!!

  • if the exercise is about data structures, and you are thinking about converting the whole data structure (or part of it) into Python lists, first, think about the computational cost of such a conversion, and second, do ask the instructor for permission.
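The matrix diagonal example mentioned in the Part B expectations can be sketched like this. It assumes a square matrix stored as a list of lists, and the function names are chosen only for illustration.

```python
def diagonal_quadratic(matrix):
    """Visits all n * n cells, so the time grows like n^2."""
    result = []
    for i in range(len(matrix)):
        for j in range(len(matrix)):
            if i == j:
                result.append(matrix[i][j])
    return result


def diagonal_linear(matrix):
    """Visits only the n diagonal cells, so the time grows like n."""
    return [matrix[i][i] for i in range(len(matrix))]


m = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
print(diagonal_linear(m))  # [1, 5, 9]
```

Both functions return the same result and would pass the same tests; only the second meets the performance expectation.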

Grading

When all tests pass you should hopefully get the full grade (although tests are never exhaustive!), but if the code is not correct you will still get a percentage. The percentage is of course subjective, and may depend on unfathomable factors such as the quantity of jam I found in the morning croissant that particular day. Jokes aside, the amount you get is usually inversely proportional to the amount of time I have to spend to fix your algorithm.

After exams I will send you back your code with corrections. If all tests pass and you still don’t get a 100% grade, you may come to my office to question the grade. If tests don’t pass I’m less available for debating - I don’t much like complaints such as ‘my colleague made the same error as me and got more points’ - even worse is complaining without having read the corrections.

Past exams

See Past exams page

Resources

Google colabs: Scratchpads to show python code. During the lesson you can also write on them to share code.

Source code of these worksheets (download zip), in Jupyter Notebook format.

Part A Resources

Part A Theory slides by Andrea Passerini

Allen Downey, Think Python

License: Creative Commons CC BY Non Commercial 3.0, as reported in the original page

Tutorials from Nicola Cassetta

  • Tutorial step by step, in Italian, good for beginners. They are well done and with solutions - please try them all.

  • online

Dive into Python 3

License: Creative Commons By Share-alike 3.0, as reported at the bottom of the book’s website

LeetCode

Website with collections of exercises sorted by difficulty and acceptance rate. You can generally try sorting by Acceptance and Easy filters.

leetcode.com

For a selection of exercises from leetcode, see Further resources sections at the ends of

HackerRank

Contains many Python 3 exercises on algorithms and data structures (requires login)

hackerrank.com

Material from other courses of mine (in Italian)

Part B Resources

Editors

  • Visual Studio Code: the course official editor.

  • Spyder: Seems like a fine and simple editor

  • PyCharm Community Edition

  • Jupyter Notebook: Nice environment to execute Python commands and display results like graphs. Allows including documentation in Markdown format

  • JupyterLab: the next and much better version of Jupyter, although as of Sept 2018 it is still in beta

  • PythonTutor, a visual virtual machine (very useful! can also be found in examples inside the book!)

Further readings

  • Rule based design by Lex Wedemeijer, Stef Joosten, Jaap van der Woude: a very readable text on how to represent information using only binary relations with boolean matrices (not a mandatory read, it only gives context and practical applications for some of the material on graphs presented during the course)

Acknowledgements

  • I wish to thank Dr. Luca Bianco for the introductory material on Visual Studio Code and Python

  • This site was made with Jupyter using NBSphinx extension and Jupman template