Course syllabus

Course-PM

Python for Data Scientists, study period 1 HT22 (7.5 hp)

The course is offered by the Department of Computer Science and Engineering.

Contact details

  • Examiner and course responsible: Selpi (selpi at chalmers.se)

Administration

Course purpose

This course combines a continuation course in programming with object-oriented programming and data structures, primarily from the perspective of data science, and includes a short orientation on algorithms and algorithm design principles. The programming language used in this course is Python, one of the most common languages in data science.

Course content:

  • Basics of Python (data types, expressions, control structures)
  • Types of algorithms, searching and sorting
  • Object-oriented programming
  • Common data structures
  • Standard libraries relevant to data science
  • Orientation about algorithms and algorithm design principles

 

Schedule

Link to TimeEdit.

 

Course literature (in no particular order)

[1] Allen B. Downey, Think Python: How to Think Like a Computer Scientist, 2nd edition. Green Tea Press, 2015.

[2] Jake VanderPlas, Python Data Science Handbook. O’Reilly Media, Inc., 2016.

[3] Jake VanderPlas, A Whirlwind Tour of Python. O’Reilly Media, Inc., 2016.

[4] Python tutorial: https://docs.python.org/3/tutorial/

[5] Python 3 course: http://www.python-course.eu/python3_course.php

[6] w3schools, Python: https://www.w3schools.com/python/default.asp

[7] NumPy & SciPy references: http://docs.scipy.org/doc/

[8] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein, Introduction to Algorithms, 3rd Edition. MIT Press and McGraw-Hill, 2009. 

[9] Jon Kleinberg, Éva Tardos, Algorithm Design. Pearson/Addison-Wesley, 2006. ISBN 0-321-29535-8.

 

If you have access to the following, they are also good:

[1] C. Horstmann, Python for Everyone, 3rd ed. ISBN: 978-1-119-63829-2

[2] W. McKinney, Python for Data Analysis, 2nd edition. ISBN: 9781491957660

[3] Sanjoy Dasgupta, Christos Papadimitriou, and Umesh Vazirani. Algorithms (1st Edition). McGraw-Hill Higher Education, 2008.

[4] Robert Sedgewick and Kevin Wayne, Algorithms, 4th edition, 2011. The book has a good companion website with additional information, code, and exercises: https://algs4.cs.princeton.edu/

 

Course design

The course consists of lectures and programming assignments. Some assignments are done individually and some in groups.

 

Changes made since the last occasion

  • The grading scale has changed from "Pass with Distinction (VG), Pass (G) and Fail (U)" to "Pass (G) and Fail (U)".

 

Learning objectives and syllabus

Learning objectives:

Knowledge and understanding

  • explain the basics about classes and objects;
  • explain some basic abstract data types and data structures, including lists, queues,
    hash tables, trees and graphs;
  • explain some of the algorithms used to manipulate and query these data structures
    efficiently, for example for sorting and searching, and use the corresponding
    standard libraries in Python.
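
As an illustration only, and not part of the official syllabus, the data structures named above might appear in Python roughly as follows. This is a minimal sketch using the standard library; the sample data and the helper name bfs_order are made up for the example.

```python
from bisect import bisect_left
from collections import deque

# Queue: deque gives O(1) appends and pops at both ends.
queue = deque()
queue.append("load")
queue.append("clean")
queue.append("plot")
first_task = queue.popleft()  # first in, first out

# Hash table: a dict maps keys to values with average O(1) lookup.
word_counts = {}
for word in ["a", "b", "a", "c", "a"]:
    word_counts[word] = word_counts.get(word, 0) + 1

# Sorting and searching: sorted() returns a new sorted list,
# and bisect_left performs a binary search on a sorted list.
values = sorted([7, 3, 9, 1, 5])
position = bisect_left(values, 7)  # index at which 7 is located

# Graph: an adjacency list stored as a dict of lists (hypothetical data).
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}


def bfs_order(graph, start):
    """Return the nodes reachable from start in breadth-first order."""
    visited = [start]
    pending = deque([start])
    while pending:
        node = pending.popleft()
        for neighbour in graph[node]:
            if neighbour not in visited:
                visited.append(neighbour)
                pending.append(neighbour)
    return visited
```

The list used for visited keeps the example short; for larger graphs a set would make the membership test faster.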

 

Competence and skills

  • make efficient use of predefined data structures in Python;
  • construct simple programs using classes and objects;
  • use a standard library of data structures and algorithms in Python for solving tasks
    within the area of data science.

 

Judgement and approach

  • compare and evaluate different aspects of program structures;
  • analyse the efficiency of different algorithms, for example searching and sorting
    algorithms;
  • make informed choices between different data structures and algorithms for
    different applications, in particular those relevant for data science.
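
As an illustration only, and not part of the official syllabus, analysing the efficiency of two searching algorithms could start from a sketch like the one below. It counts how many elements each algorithm examines; the function names and the test data are assumptions made for the example.

```python
def linear_search(items, target):
    """Scan left to right. Return (index, probes); index is -1 if absent."""
    probes = 0
    for index, item in enumerate(items):
        probes += 1  # one element examined
        if item == target:
            return index, probes
    return -1, probes


def binary_search(sorted_items, target):
    """Halve the search range each step. The input must be sorted."""
    low, high = 0, len(sorted_items) - 1
    probes = 0
    while low <= high:
        mid = (low + high) // 2
        probes += 1  # one element examined
        if sorted_items[mid] == target:
            return mid, probes
        if sorted_items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1, probes


# Worst case for linear search: the target is the last element.
data = list(range(1024))
linear_result = linear_search(data, 1023)
binary_result = binary_search(data, 1023)
```

On this sorted list of 1024 elements, linear search examines every element before finding the last one, while binary search needs at most 11 probes. The trade-off is that binary search only works on sorted data.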

 

Link to the syllabus: https://kursplaner.gu.se/pdf/kurs/en/DIT375

 

Examination form

Grading scale: Pass (G) and Fail (U).

To get a Pass (G) for the entire course, a student must pass every assignment.

Note: A student who does not pass an assignment must resubmit it as soon as possible, and no later than 20 October 2022 (except for the last two assignments). In general, the deadline for resubmitting an assignment is one week after the grade for that assignment has been released.