This repository contains:
- instructions to set up a minimal Python environment,
- Jupyter notebooks that introduce the basic concepts of Python, and
- quizzes to test the understanding of these concepts.
Prerequisites
- access to a computing environment with installation rights
- programming experience, i.e. familiarity with basic concepts like control flow and data structures
+--Motivation, Installation Instruction, Programming Environments
+--Data Types, Control Flow, Modules
+--package NumPy
+--package SciPy
+--package scikit-learn
+--package matplotlib (Python plotting, object-oriented)
+--package pandas (Python Data Analysis Library)
There are currently two major versions of Python: the older Python 2 and the newer Python 3. We use Python 3, for which the latest stable release is 3.6.0 (as of 12 Mar 2017).
Libraries are called modules in Python.
The math module provides access to the mathematical functions defined by the C standard.
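As a minimal sketch of the math module (the input values are chosen purely for illustration):

```python
import math

# These functions mirror their counterparts in the C math library.
root = math.sqrt(16.0)          # 4.0
angle = math.cos(math.pi)       # -1.0
rounded_down = math.floor(2.7)  # 2

print(root, angle, rounded_down)
```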
The sys module provides access to some variables used or maintained by the interpreter and to functions that interact strongly with the interpreter. It is always available.
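A short sketch of a few interpreter variables exposed by the sys module; the printed values depend on the installation and on how the script is started:

```python
import sys

version_info = sys.version_info  # named tuple with major, minor, micro, ...
arguments = sys.argv             # command-line arguments of the running script
search_path = sys.path           # directories searched when importing modules

print(version_info.major, version_info.minor)
print(len(arguments), len(search_path))
```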
The traceback module provides a standard interface to extract, format and print stack traces of Python programs. It exactly mimics the behavior of the Python interpreter when it prints a stack trace.
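One possible use of the traceback module is to capture a stack trace as a string instead of letting the program stop. The helper `risky_division` is a hypothetical function used only for this illustration:

```python
import traceback


def risky_division(numerator, denominator):
    """Hypothetical helper that fails when denominator is zero."""
    return numerator / denominator


try:
    risky_division(1, 0)
except ZeroDivisionError:
    # format_exc() returns the same text the interpreter would print.
    report = traceback.format_exc()

print(report)
```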
The array module defines an object type which can compactly represent an array of basic values: characters, integers, floating point numbers. Arrays are sequence types and behave very much like lists, except that the type of objects stored in them is constrained.
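The type constraint of the array module can be sketched as follows, assuming an array of C doubles (type code 'd'):

```python
from array import array

numbers = array("d", [1.0, 2.5, 4.0])  # 'd' stores C doubles
numbers.append(5.5)                    # behaves like a list here

rejected = False
try:
    numbers.append("text")             # wrong type for a 'd' array
except TypeError:
    rejected = True

print(numbers.typecode, list(numbers), rejected)
```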
The time module provides various time-related functions.
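A brief sketch of two common uses of the time module, measuring elapsed time and formatting a timestamp (the sleep duration is an arbitrary choice):

```python
import time

start = time.perf_counter()
time.sleep(0.01)                       # stand-in for real work
elapsed = time.perf_counter() - start  # seconds as a float

timestamp = time.time()                # seconds since the epoch
formatted = time.strftime("%Y-%m-%d", time.localtime(timestamp))

print(elapsed, formatted)
```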
The os module provides a portable way of using operating system dependent functionality.
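As an illustration of the os module, the sketch below builds a path in a platform-independent way. The file `data/example.csv` is hypothetical and need not exist:

```python
import os

current_dir = os.getcwd()
# os.path.join picks the correct separator for the current platform.
example_path = os.path.join(current_dir, "data", "example.csv")
file_name = os.path.basename(example_path)
exists = os.path.exists(example_path)

print(example_path, file_name, exists)
```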
The collections module implements specialized container datatypes providing alternatives to Python’s general purpose built-in containers, dict, list, set, and tuple.
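A few of the containers from the collections module, sketched with made-up data:

```python
from collections import Counter, defaultdict, deque, namedtuple

# Counter: a dict that counts hashable items.
word_counts = Counter(["a", "b", "a"])

# defaultdict: missing keys get a default value (here an empty list).
groups = defaultdict(list)
for word in ["apple", "avocado", "banana"]:
    groups[word[0]].append(word)

# namedtuple: a tuple with named fields.
Point = namedtuple("Point", ["x", "y"])
origin = Point(x=0, y=0)

# deque with maxlen: appending to a full deque drops the oldest item.
recent = deque([1, 2, 3], maxlen=3)
recent.append(4)

print(word_counts, dict(groups), origin, recent)
```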
The itertools module implements a number of iterator building blocks inspired by constructs from APL, Haskell, and SML. Each has been recast in a form suitable for Python.
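Some of the building blocks in the itertools module, shown as a small sketch; the results are wrapped in `list` only to make them printable:

```python
import itertools

# All 2-element combinations of the letters A, B, C.
pairs = list(itertools.combinations("ABC", 2))

# Cartesian product of [0, 1] with itself.
grid = list(itertools.product([0, 1], repeat=2))

# count() is infinite; islice() takes the first four values.
first_evens = list(itertools.islice(itertools.count(0, 2), 4))

# chain() concatenates iterables lazily.
flattened = list(itertools.chain([1, 2], [3]))

print(pairs, grid, first_evens, flattened)
```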
multiprocessing is a package that supports spawning processes using an API similar to the threading module.
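A minimal sketch of spawning worker processes with multiprocessing, assuming the code is run as a script. The function `square` and the pool size of 2 are illustrative choices:

```python
import multiprocessing


def square(value):
    """Return value squared; executed inside a worker process."""
    return value * value


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block
    # when they import the main module.
    with multiprocessing.Pool(processes=2) as pool:
        squares = pool.map(square, range(5))
    print(squares)  # [0, 1, 4, 9, 16]
```

The worker function is defined at module level so that it can be pickled and sent to the worker processes.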
NumPy provides fast numerical computing, in particular with large arrays and matrices. It is a core package of the SciPy ecosystem, but it is installed and imported on its own.
References:
- Nicolas P. Rougier, From Python to Numpy
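A short sketch of array-based computing with NumPy, assuming NumPy is installed. The small matrix is made up for illustration; the subtraction uses broadcasting:

```python
import numpy as np

matrix = np.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
column_means = matrix.mean(axis=0)   # [1.5, 2.5, 3.5]

# Broadcasting: the 1-D row of means is subtracted from every row.
centered = matrix - column_means

# Matrix product of the 2x3 matrix with its 3x2 transpose.
product = matrix @ matrix.T          # [[5, 14], [14, 50]]

print(centered)
print(product)
```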
SciPy is a (large) scientific computing library built on NumPy arrays; NumPy is a required dependency.
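Two typical SciPy tasks, sketched under the assumption that SciPy and NumPy are installed. The quadratic objective is a made-up example:

```python
import numpy as np
from scipy import integrate, optimize

# Minimize a simple one-dimensional function; the minimum is at x = 2.
result = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2)

# Integrate sin(x) from 0 to pi; the exact value is 2.
area, error_estimate = integrate.quad(np.sin, 0.0, np.pi)

print(result.x, area, error_estimate)
```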
scikit-learn provides machine learning tools and is built to work well with NumPy and SciPy.
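A minimal scikit-learn workflow, assuming scikit-learn is installed. The choice of the bundled iris dataset, logistic regression, and the split parameters are illustrative assumptions, not a recommendation:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Features and labels come back as NumPy arrays.
features, labels = load_iris(return_X_y=True)

X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.25, random_state=0
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)  # fraction of correct predictions

print(accuracy)
```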
Pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive.
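Working with labeled data in pandas can be sketched as follows, assuming pandas is installed. The city and temperature values are made up:

```python
import pandas as pd

# Made-up example data with labeled columns.
frame = pd.DataFrame(
    {
        "city": ["Berlin", "Paris", "Berlin"],
        "temperature": [21.0, 24.5, 19.0],
    }
)

# Select rows by a condition on a column.
warm = frame[frame["temperature"] > 20.0]

# Group by a label and aggregate.
mean_by_city = frame.groupby("city")["temperature"].mean()

print(warm)
print(mean_by_city)
```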
Matplotlib is a Python 2D plotting library which produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms.
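A small sketch of Matplotlib's object-oriented interface, assuming Matplotlib and NumPy are installed. The figure is written to an in-memory buffer here; passing a file name such as "sine.png" (hypothetical) to `savefig` would write a file instead:

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, no display needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0.0, 2.0 * np.pi, 100)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label="sin(x)")
ax.set_xlabel("x")
ax.set_ylabel("sin(x)")
ax.legend()

buffer = io.BytesIO()
fig.savefig(buffer, format="png")  # PNG is one of several output formats
plt.close(fig)
```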
Keras is a high-level neural networks API, written in Python and capable of running on top of either TensorFlow or Theano. It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research.
- Code Challenge
- Python Tutorial at python.org (quite complete and extensive; probably to read selectively)
- DataCamp has two Python Courses (for Data Science): Intro and Intermediate
- After Hours Programming has an interactive Python Tutorial
- LearnPython.org has an interactive Python Tutorial
- pull every numbered Jupyter notebook from python-for-data-science
- check the setup part for Anaconda & friends (let's say as number 0)
for NumPy
- improve the didactic structure: some parts are redundant (e.g. operations), others do not follow a logical order (e.g. broadcasting)