python-tutorials

This repository contains

installation instructions for a minimal python environment
Jupyter notebooks to introduce the basic concepts of python
quizzes to test the understanding of said concepts

It is related to my follow-up repository data-science-tutorials.

Prerequisites

access to a computing environment with installation rights
programming experience, i.e. familiarity with basic concepts like control flow and data structures

Leitfaden for a possible course

+--Motivation
   Installation Instruction
   Programming Environments
+--Data Types
   Control Flow
   Modules
+--package NumPy
+--package SciPy
+--package scikit-learn
+--package matplotlib (Python plotting, object-oriented)
+--package pandas (Python Data Analysis Library)

Setup

Choosing the proper installation candidate

We use Python 3 via the Anaconda distribution.

Installation Recipe (linux)

download the latest (64-bit) Anaconda3-installer from http://continuum.io/download and launch it with
```
$ bash Anaconda3-1.9.1-Linux-x86_64.sh
```
The installer will then ask ...
- for your agreement on the license agreement
- for a target directory (this is optional; default is ~/anaconda3, my choice is ~/local/share/anaconda3)
- to run the conda init script to check/set some paths (obviously also optional) and if necessary add them to .bashrc (You may already have this, e.g. via .profile)
Next, we'll add three channels to the default one (in this order) :
```
$ conda config --add channels conda-forge
$ conda config --add channels defaults
$ conda config --add channels r
$ conda config --add channels bioconda
```
bioconda is for bioinformatics (what's your requirement?) and will receive the highest priority. r is required for bioinformatics and contains moduls for the GNU R programming language. The defaults channel already contains plenty of packages (?TODO list?). Finally, conda-forge contains several community-build packages that are not already in the default channel.

Updating

The installer is for the full package coming with the anaconda meta-package. Let's update it with :

$ conda update anaconda

Modules/Packages

Here's the full package list.

Anaconda is not only the name of the python-distribution, but also the name of its largest meta-package. To maximize compatibility (and minimize maintenance effort), we have the following priorities

packages from the standard library (see below)
packages from the anaconda meta-package (?packages marked as "In Installer" [here](https://docs.anaconda.com/anaconda/packages/py3.6_linux-64/ ?)
packages from conda's default channel, like keras (but not in the default installer's full list of packages?)
ONLY IF NECESSARY packages from selected additional conda channels (r, bioconda)

Minimal Package List

(for reference, when you have to reproduce your environment outside of anaconda, e.g. in SageMath)

favorites from the standard library

datetime
csv

non-standard packages in conda's default installation

marked "In Installer" (I)

interface
- conda (I)
- jupyter (I)
math
- numpy (I)
- scipy (I)
- statsmodels (I)
data analysis
- pandas (I)
- scikit-learn sklearn (I)
visualization
- matplotlib (I)
- seaborn (I)

non-standard packages in the default channel

keras

OUTOFSCOPE alternative packages

merely listed here for reference, because they might pop up repeatedly

pybrain by Jürgen Schmidhuber et.al. (latest release 0.3 dates from 18 Nov 2009)

Update

:

 $ conda update conda
 $ conda update anaconda

this confuses me (and others), so the current recommendation in anaconda's blog1 for "What 95% of People Want" is :

$ conda update --all
$ conda

Optionally :

$ conda remove anaconda

And if things break :

$ conda clean --all

Development

Command-Line Interface

(default) python interactive shell
MYCHOICE ipython shell

Jupyter Notebook

web-application for interactive python worksheets

Integrated Development Environment

MYCOICE spyder, a quick introduction by Joey Bernard
pycharm
Eclipse with PyDev plugin
Emacs with ...

Note: Spyder is already available in the standard installation, but if we want/need more advanced profiling, there's

   $ conda config --add channels spyder-ide
   $ conda install -c spyder-ide spyder-line-profiler
   $ conda install -c spyder-ide spyder-memory-profiler

Packages

are called modules in python

Favorite Non-Standard Packages

NumPy

fast numerical computing, in particular with large arrays and matrices; is part of SciPy, but can also be loaded individually

References:

Nicolas P. Rougier, From Python to Numpy

SciPy

(large) scientific computing library, based on NumPy arrays (and including NumPy)

scikit-learn scikit-learn-intelex

machine learning, built to work well with NumPy and SciPy -- along with the Intel extension for acceleration

pandas

Pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive

matplotlib

Matplotlib is a Python 2D plotting library which produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms. Here's a short introduction.

keras

Keras is a high-level neural networks API, written in Python and capable of running on top of either TensorFlow or Theano. It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research.

Further Non-Standard Libraries

qutip

to simulate quantum systems. Very short introduction by Joey Bernard.

Standard Library (Batteries Included)

see the complete list at https://docs.python.org/3/library/

array

This module defines an object type which can compactly represent an array of basic values: characters, integers, floating point numbers. Arrays are sequence types and behave very much like lists, except that the type of objects stored in them is constrained.

collections

This module implements specialized container datatypes providing alternatives to Python’s general purpose built-in containers, dict, list, set, and tuple.

csv

import/export of csv-files

itertools

This module implements a number of iterator building blocks inspired by constructs from APL, Haskell, and SML. Each has been recast in a form suitable for Python.

math

It provides access to the mathematical functions defined by the C standard.

multiprocessing

multiprocessing is a package that supports spawning processes using an API similar to the threading module.

os

This module provides a portable way of using operating system dependent functionality.

sys

This module provides access to some variables used or maintained by the interpreter and to functions that interact strongly with the interpreter. It is always available.

time

This module provides various time-related functions.

traceback

This module provides a standard interface to extract, format and print stack traces of Python programs. It exactly mimics the behavior of the Python interpreter when it prints a stack trace.

Books

Swaroop, A Byte of Python, CC-BY-SA. (Free pdf and epub download, audience: programming beginners)
Allen B. Downey, Think Python, 2nd edition, 2015. (Cave: 1st edition uses python2. Free pdf and html download, sample code available on webpage and github, audience: python beginners with programming experience)
Idris2016
McKinney2012

(Free) Courses

Codecademy's Python track
Udacity's Intro to Computer Science
MIT's 6.001 Introduction to Computer Science and Programming in Python by Ana Bell, Eric Grimson, and John Guttag; "for students with little or no programming experience"; available on edX
MIT's 6.002 Introduction to Computational Thinking and Data Science by Ana Bell, Eric Grimson, and John Guttag; continuation of MIT 6.001; archived on edX

Quizzes/Exercises/Generic Projects

15 single-choice questions by TripleByte
Mega Project List by Karan Goel
100 days of algorithms by Tomáš Bouda with github repository

TODOs

check the setup part for anaconda & friends (let's say as number 0)
split every notebook into mandatory/optional part
for numpy: improve didactical structures. Parts are redundant (e.g. operations), parts dont follow perfect logic order (broadcasting)
compare to Christin Seifert's minimal tutorial at https://github.com/chseifert/tutorials/blob/master/PythonTutorial/PythonTutorial.ipynb

Name		Name	Last commit message	Last commit date
Latest commit History 78 Commits
Advanced		Advanced
data		data
exercises		exercises
images		images
0.1-How_to_start.ipynb		0.1-How_to_start.ipynb
0.2-Anaconda_Installation.ipynb		0.2-Anaconda_Installation.ipynb
0.3-Github_repository.ipynb		0.3-Github_repository.ipynb
0.4-Additional_extensions.ipynb		0.4-Additional_extensions.ipynb
DSiP-1-Introduction.ipynb		DSiP-1-Introduction.ipynb
DSiP-2-Python-Programing-Basics.ipynb		DSiP-2-Python-Programing-Basics.ipynb
DSiP-3-NumPy.ipynb		DSiP-3-NumPy.ipynb
DSiP-4-Scipy.ipynb		DSiP-4-Scipy.ipynb
DSiP-5-Matplotlib.ipynb		DSiP-5-Matplotlib.ipynb
DSiP-6-Pandas.ipynb		DSiP-6-Pandas.ipynb
README.md		README.md

zieglerk/python-tutorials

Folders and files

Latest commit

History

Repository files navigation

python-tutorials

Prerequisites

Leitfaden for a possible course

Setup

Choosing the proper installation candidate

Installation Recipe (linux)

Updating

Modules/Packages

Minimal Package List

favorites from the standard library

non-standard packages in conda's default installation

non-standard packages in the default channel

OUTOFSCOPE alternative packages

Update

Development

Command-Line Interface

Jupyter Notebook

Integrated Development Environment

Packages

Favorite Non-Standard Packages

NumPy

SciPy

scikit-learn scikit-learn-intelex

pandas

matplotlib

keras

Further Non-Standard Libraries

qutip

Standard Library (Batteries Included)

array

collections

csv

itertools

math

multiprocessing

os

sys

time

traceback

Other Tutorials/Exercises

Books

(Free) Courses

Quizzes/Exercises/Generic Projects

TODOs

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 3

Languages

Packages