deploy: 65027d5

coderefinery · Sep 12, 2024 · cd93c1a
commit cd93c1a
Showing 267 changed files with 41,328 additions and 0 deletions.
diff --git a/.nojekyll b/.nojekyll
diff --git a/_images/animation.mp4 b/_images/animation.mp4
diff --git a/_images/stars.png b/_images/stars.png
diff --git a/_sources/documentation.md.txt b/_sources/documentation.md.txt
@@ -0,0 +1,8 @@
+(documentation)=
+
+# Code documentation
+
+- 15 min: What makes good documentation? Also https://diataxis.fr/
+- 15 min: Writing good README files
+- 30 min: Exercise: Set up a Sphinx documentation and add API documentation
+- 15 min: Building documentation with GitHub Actions
diff --git a/_sources/example.md.txt b/_sources/example.md.txt
@@ -0,0 +1,95 @@
+(example-project)=
+
+# Example project: Simulating the motion of planets
+
+The [example code](https://github.com/workshop-material/planets) that we will study
+is a hopefully simple N-body simulation written in Python. It is not important
+or expected that we understand the code in any detail.
+
+:::{video} video/animation.mp4
+:width: 600
+:::
+
+The **big picture** is that the code simulates the motion of a number of
+planets:
+- We can choose the number of planets.
+- Each planet starts with a random position, velocity, and mass.
+- At each time step, the code calculates the gravitational force between each
+  pair of planets.
+- The forces accelerate each planet, the acceleration modifies the velocity,
+  and the velocity modifies the position of each planet.
+- We can choose the number of time steps.
+- The units were chosen to make numbers easy to read.
+
+
+## Example run
+
+:::{instructor-note}
+The instructor demonstrates running the code on their computer.
+:::
+
+The code is written to accept **command-line arguments** to specify the number
+of planets and the number of time steps.
+
+We first generate starting data:
+```console
+$ python generate-data.py --num-planets 10 --output-file initial.csv
+```
+
+The generated file (initial.csv) could look like this:
+```
+px,py,pz,vx,vy,vz,mass
+-46.88,-42.51,88.33,-0.86,-0.18,0.55,6.70
+-5.29,17.09,-96.13,0.66,0.45,-0.17,3.51
+83.53,-92.83,-68.77,-0.26,-0.48,0.24,6.84
+-36.31,25.48,64.16,0.85,0.75,-0.56,1.53
+-68.38,-17.21,-97.07,0.60,0.26,0.69,6.63
+-48.37,-48.74,3.92,-0.92,-0.33,-0.93,8.60
+40.53,-75.50,44.18,-0.62,-0.31,-0.53,8.04
+-27.21,10.78,-78.82,-0.09,-0.55,-0.03,5.35
+88.42,-74.95,-45.85,0.81,0.68,0.56,5.36
+39.09,53.12,-59.54,-0.54,0.56,0.07,8.98
+```
+
+Then we can simulate their motion (in this case for 20 steps):
+```console
+$ python simulate.py --num-steps 20 \
+                     --input-file initial.csv \
+                     --output-file final.csv
+```
+
+The `--output-file` (final.csv) is again a CSV file (comma-separated values)
+and contains the final positions of all planets.
+
+It is possible to run on **multiple cores** and to **animate** the result.
+Here is an example with 100 planets:
+```{code-block} console
+---
+emphasize-lines: 7,11
+---
+$ python generate-data.py --num-planets 100 --output-file initial.csv
+
+$ python simulate.py --num-steps 50 \
+                     --input-file initial.csv \
+                     --output-file final.csv \
+                     --trajectories-file trajectories.npz \
+                     --num-cores 8
+
+$ python animate.py --initial-file initial.csv \
+                    --trajectories-file trajectories.npz \
+                    --output-file animation.mp4
+```
+
+:::{admonition} Learning goals
+- What are the most important steps to make this code **reusable by others**
+  and **our future selves**?
+- Be able to apply these techniques to your own code/script.
+:::
+
+:::{admonition} We will not focus on ...
+- ... how the code works internally in detail.
+- ... whether this is the most efficient algorithm.
+- ... whether the code is numerically stable.
+- ... how the code scales with the number of cores.
+- ... whether it is portable to other operating systems (we will discuss this later).
+:::
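Note on `example.md`: the big-picture list above (pairwise forces, then velocities, then positions) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not code from the planets repository: the function name `step`, the gravitational constant `g = 1`, the time step `dt`, and the simple Euler-style update are all hypothetical choices.

```python
import math


def step(positions, velocities, masses, dt=1.0, g=1.0):
    """Advance all planets by one time step.

    Hypothetical helper for illustration; not taken from the planets repo.
    positions and velocities are lists of [x, y, z]; masses is a list of floats.
    """
    n = len(masses)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]

    # Gravitational force between each pair of planets (each pair visited once).
    for i in range(n):
        for j in range(i + 1, n):
            # Vector pointing from planet i to planet j.
            d = [positions[j][k] - positions[i][k] for k in range(3)]
            r = math.sqrt(sum(c * c for c in d))
            # g * m_i * m_j / r^2 along the unit vector d / r.
            # Assumption: no softening, so two planets must not coincide.
            f = g * masses[i] * masses[j] / r**3
            for k in range(3):
                forces[i][k] += f * d[k]
                forces[j][k] -= f * d[k]

    # Forces accelerate each planet, the acceleration modifies the velocity ...
    new_velocities = [
        [velocities[i][k] + forces[i][k] / masses[i] * dt for k in range(3)]
        for i in range(n)
    ]
    # ... and the velocity modifies the position.
    new_positions = [
        [positions[i][k] + new_velocities[i][k] * dt for k in range(3)]
        for i in range(n)
    ]
    return new_positions, new_velocities
```

Calling `step` repeatedly, once per time step, gives the final positions. The nested loop over pairs is the part whose cost grows fastest with the number of planets.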
diff --git a/_sources/index.md.txt b/_sources/index.md.txt
@@ -0,0 +1,156 @@
+# Reproducible research software development using Python
+
+
+## Big-picture goal
+
+This is a **hands-on course on research software engineering**. In this
+workshop we assume that most workshop participants use Python in their work or
+are leading a group which uses Python.  Therefore, some of the examples will use
+Python as the example language.
+
+We will work with an example project and go through all the steps of a typical
+software project.  Once we have seen the building blocks, we will try to apply
+them to our own projects. Workshop participants will receive and also learn to give
+constructive code feedback.
+
+
+## Prerequisites
+
+:::{prereq} Preparation
+1. Get a **GitHub account** following [these instructions](https://coderefinery.github.io/installation/github/).
+1. You will need a **text editor**. If you don't have a favorite one, we recommend
+   [VS Code](https://coderefinery.github.io/installation/vscode/).
+1. **If you prefer to work in the terminal** and not in VS Code, set up these two (skip this if you use VS Code):
+   - [Git in the terminal](https://coderefinery.github.io/installation/git-in-terminal/)
+   - [SSH or HTTPS connection to GitHub from terminal](https://coderefinery.github.io/installation/ssh/)
+1. **One of these two software environments** (if you are not sure which one to
+   choose or have no preference, choose Conda):
+   - {ref}`conda`
+   - {ref}`venv` (Snakemake is not available in this environment)
+1. **Optional** and only on Linux: [Apptainer](https://apptainer.org/) following
+   [these instructions](https://apptainer.org/docs/admin/1.3/installation.html#install-from-pre-built-packages).
+:::
+
+
+## Schedule
+
+:::{note}
+The schedule will very soon contain links to lesson material and exercises.
+:::
+
+
+### Day 1 (Sep 16)
+
+- 13:00-13:30 (0.5h) - **Welcome and introduction**
+  - Motivation (reproducibility, robustness, distribution, improvement, trust, etc.)
+  - Practical information (tools, communication, breaks, etc.)
+  - What will we learn and achieve from this course?
+  - {ref}`example-project`
+
+- 13:30-14:45 (1.25h) - **Introduction to version control with Git and GitHub (1/2)**
+  - Creating a repository and porting your project to Git and GitHub
+  - Basic commands
+
+- 15:00-16:30 (1.5h) - **Introduction to version control with Git and GitHub (2/2)**
+  - Branching and merging
+  - Recovering from typical mistakes
+
+- 16:45-18:00 (1.25h) - {ref}`documentation`
+  - In-code documentation including docstrings
+  - Writing good README files
+  - Markdown
+  - Sphinx
+  - Building documentation with GitHub Actions
+  - Jupyter Notebooks
+
+
+### Day 2 (Sep 17)
+
+- 09:00-10:30 (1.5h) - **Collaborative version control and code review (1/2)**
+  - Practice code review using issues and pull requests
+  - Forking workflow
+  - Contributing changes to projects of others
+
+- 10:45-12:15 (1.5h) - **Collaborative version control and code review (2/2)**
+  - Organization strategies
+  - Merge vs. rebase
+  - Conflict resolution
+
+- 16:45-18:00 (1.25h) - **Debriefing and Q&A**
+  - Participants work on their projects
+  - Together we study actual codes that participants wrote or work on
+  - Constructively we discuss possible improvements
+  - Give individual feedback on code projects
+
+
+### Day 3 (Sep 18)
+
+- 09:00-10:30 (1.5h) - {ref}`testing`
+  - Unit tests
+  - End-to-end tests
+  - pytest
+  - GitHub Actions
+
+- 10:45-12:15 (1.5h) - {ref}`reusable`
+  - Tracking dependencies with requirements.txt and environment.yml
+  - Recording environments in containers
+
+- 13:00-14:45 (1.75h) - {ref}`refactoring`
+  - Naming (and other) conventions, project organization, modularity
+  - Refactoring (explained through examples)
+  - Design patterns: functional design vs. object-oriented design
+  - How to design your code before writing it
+  - Structuring larger software projects in a modular way
+  - Command-line interfaces
+  - Workflows with Snakemake
+
+- 15:00-16:30 (1.5h) - {ref}`publishing`
+  - Licenses
+  - Publishing the code via Zenodo
+  - Packaging the code
+  - Sharing the code via PyPI
+
+- 16:45-18:00 (1.25h) - **Debriefing and Q&A**
+  - Participants work on their projects
+  - Together we study actual codes that participants wrote or work on
+  - Constructively we discuss possible improvements
+  - Give individual feedback on code projects
+
+
+### Extra material if we have time
+
+- Profiling memory and CPU usage
+- Strategies for parallelization
+
+
+```{toctree}
+:maxdepth: 1
+:caption: Software environment
+:hidden:
+
+installation/conda
+installation/virtual-environment
+```
+
+```{toctree}
+:maxdepth: 1
+:caption: Episodes
+:hidden:
+
+example
+documentation
+testing
+reusable
+refactoring
+publishing
+```
+
+```{toctree}
+:maxdepth: 1
+:caption: Reference
+:hidden:
+
+All lessons <https://coderefinery.org/lessons/>
+CodeRefinery <https://coderefinery.org/>
+Reusing <https://coderefinery.org/lessons/reusing/>
+```
diff --git a/_sources/installation/conda.md.txt b/_sources/installation/conda.md.txt
@@ -0,0 +1,87 @@
+(conda)=
+
+# Conda environment
+
+A Conda environment is an isolated software environment used to manage the dependencies of a project.
+You decide where it is located.
+
+You will need an `environment.yml` file that documents the dependencies:
+```yaml
+name: coderefinery
+channels:
+  - conda-forge
+  - bioconda
+dependencies:
+  - python >= 3.10
+  - black
+  - click
+  - flit
+  - ipywidgets
+  - isort
+  - jupyterlab
+  - jupyterlab_code_formatter
+  - jupyterlab-git
+  - matplotlib
+  - myst-parser
+  - nbdime
+  - numpy
+  - pandas
+  - pytest
+  - pytest-cov
+  - scalene
+  - seaborn
+  - snakemake-minimal
+  - sphinx
+  - sphinx-autoapi
+  - sphinx-autobuild
+  - sphinx_rtd_theme >= 2.0
+  - vulture
+  - scikit-image
+```
+
+
+## Before you create a virtual environment
+
+1. Create a new directory for this course.
+1. In this directory, create an `environment.yml` file and copy-paste the dependencies above into it.
+
+
+## Choose the tool to manage the environment
+
+If you are already using one of these tools, please continue using the tool that you like and know.
+If you are new to this, **we recommend using Miniconda or Miniforge**.
+
+- [Anaconda](https://docs.anaconda.com/anaconda/install/)
+  - Advantages: easy to install, easy to use, good for beginners
+  - Disadvantages: large download, installs more than we will need, license restrictions
+- [Miniconda](https://docs.anaconda.com/miniconda/)
+  - Advantages: small size, installs only what you need
+  - Disadvantages: no graphical interface, license restrictions
+- [Miniforge](https://github.com/conda-forge/miniforge)
+  - Advantages: small size, no license restrictions
+  - Disadvantages: no graphical interface
+- [Micromamba](https://mamba.readthedocs.io/en/latest/user_guide/micromamba.html)
+  - Advantages: fast, small size
+  - Disadvantages: no graphical interface
+- [Pixi](https://pixi.sh/latest/)
+  - Advantages: fast and new
+  - Disadvantages: new, less tested, and not documented here
+
+
+## Creating the virtual environment
+
+1. Open your terminal shell (e.g. Bash or Zsh).
+2. Activate `conda` using `conda activate` or `source ~/miniconda3/bin/activate`.
+3. Run the following command:
+   ```console
+   $ conda env create --file environment.yml
+   ```
+4. Make sure that you see "coderefinery" in the output when you ask for a list of all available environments:
+   ```console
+   $ conda env list
+   ```
+
+
+## How to verify that this worked
+
+(this will be added)
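Note on `installation/conda.md`: the verification section is still a placeholder. One possible check, offered as a sketch and not as the course's official verification step, is to ask Python whether a few of the packages from `environment.yml` can be found. The script name `check_env.py`, the helper `missing_modules`, and the selection of modules are assumptions made for this example.

```python
import importlib.util

# Import names for a few packages listed in environment.yml.
# Note that scikit-image is imported as "skimage".
EXPECTED_MODULES = [
    "numpy",
    "pandas",
    "matplotlib",
    "pytest",
    "click",
    "sphinx",
    "seaborn",
    "skimage",
]


def missing_modules(names):
    """Return the names that cannot be found in the current Python environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(EXPECTED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All checked modules were found.")
```

Saved as `check_env.py` (a hypothetical file name), this could be run with `python check_env.py` after `conda activate coderefinery`. It only checks that the modules can be located, not that their versions match.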