docs(sessions): 📝 move Git basics into includes, plus do some revisions (#70)

## Description

Related to #61

<!-- Please delete as appropriate: -->
This PR needs an in-depth review.

## Checklist

- [x] Formatted Markdown
- [x] Ran `just run-all`

---------

Co-authored-by: Signe Kirk Brødbæk <40836345+signekb@users.noreply.github.com>
lwjohnst86 and signekb authored Feb 27, 2025
1 parent 1641f2d commit ec372ef
Showing 3 changed files with 124 additions and 81 deletions.
93 changes: 93 additions & 0 deletions includes/_git-basics.qmd
@@ -0,0 +1,93 @@
In our work lives, we regularly work with files, either creating,
editing, moving, copying, or deleting them. These files can be
anything from text documents, to images, to code. When we work on these
files, we often make changes to them, and sometimes many changes. We
might want to keep track of these changes, so we can see *what* we've
done, *when* we did it, *why* we did it, and *who* did it. This is helpful
both for potential collaborators and for our future selves.

If a program can "track changes" within a file, like Word does, you may
have used that feature before, but likely only when getting feedback
from others. On the file level, you may have "tracked changes"
informally by saving multiple versions of a file with different names,
like in the example image below.

![File naming in the commonly used *informal* 'version
control'.](/images/informal-version-control.jpg)

Does this way of saving files and keeping track of versions look
familiar? The above image may exaggerate what some people's versioning
looks like, but there is some truth to it: it is the most common
approach to "version control".

This "informal" version control isn't ideal because it involves multiple
copies of the same file. It makes it difficult to keep track of specific
changes and find the right version of the files.

There are, however, "formal" version control systems that automatically
manage changes to files. One of the world's most popular version control
systems is called
[Git](https://git-scm.com/book/en/v2/Getting-Started-What-is-Git%3F).
Git is used by millions of people around the world, including thousands
of organisations and researchers.

With Git you can create snapshots of file changes, known as *commits*. Each commit
captures:

- What specific changes were made to the file or files.
- Who made the changes to the files.
- When they made the changes to the files.

Each commit also has a short message attached to it that can
describe *why* the changes were made.

Git stores these commits in a history log. The history log allows you to
quickly go back and explore the changes made to files, along with a
message describing the changes. This is extremely useful when you
revisit your own work after a long time (because you *will* forget
things) and when you work in groups or with collaborators.
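
To make the idea of commits and the history log more concrete, here is a minimal sketch in Python. It is a toy model only, not how Git actually stores its data: the `Commit` class, the `format_log()` function, and the example authors and messages are all hypothetical and made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Commit:
    """A simplified, hypothetical stand-in for a Git commit."""

    changes: str  # what was changed
    author: str  # who made the change
    timestamp: datetime  # when the change was made
    message: str  # why the change was made


# The history log is modelled here as a plain list, oldest commit first.
history = [
    Commit(
        changes="Added introduction to thesis.txt",
        author="Researcher A",
        timestamp=datetime(2025, 1, 10, 9, 30, tzinfo=timezone.utc),
        message="Draft the introduction",
    ),
    Commit(
        changes="Rewrote second paragraph in thesis.txt",
        author="Researcher B",
        timestamp=datetime(2025, 1, 12, 14, 0, tzinfo=timezone.utc),
        message="Clarify the aims after supervisor feedback",
    ),
]


def format_log(commits: list[Commit]) -> list[str]:
    """Return one summary line per commit, newest first."""
    return [
        f"{c.timestamp:%Y-%m-%d} {c.author}: {c.message}"
        for c in reversed(commits)
    ]


for line in format_log(history):
    print(line)
```

Each entry answers the *what*, *who*, *when*, and *why* questions in one place, which is the information you would want when you return to a project months later.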

Git only tracks changes to files *within a specific folder* (and its
sub-folders). In Git terminology, this folder is called a
**repository** (or a *repo* for short). The best way to use a repository
is to store all files related to a specific project, like a research
project, in this repository (this folder). This way, you can track all
changes made to all files in the project. It keeps things more organised and
self-contained, since everything related to a project is in one place.

Any type of file can be stored in the repository, including both
code and non-code files like Word documents or images. However, Git has
more features and tools for tracking specific changes when the file is
text-based, like a `.txt` file, a `.csv` file, or code. Since these
text-based files contain only text characters, it is easier to track
changes to exact lines of text. In contrast, files like images or Word
documents (which aren't actually just text) have no "lines" to track
changes on.
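
As a rough illustration of why line-based tracking works well for text files, the sketch below compares two versions of a small text file, line by line, using Python's built-in `difflib` module. This is not Git itself, and the file contents and the `changed_lines()` helper are made up for the example, but the idea of reporting which lines were removed and which were added is similar to what Git shows you.

```python
import difflib

# Two hypothetical versions of the same text file, one list item per line.
old_version = [
    "Title: My analysis",
    "We recruited 100 participants.",
    "Results are pending.",
]
new_version = [
    "Title: My analysis",
    "We recruited 120 participants.",
    "Results are pending.",
]


def changed_lines(old, new):
    """Return only the lines that were removed (-) or added (+)."""
    diff = difflib.unified_diff(old, new, lineterm="")
    return [
        line
        for line in diff
        # Keep removed/added lines, but skip the "---"/"+++" header lines.
        if line.startswith(("-", "+"))
        and not line.startswith(("---", "+++"))
    ]


for line in changed_lines(old_version, new_version):
    print(line)
```

Only the second line differs between the two versions, so only that line is reported. An image or a Word document has no such lines to compare, so a tool can usually only say that the file changed, not exactly where.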

To understand how powerful formal version control like Git is, consider
these questions:

- How many files of different versions of a scientific document or
thesis do you have lying around after getting feedback from your
supervisor or co-authors?
- Have you ever wanted to test an analysis in a
file but ended up creating a new one to avoid modifying the original?
- Have you ever deleted something and wished you hadn't?
- Have you ever forgotten what you were doing on a project, or why you
chose a particular strategy or analysis?

All these problems can be fixed by using formal version control! There
are many good reasons to use version control, especially in science:

- Transparency of work done to demonstrate or substantiate your
scientific claim and protect against accusations of fraud.
- Claim to first discovery, since you have a time-stamped history of
your work.
- Evidence of contributions and work, since who does what is tracked.
- Easier collaboration, because you can work on a single file/folder
in a single central location rather than emailing file versions
around.
- Organised files and folders, since there is one single project
folder and one single version of each file, rather than multiple
versions of the same file.
- Less time spent on finding things related to your projects, because
everything is organised and in one place.
82 changes: 1 addition & 81 deletions preamble/pre-workshop.qmd
@@ -33,87 +33,7 @@ to the concepts of *version control*, *Git*, and *GitHub*.

### What is version control and Git? {#sec-what-is-version-control}

In our work lives, we often create and modify files. These files can be
anything from text documents, to images, to code. When we work on these
files, we often make changes to them, and sometimes many changes. We
might want to keep track of these changes, so we can see *what* we've
done, *when* we did it, and *why* we did it. This is both helpful for
potential collaborators and our future selves.

Many of us track changes to our files informally, by saving multiple versions of a
file with different names, like the example in @fig-file-naming-version-control-comics.

![File naming in the commonly used *informal* 'version
control'.](../images/informal-version-control.jpg){#fig-file-naming-version-control-comics}

Does this way of saving files and keeping track of versions look
familiar? While the above image is a bit of an exaggeration and teases a little, there is some truth to
it: it is the most common approach to "version control".

This form of "informal" version control isn't ideal because it involves
multiple copies of the same file. This approach also makes it difficult
to keep track of specific changes - and find the right version of our
files.

There are, however, "formal" version control systems that automatically
manage changes to files. One of the world's most popular version control
systems is called
[Git](https://git-scm.com/book/en/v2/Getting-Started-What-is-Git%3F).
Git is used by millions of people around the world, including thousands
of organisations and researchers.

Git creates snapshots of file changes, known as *commits*. Each commit
captures:

- the specific changes
- who made the changes
- when they made the changes

Each commit also has a short message attached to it that should describe
what changes were made and why.

Git stores these commits in a history log. The history log allows you to
quickly go back and see what and when a change happened along with the message
describing the changes. This is extremely useful when you revisit your
own work after a long time (because you *will* forget things) and when
you work in groups or with collaborators.

Git only tracks changes to files *within a project*. In Git terminology,
a project is called a **repository** (or a *repo* for short). The
repository is the place where you store all the files in your project.
This includes both code and other non-code based files like Word or
images. Though, Git does tend to be better at tracking specific changes in
text-based files, which makes sense since it's easier to track the
change in some lines of text, rather than changes to e.g., an image.

To understand how incredibly powerful "formal" version control like Git
is, think about these questions:

- How many files of different versions of a scientific document or
thesis do you have lying around after getting feedback from your
supervisor or co-authors?
- Have you ever wanted to test out an analysis with some code in a file and needed to make a
new file so that the original is not modified?
- Have you ever deleted something and wished you hadn't?
- Have you ever forgotten what you were doing on a project, or why you
chose a particular strategy or analysis?

All these problems can be fixed by using formal version control! There
are so many good reasons to use version control, especially in science:

- Transparency of work done to demonstrate or substantiate your
scientific claim.
- Claim to first discovery, since you have a time-stamped history of
your work.
- Defense against fraud, because of the transparency.
- Evidence of contributions and work, since who does what is tracked.
- Easy collaboration, because you can work on a single file/folder
rather than emailing file versions around.
- Organized files and folders, since there is one single project
folder and one single version of each file, rather than multiple
versions of the same file.
- Less time spent on finding things related to your projects, because
everything is organized and in one place.
{{< include /includes/_git-basics.qmd >}}

### What is GitHub then? {#sec-what-is-git-and-github}

30 changes: 30 additions & 0 deletions sessions/basics.qmd
@@ -29,6 +29,36 @@ engaging in the material. So:
2. For 4 minutes, pair up with your neighbour and take turns explaining
to them what you remember, 2 minutes each.

## :book: Reading task: What is version control and Git?

::: {.callout-note collapse="true"}
## :teacher: Instructor note

After they've read it, take some time to repeat some key points from the
text, such as:

- Emphasising how people usually version files.
- Highlighting that Git can track any file type, but that Git has more
features for text-based files.
- Reinforcing what "plain text" files are.
:::

**Time: \~5 minutes.**

::: callout-note
This is the same text you read for the pre-workshop tasks. Why are we
getting you to do a discussion activity to remember Git, and then
getting you to re-read the text? We're doing this because we know that
learning and understanding Git can be *very hard*, so we're giving you
repeated exposures to it. Git requires you to work with files in a
fundamentally different way than people usually do, so we've learned to
give learners lots of opportunities to understand what Git does and why
it is useful. These repeated exposures are meant to reinforce the
concepts.
:::

{{< include /includes/_git-basics.qmd >}}

{{< text_snippet sticky_up >}}

## :book: Reading task: What is a file path?

**Time: \~3 minutes.**