diff --git a/includes/_git-basics.qmd b/includes/_git-basics.qmd new file mode 100644 index 0000000..2932277 --- /dev/null +++ b/includes/_git-basics.qmd @@ -0,0 +1,93 @@ +In our work lives, we regularly work with files, either creating, +editing, moving, copying, or deleting them. These files can be +anything from text documents, to images, to code. When we work on these +files, we often make changes to them, and sometimes many changes. We +might want to keep track of these changes, so we can see *what* we've +done, *when* we did it, *why* we did it, and *who* did it. This is helpful both for +potential collaborators and our future selves. + +If a file has the ability to internally "track changes", like Word +does, you may have used that before, but likely only when getting +feedback from others. On the file level, you may have "tracked changes" +informally by saving multiple versions of a file with different names, +like in the example image below. + +![File naming in the commonly used *informal* 'version +control'.](/images/informal-version-control.jpg) + +Does this way of saving files and keeping track of versions look +familiar? The above image may exaggerate what some people's versioning looks +like, but there is some truth to it: It is the most common approach to +"version control". + +This "informal" version control isn't ideal because it involves multiple +copies of the same file. It makes it difficult to keep track of specific +changes and find the right version of the files. + +There are, however, "formal" version control systems that automatically +manage changes to files. One of the world's most popular version control +systems is called +[Git](https://git-scm.com/book/en/v2/Getting-Started-What-is-Git%3F). +Git is used by millions of people around the world, including thousands +of organisations and researchers. + +With Git you can create snapshots of file changes, known as *commits*. 
Each commit +captures: + +- What specific changes were made to the file or files. +- Who made the changes to the files. +- When they made the changes to the files. + +Each commit also has a short message attached to it that can +describe *why* the changes were made. + +Git stores these commits in a history log. The history log allows you to +quickly go back and explore the changes made to files, along with a +message describing the changes. This is extremely useful when you +revisit your own work after a long time (because you *will* forget +things) and when you work in groups or with collaborators. + +Git only tracks changes to files *within a specific folder* (and its +sub-folders). In Git terminology, this folder is called a +**repository** (or a *repo* for short). The best way to use a repository +is to store all files related to a specific project, like a research +project, in this repository (this folder). This way, you can track all +changes made to all files in the project. It keeps things more organised and +self-contained, since everything related to a project is in one place. + +Any type of file can be stored in the repository, including both +code and non-code files like Word documents or images. However, Git has +more features and tools for tracking specific changes when the file is +text-based, like a `.txt` or `.csv` file, or code. Since these text-based files +are literally only text characters, it is easier to track the changes to +exact lines of text. In files like images or Word documents (which +actually aren't just text), there are no "lines" to track changes on. + +To understand how powerful formal version control like Git is, consider these +questions: + +- How many files of different versions of a scientific document or + thesis do you have lying around after getting feedback from your + supervisor or co-authors? +- Have you ever wanted to test an analysis in a + file but ended up creating a new one to avoid modifying the original? 
+- Have you ever deleted something and wished you hadn't? +- Have you ever forgotten what you were doing on a project, or why you + chose a particular strategy or analysis? + +All these problems can be fixed by using formal version control! There +are many good reasons to use version control, especially in science: + +- Transparency of work done to demonstrate or substantiate your + scientific claim and protect against accusations of fraud. +- Claim to first discovery, since you have a time-stamped history of + your work. +- Evidence of contributions and work, since who does what is tracked. +- Easier collaboration, because you can work on a single file/folder + in a single central location rather than emailing file versions + around. +- Organized files and folders, since there is one single project + folder and one single version of each file, rather than multiple + versions of the same file. +- Less time spent on finding things related to your projects, because + everything is organized and in one place. diff --git a/preamble/pre-workshop.qmd b/preamble/pre-workshop.qmd index 94c82a9..c224d3b 100644 --- a/preamble/pre-workshop.qmd +++ b/preamble/pre-workshop.qmd @@ -33,87 +33,7 @@ to the concepts of *version control*, *Git*, and *GitHub*. ### What is version control and Git? {#sec-what-is-version-control} -In our work lives, we often create and modify files. These files can be -anything from text documents, to images, to code. When we work on these -files, we often make changes to them, and sometimes many changes. We -might want to keep track of these changes, so we can see *what* we've -done, *when* we did it, and *why* we did it. This is both helpful for -potential collaborators and our future selves. 
- -Many of us track changes to our files informally, by saving multiple versions of a -file with different names, like the example below in @fig-file-naming-version-control-comics - -![File naming in the commonly used *informal* 'version -control'.](../images/informal-version-control.jpg){#fig-file-naming-version-control-comics} - -Does this way of saving files and keeping track of versions look -familiar? While the above image is maybe a bit of an exaggeration and teasing a bit, there is some truth to -it: It is the most commonly approach to "version control". - -This form of "informal" version control isn't ideal because it involves -multiple copies of the same file. This approach also makes it difficult -to keep track of specific changes - and find the right version of our -files. - -There are, however, "formal" version control systems that automatically -manage changes to files. One of the world's most popular version control -systems is called -[Git](https://git-scm.com/book/en/v2/Getting-Started-What-is-Git%3F). -Git is used by millions of people around the world, including thousands -of organisations and researchers. - -Git creates snapshots of file changes, known as *commits*. Each commit -captures: - -- the specific changes -- who made the changes -- when they made the changes - -Each commit also has a short message attached to it that should describe -what changes were made and why. - -Git stores these commits in a history log. The history log allows you to -quickly go back and see what and when a change happened along with the message -describing the changes. This is extremely useful when you revisit your -own work after a long time (because you *will* forget things) and when -you work in groups or with collaborators. - -Git only tracks changes to files *within a project*. In Git terminology, -a project is called a **repository** (or a *repo* for short). The -repository is the place where you store all the files in your project. 
-This includes both code and other non-code based files like Word or -images. Though, Git does tend to be better at tracking specific changes in -text-based files, which makes sense since it's easier to track the -change in some lines of text, rather than changes to e.g., an image. - -To understand how incredibly powerful "formal" version control like Git -is, think about these questions: - -- How many files of different versions of a scientific documents or - thesis do you have laying around after getting feedback from your - supervisor or co-authors? -- Have you ever wanted to test out an analysis with some code in a file and need to make a - new file so that the original is not modified? -- Have you ever deleted something and wished you hadn't? -- Have you ever forgotten what you were doing on a project, or why you - chose a particular strategy or analysis? - -All these problems can be fixed by using formal version control! There -are so many good reasons to use version control, especially in science: - -- Transparency of work done to demonstrate or substantiate your - scientific claim. -- Claim to first discovery, since you have a time-stamped history of - your work. -- Defense against fraud, because of the transparency. -- Evidence of contributions and work, since who does what is tracked. -- Easy collaboration, because you can work on a single file/folder - rather than emailing file versions around. -- Organized files and folders, since there is one single project - folder and one single version of each file, rather than multiple - versions of the same file. -- Less time spent on finding things related to your projects, because - everything is organized and in one place. +{{< include /includes/_git-basics.qmd >}} ### What is GitHub then? {#sec-what-is-git-and-github} diff --git a/sessions/basics.qmd b/sessions/basics.qmd index f63723d..0d5628b 100644 --- a/sessions/basics.qmd +++ b/sessions/basics.qmd @@ -29,6 +29,36 @@ engaging in the material. So: 2. 
For 4 minutes, pair up with your neighbour and take turns explaining to them what you remember, 2 minutes each. +## :book: Reading task: What is version control and Git? + +::: {.callout-note collapse="true"} +## :teacher: Instructor note + +After they've read it, take some time to repeat some key points from the +text, such as: + +- Emphasising how people usually version files. +- Highlighting that Git can track any file type, but that Git has more + features for text-based files. +- Reinforcing what "plain text" files are. +::: + +**Time: \~5 minutes.** + +::: callout-note +This is the same text you read for the pre-workshop tasks. Why are we +getting you to do a discussion activity to remember Git, and then getting +you to re-read the text? We're doing this because we know that +learning and understanding Git can be *very hard*, so we're giving you +repeated exposure to it. Git requires you to work with files in a +fundamentally different way from how people usually work, so we've learned to give learners lots of opportunities +to understand what Git does and why it is useful. Repeating the material this way helps reinforce these concepts. +::: + +{{< include /includes/_git-basics.qmd >}} + +{{< text_snippet sticky_up >}} + ## :book: Reading task: What is a file path? **Time: \~3 minutes.**