diff --git a/02-data-visualisation.md b/02-data-visualisation.md
index 4549264d..93902a3b 100644
--- a/02-data-visualisation.md
+++ b/02-data-visualisation.md
@@ -42,7 +42,7 @@ The advantage of `ggplot2` is that it allows R users to create publication quali

 We will start by reading the data into R, from the `data` folder you placed them in the last part of the introduction.

-```r
+``` r
 penguins <- read.csv("data/penguins.csv")
 ```

@@ -81,7 +81,7 @@ The dataset contains the following fields:

 `ggplot2` is a core member of `tidyverse` family of packages. Installing and loading the package under the same name will load all of the packages we will need for this workshop. Lets get started!

-```r
+``` r
 # install.packages("tidyverse")
 library(tidyverse)
 ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
@@ -101,7 +101,7 @@ Here's a question that we would like to answer using `penguins` data: _Do pengui

 To plot `penguins`, run the following code in the R-chunk or in console. The following code will put `bill_depth_mm` on the x-axis and `bill_length_mm` on the y-axis:

-```r
+``` r
 ggplot(data = penguins) +
   geom_point(
     mapping = aes(x = bill_depth_mm,
@@ -138,7 +138,7 @@ The* `penguins` *dataset has a column called `year`, which should appear on the

 ## Solution

-```r
+``` r
 ggplot(data = penguins) +
   geom_point(
     mapping = aes(x = year,
@@ -160,7 +160,7 @@ Try a different `geom_` function called `geom_jitter`. How is that different fro

 ## Solution

-```r
+``` r
 ggplot(data = penguins) +
   geom_jitter(
     mapping = aes(x = year,
@@ -182,7 +182,7 @@ What if we want to combine graphs from the previous two challenges and show the

 The following graph maps `island` variable from `penguins` dataset to the `colour` aesthetic of the plot. Let's take a look:

-```r
+``` r
 ggplot(data = penguins) +
   geom_jitter(
     mapping = aes(x = bill_depth_mm,
@@ -203,7 +203,7 @@ What will happen if you switch colour to also be by year? 
Is the graph still use

 ## Solution

-```r
+``` r
 ggplot(data = penguins) +
   geom_jitter(
     mapping = aes(x = bill_depth_mm,
@@ -224,7 +224,7 @@ Island is categorical character variable with a discrete range of possible value

 There are other aesthetics that can come handy. One of them is `size`. The idea is that we can vary the size of data points to illustrate another continuous variable, such as species bill depth. Lets look at four dimensions at once!

-```r
+``` r
 ggplot(data = penguins) +
   geom_jitter(
     mapping = aes(x = bill_depth_mm,
@@ -241,7 +241,7 @@ Warning: Removed 2 rows containing missing values or values outside the scale ra

 It might be even better to try another type of aesthetic, like shape, for categorical data like species.

-```r
+``` r
 ggplot(data = penguins) +
   geom_jitter(
     mapping = aes(x = bill_depth_mm,
@@ -264,7 +264,7 @@ This is called "setting" in the ggplot2-world. We "map" aesthetics to data colum

 Here's our initial graph with all colours coloured in blue.

-```r
+``` r
 ggplot(data = penguins) +
   geom_point(
     mapping = aes(x = bill_depth_mm,
@@ -291,7 +291,7 @@ Change the transparency (alpha) of the data points by year.

 ## Solution

-```r
+``` r
 ggplot(data = penguins) +
   geom_point(
     mapping = aes(x = bill_depth_mm,
@@ -315,7 +315,7 @@ Move the transparency outside the `aes()` and set it to `0.5`. What can we benef

 ## Solution

-```r
+``` r
 ggplot(data = penguins) +
   geom_point(
     mapping = aes(x = bill_depth_mm,
@@ -339,7 +339,7 @@ Next, we will consider different options for `geoms`. Using different `geom_` fu

 A useful geom function is `geom_boxplot()`. It adds a layer with the "box and whiskers" plot illustrating the distribution of values within categories. The following chart breaks down bill length by island, where the box represents first and third quartile (the 25th and 75th percentiles), the middle bar signifies the median value and the whiskers extent to cover 95% confidence interval. 
Outliers (outside of the 95% confidence interval range) are shown separately.

-```r
+``` r
 ggplot(data = penguins) +
   geom_boxplot(
     mapping = aes(x = species,
@@ -354,7 +354,7 @@ Warning: Removed 2 rows containing non-finite outside the scale range

 Layers can be added on top of each other. In the following graph we will place the boxplots **over** jittered points to see the distribution of outliers more clearly. We can map two aesthetic properties to the same variable. Here we will also use different colour for each island.

-```r
+``` r
 ggplot(data = penguins) +
   geom_jitter(
     mapping = aes(x = species,
@@ -376,7 +376,7 @@ Warning: Removed 2 rows containing missing values or values outside the scale ra

 Now, this was slightly inefficient due to duplication of code - we had to specify the same mappings for two layers. To avoid it, you can move common arguments of `geom_` functions to the main `ggplot()` function. In this case every layer will "inherit" the same arguments, specified in the "parent" function.

-```r
+``` r
 ggplot(data = penguins,
        mapping = aes(x = island,
                      y = bill_length_mm)
@@ -396,7 +396,7 @@ You can still add layer-specific mappings or other arguments by specifying them

 We can use linear models to highlight differences in dependency between bill length and body mass by island. Notice that we added a separate argument to the `geom_smooth()` function to specify the type of model we want `ggplot2` to built using the data (linear model). The `geom_smooth()` function has also helpfully provided confidence intervals, indicating "goodness of fit" for each model (shaded gray area). 
For more information on statistical models, please refer to help (by typing `?geom_smooth`)

-```r
+``` r
 ggplot(data = penguins,
        mapping = aes(x = bill_depth_mm,
                      y = bill_length_mm)
@@ -422,7 +422,7 @@ Modify the plot so the the points are coloured by island, but there is a single

 ## Solution

-```r
+``` r
 ggplot(data = penguins,
        mapping = aes(x = bill_depth_mm,
                      y = bill_length_mm)) +
@@ -454,7 +454,7 @@ Add another geom!

 ## Solution

-```r
+``` r
 ggplot(penguins,
        aes(x = bill_depth_mm,
            y = bill_length_mm)) +
@@ -491,7 +491,7 @@ Lets say, the last plot we made, we want to understand if there are also differe

 In ggplot2, this is called a "facet", and the function we use is called either `facet_wrap` or `facet_grid`.

-```r
+``` r
 ggplot(penguins,
        aes(x = bill_depth_mm,
            y = bill_length_mm,
@@ -515,7 +515,7 @@ This plot looks a little crazy though, as we have penguins with missing sex info

 Let us swap the places of species and sex.

-```r
+``` r
 ggplot(penguins,
        aes(x = bill_depth_mm,
            y = bill_length_mm,
@@ -548,7 +548,7 @@ Add another facet variable with the `+`

 ## Solution

-```r
+``` r
 ggplot(penguins,
        aes(x = bill_depth_mm,
            y = bill_length_mm,
diff --git a/03-data-subsetting.md b/03-data-subsetting.md
index 92b9b4b4..40c38755 100644
--- a/03-data-subsetting.md
+++ b/03-data-subsetting.md
@@ -57,7 +57,7 @@ Let us first talk about selecting columns. In {dplyr}, the function name for sel

 We first need to make sure we have the tidyverse loaded and the penguins data set at hand.

-```r
+``` r
 library(tidyverse)
 penguins <- read_csv("data/penguins.csv")
 ```
@@ -65,11 +65,11 @@ penguins <- read_csv("data/penguins.csv")

 To select data, we must first tell select which data set we are selecting from, and then give it our selection. 
Here, we are asking R to `select()` from the `penguins` data set the `island`, `species` and `sex` columns

-```r
+``` r
 select(penguins, island, species, sex)
 ```

-```output
+``` output
 # A tibble: 344 × 3
   island species sex
@@ -89,11 +89,11 @@ select(penguins, island, species, sex)

 When we use `select()` we don't need to use quotations, we write in the names directly. We can also use the numeric indexes for the column, if we are 100% certain of the order of the columns:

-```r
+``` r
 select(penguins, 1:3, 6)
 ```

-```output
+``` output
 # A tibble: 344 × 4
   species island bill_length_mm body_mass_g
@@ -114,11 +114,11 @@ In some cases, we want to remove columns, and not necessarily state all columns

 Select also allows for this by adding a minus (`-`) sign in front of the column name you don't want.

-```r
+``` r
 select(penguins, -bill_length_mm, -bill_depth_mm)
 ```

-```output
+``` output
 # A tibble: 344 × 6
   species island flipper_length_mm body_mass_g sex year
@@ -143,11 +143,11 @@ Select the columns sex, year, and species from the penguins dataset.

 ## Solution

-```r
+``` r
 select(penguins, sex, year, species)
 ```

-```output
+``` output
 # A tibble: 344 × 3
   sex year species
@@ -176,11 +176,11 @@ Change your selection so that species comes before sex. What is the difference i

 ## Solution

-```r
+``` r
 select(penguins, species, sex, year)
 ```

-```output
+``` output
 # A tibble: 344 × 3
   species sex year
@@ -212,11 +212,11 @@ We could of course type them all out, but the penguins data set has names that m

 Here, we use a tidy-selector `ends_with()`, can you guess what it does? yes, it looks for columns that end with the string you provide it, here `"mm"`.

-```r
+``` r
 select(penguins, ends_with("mm"))
 ```

-```output
+``` output
 # A tibble: 344 × 3
   bill_length_mm bill_depth_mm flipper_length_mm
@@ -244,11 +244,11 @@ If you are working with a well named data set, these functions should make your

 Lets only pick the measurements of the bill, we are not so interested in the flipper. 
Then we might want to change to `starts_with()` in stead.

-```r
+``` r
 select(penguins, starts_with("bill"))
 ```

-```output
+``` output
 # A tibble: 344 × 2
   bill_length_mm bill_depth_mm
@@ -268,11 +268,11 @@ select(penguins, starts_with("bill"))

 The tidy selector can be combined with each other and other selectors. So you can build exactly the data you want!

-```r
+``` r
 select(penguins, island, species, year, starts_with("bill"))
 ```

-```output
+``` output
 # A tibble: 344 × 5
   island species year bill_length_mm bill_depth_mm
@@ -297,11 +297,11 @@ Select all columns containing an underscore ("_").

 ## Solution

-```r
+``` r
 select(penguins, contains("_"))
 ```

-```output
+``` output
 # A tibble: 344 × 4
   bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
@@ -331,11 +331,11 @@ Select the species and sex columns, in addition to all columns ending with "mm"

 ## Solution

-```r
+``` r
 select(penguins, species, sex, ends_with("mm"))
 ```

-```output
+``` output
 # A tibble: 344 × 5
   species sex bill_length_mm bill_depth_mm flipper_length_mm
@@ -364,11 +364,11 @@ De-select all the columns with bill measurements

 ## Solution

-```r
+``` r
 select(penguins, -starts_with("bill"))
 ```

-```output
+``` output
 # A tibble: 344 × 6
   species island flipper_length_mm body_mass_g sex year
@@ -395,11 +395,11 @@ The last tidy-selector we'll mention is `where()`. `where()` is a very special t

 Say you are running a correlation analysis. For correlations, you need all the columns in your data to be numeric, as you cannot correlate strings or categories. Going through each individual column and seeing if it is numeric is a bit of a chore. That is where `where()` comes in!

-```r
+``` r
 select(penguins, where(is.numeric))
 ```

-```output
+``` output
 # A tibble: 344 × 5
   bill_length_mm bill_depth_mm flipper_length_mm body_mass_g year
@@ -420,29 +420,29 @@ Magic! Let's break that down.

 `is.numeric()` is a function in R that checks if a vector is numeric. 
If the vector is numeric, it returns `TRUE` if not it returns `FALSE`.

-```r
+``` r
 is.numeric(5)
 ```

-```output
+``` output
 [1] TRUE
 ```

-```r
+``` r
 is.numeric("something")
 ```

-```output
+``` output
 [1] FALSE
 ```

 Let us look at the penguins data set again

-```r
+``` r
 penguins
 ```

-```output
+``` output
 # A tibble: 344 × 8
   species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
@@ -467,11 +467,11 @@ Species and island are factors, while bill columns are "double" which is a decim

 `where()` goes through all the columns and checks if they are numeric, and returns the ones that are.

-```r
+``` r
 select(penguins, where(is.numeric))
 ```

-```output
+``` output
 # A tibble: 344 × 5
   bill_length_mm bill_depth_mm flipper_length_mm body_mass_g year
@@ -496,11 +496,11 @@ Select only the columns that are factors from the `penguins` data set.

 ## Solution

-```r
+``` r
 select(penguins, where(is.factor))
 ```

-```output
+``` output
 # A tibble: 344 × 3
   species island sex
@@ -529,11 +529,11 @@ Select the columns `island`, `species`, as well as all numeric columns from the

 ## Solution

-```r
+``` r
 select(penguins, island, species, where(is.numeric))
 ```

-```output
+``` output
 # A tibble: 344 × 7
   island species bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
@@ -570,11 +570,11 @@ Now, using `is.numeric` on a row won't help, because every row-value in a column

 So what can we do? Well, we can check if the values meet certain criteria or not. Like values being above 20, or factors being a specific factor.

-```r
+``` r
 filter(penguins, body_mass_g < 3000)
 ```

-```output
+``` output
 # A tibble: 9 × 8
   species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
@@ -594,11 +594,11 @@ Here, we've filtered so that we only have observations where the body mass was l

 We can also filter for specific values, but beware! you must use double equals (`==`) for comparisons, as single equals (`=`) are for argument names in functions. 
-```r
+``` r
 filter(penguins, body_mass_g == 2900)
 ```

-```output
+``` output
 # A tibble: 4 × 8
   species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
@@ -620,11 +620,11 @@ Filter the data so you only have observations from the "Dream" island.

 ## Solution

-```r
+``` r
 filter(penguins, island == "Dream")
 ```

-```output
+``` output
 # A tibble: 124 × 8
   species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
@@ -653,11 +653,11 @@ Filter the data so you only have observations after 2008

 ## Solution

-```r
+``` r
 filter(penguins, year >= 2008)
 ```

-```output
+``` output
 # A tibble: 234 × 8
   species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
@@ -684,13 +684,13 @@ Many times, we will want to have several filters applied at once. What if you on

 `filter()` can take as many statements as you want! Combine them by adding commas (,) between each statement, and that will work as 'and'.

-```r
+``` r
 filter(penguins,
        species == "Chinstrap",
        body_mass_g < 3000)
 ```

-```output
+``` output
 # A tibble: 2 × 8
   species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
@@ -701,13 +701,13 @@ filter(penguins,

 You can also use the `&` sign, which in R is the comparison character for 'and', like `==` is for 'equals'.

-```r
+``` r
 filter(penguins,
        species == "Chinstrap" &
        body_mass_g < 3000)
 ```

-```output
+``` output
 # A tibble: 2 × 8
   species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
@@ -720,14 +720,14 @@ Here we are filtering the penguins data set keeping only the species "Chinstrap"

 And we can keep going! 
-```r
+``` r
 filter(penguins,
        species == "Chinstrap",
        body_mass_g < 3000,
        sex == "male")
 ```

-```output
+``` output
 # A tibble: 0 × 8
 # ℹ 8 variables: species , island , bill_length_mm ,
 #   bill_depth_mm , flipper_length_mm , body_mass_g , sex ,
@@ -742,13 +742,13 @@ Filter the data so you only have observations after 2008, and from "Biscoe" isla

 ## Solution

-```r
+``` r
 filter(penguins,
        year >= 2008,
        island == "Biscoe")
 ```

-```output
+``` output
 # A tibble: 124 × 8
   species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
@@ -777,13 +777,13 @@ Filter the data so you only have observations of male penguins of the Chinstrap

 ## Solution

-```r
+``` r
 filter(penguins,
        sex == "male",
        species == "Chinstrap")
 ```

-```output
+``` output
 # A tibble: 34 × 8
   species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
@@ -809,13 +809,13 @@ filter(penguins,

 But what if we want all the Chinstrap penguins **or** if body mass is below 3 kilos? When we use the comma (or the &), we make sure that all statements are `TRUE`. But what if we want it so that _either_ statement is true? Then we can use the **or** character `|` . 
-```r
+``` r
 filter(penguins,
        species == "Chinstrap" |
        body_mass_g < 3000)
 ```

-```output
+``` output
 # A tibble: 75 × 8
   species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
@@ -844,13 +844,13 @@ Filter the data so you only have observations of either male penguins or the Chi

 ## Solution

-```r
+``` r
 filter(penguins,
        sex == "male" |
        species == "Chinstrap")
 ```

-```output
+``` output
 # A tibble: 202 × 8
   species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
diff --git a/04-data-sorting-pipes.md b/04-data-sorting-pipes.md
index f5f2e6ce..5bcf77a2 100644
--- a/04-data-sorting-pipes.md
+++ b/04-data-sorting-pipes.md
@@ -35,7 +35,7 @@ To store the changes, we have to "assign" the data to a new object in the R envi

 We will now store a filtered version including only the chinstrap penguins, in an object we call `chinstraps`.

-```r
+``` r
 chinstraps <- filter(penguins, species == "Chinstrap")
 ```

@@ -44,11 +44,11 @@ You will likely notice that when we execute this command, nothing is output to t

 But you should be able to see the new chinstraps object in your environment, and when we type `chinstraps` in the R console, it prints our chinstraps data.

-```r
+``` r
 chinstraps
 ```

-```output
+``` output
 # A tibble: 68 × 8
   species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
@@ -69,12 +69,12 @@ chinstraps

 Maybe in this chinstrap data we are also not interested in the bill measurements, so we want to remove them. 
-```r
+``` r
 chinstraps <- select(chinstraps, -starts_with("bill"))
 chinstraps
 ```

-```output
+``` output
 # A tibble: 68 × 6
   species island flipper_length_mm body_mass_g sex year
@@ -101,7 +101,7 @@ Create a new data set called "biscoe", where you only have data from "Biscoe" is

 ## Solution

-```r
+``` r
 biscoe <- filter(penguins, island == "Biscoe")
 biscoe <- select(biscoe, 1:4)
 ```
@@ -121,7 +121,7 @@ The shortcut to insert the pipe operator is `Ctrl`+`Shift`+`M` for Windows/Linux

 In the `chinstraps` example, we had the following code to filter the rows and then select our columns.

-```r
+``` r
 chinstraps <- filter(penguins, species == "Chinstrap")
 chinstraps <- select(chinstraps, -starts_with("bill"))
 ```
@@ -140,7 +140,7 @@ When reading this part, read it as follows when typing:

 ::::::::::::::::::::::::::::::

-```r
+``` r
 chinstraps <- penguins |>
   filter(species == "Chinstrap") |>
   select(-starts_with("bill"))
@@ -161,11 +161,11 @@ Learning to read pipes is a great skill, R is not the only programming language

 We can do the entire pipe chain step by step to see what is happening.

-```r
+``` r
 penguins
 ```

-```output
+``` output
 # A tibble: 344 × 8
   species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
@@ -191,12 +191,12 @@ When reading this part, read it as follows when typing:

 ::::::::::::::::::::::::::::::

-```r
+``` r
 penguins |>
   filter(species == "Chinstrap")
 ```

-```output
+``` output
 # A tibble: 68 × 8
   species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
@@ -223,13 +223,13 @@ When reading this part, read it as follows when typing:

 ::::::::::::::::::::::::::::::

-```r
+``` r
 penguins |>
   filter(species == "Chinstrap") |>
   select(-starts_with("bill"))
 ```

-```output
+``` output
 # A tibble: 68 × 6
   species island flipper_length_mm body_mass_g sex year
@@ -261,7 +261,7 @@ And as before, we still are seeing the output of the command chain in the consol

 Let us do that, again using the assignment. 
-```r
+``` r
 chinstraps <- penguins |>
   filter(species == "Chinstrap") |>
   select(-starts_with("bill"))
@@ -269,7 +269,7 @@ chinstraps <- penguins |>
 chinstraps
 ```

-```output
+``` output
 # A tibble: 68 × 6
   species island flipper_length_mm body_mass_g sex year
@@ -294,13 +294,13 @@ Create a new data set called "biscoe", where you only have data from "Biscoe" is

 ## Solution

-```r
+``` r
 penguins |>
   filter(island == "Biscoe") |>
   select(1:4)
 ```

-```output
+``` output
 # A tibble: 168 × 4
   species island bill_length_mm bill_depth_mm
@@ -333,12 +333,12 @@ When reading this part, read it as follows when typing:

 ::::::::::::::::::::::::::::::

-```r
+``` r
 penguins |>
   arrange(island)
 ```

-```output
+``` output
 # A tibble: 344 × 8
   species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
@@ -370,12 +370,12 @@ When reading this part, read it as follows when typing:

 ::::::::::::::::::::::::::::::

-```r
+``` r
 penguins |>
   arrange(desc(island))
 ```

-```output
+``` output
 # A tibble: 344 × 8
   species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
@@ -402,12 +402,12 @@ Arrange the penguins data set by `body_mass_g`.

 ## Solution

-```r
+``` r
 penguins |>
   arrange(body_mass_g)
 ```

-```output
+``` output
 # A tibble: 344 × 8
   species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
@@ -435,12 +435,12 @@ penguins |>

 ## Solution

-```r
+``` r
 penguins |>
   arrange(desc(flipper_length_mm))
 ```

-```output
+``` output
 # A tibble: 344 × 8
   species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
@@ -467,12 +467,12 @@ You can arrange on multiple columns! 
Try arranging the penguins data set by asce
 :::::::::::::::::::::::::::::::::::::::: solution
 ## Solution

-```r
+``` r
 penguins |>
   arrange(island, desc(flipper_length_mm))
 ```

-```output
+``` output
 # A tibble: 344 × 8
   species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
@@ -509,13 +509,13 @@ When reading this part, read it as follows when typing:

 ::::::::::::::::::::::::::::::

-```r
+``` r
 penguins |>
   arrange(island) |>
   select(where(is.numeric))
 ```

-```output
+``` output
 # A tibble: 344 × 5
   bill_length_mm bill_depth_mm flipper_length_mm body_mass_g year
@@ -545,14 +545,14 @@ When reading this part, read it as follows when typing:

 ::::::::::::::::::::::::::::::

-```r
+``` r
 penguins |>
   arrange(island) |>
   select(island, where(is.numeric)) |>
   filter(sex == "male")
 ```

-```error
+``` error
 Error in `filter()`:
 ℹ In argument: `sex == "male"`.
 Caused by error:
@@ -579,14 +579,14 @@ Fix the previous code bit by applying one of the two solutions suggested.

 ## Solution

-```r
+``` r
 penguins |>
   arrange(island) |>
   select(sex, island, where(is.numeric)) |>
   filter(sex == "male")
 ```

-```output
+``` output
 # A tibble: 168 × 7
   sex island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g year
@@ -603,14 +603,14 @@ penguins |>
 # ℹ 158 more rows
 ```

-```r
+``` r
 penguins |>
   filter(sex == "male") |>
   arrange(island) |>
   select(island, where(is.numeric))
 ```

-```output
+``` output
 # A tibble: 168 × 6
   island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g year
diff --git a/05-data-plotting-scales.md b/05-data-plotting-scales.md
index 18b81491..d7e5439e 100644
--- a/05-data-plotting-scales.md
+++ b/05-data-plotting-scales.md
@@ -44,7 +44,7 @@ When reading this part, read it as follows when typing:

 ::::::::::::::::::::::::::::::

-```r
+``` r
 penguins |>
   filter(sex == "male") |>
   ggplot(aes(bill_length_mm)) +
@@ -68,7 +68,7 @@ Try geom_boxplot

 ## Solution

-```r
+``` r
 penguins |>
   filter(island == "Dream") |>
   ggplot(aes(x = species, y = 
flipper_length_mm)) +
@@ -96,7 +96,7 @@ When reading this part, read it as follows when typing:

 ::::::::::::::::::::::::::::::

-```r
+``` r
 penguins |>
   filter(island == "Dream") |>
   ggplot(aes(x = species, y = flipper_length_mm)) +
@@ -119,7 +119,7 @@ Learning the difference between using `fill` and `colour`/`color` can take a lit
 but in general colour gives colour to edges, while fill floods elements.

-```r
+``` r
 penguins |>
   filter(island == "Dream") |>
   ggplot(aes(x = species, y = flipper_length_mm)) +
@@ -148,7 +148,7 @@ We get lots of preview options, "brewer", "continuous", "gradient", too many opt
 There's one called `scale_fill_grey()` let us try that one for convenience!

-```r
+``` r
 penguins |>
   filter(island == "Dream") |>
   ggplot(aes(x = species, y = flipper_length_mm)) +
@@ -164,7 +164,7 @@ That won't do.
 Let us try something else.

-```r
+``` r
 penguins |>
   filter(island == "Dream") |>
   ggplot(aes(x = species, y = flipper_length_mm)) +
@@ -190,7 +190,7 @@ Change the colours to coral and cyan
 There are lots of these names, [datanovia](https://www.datanovia.com/en/blog/awesome-list-of-657-r-color-names/) has a great list of them

-```r
+``` r
 penguins |>
   filter(island == "Dream") |>
   ggplot(aes(x = species, y = flipper_length_mm)) +
@@ -215,7 +215,7 @@ red, blue and green. To explore colours in hexidecmial, there are lots of we res
 like [color-hex.com](https://www.color-hex.com/)

-```r
+``` r
 penguins |>
   filter(island == "Dream") |>
   ggplot(aes(x = species, y = flipper_length_mm)) +
@@ -239,7 +239,7 @@ what did that do?
 The order you provide the manual colours dictate which category gets which colour.

-```r
+``` r
 penguins |>
   filter(island == "Dream") |>
   ggplot(aes(x = species, y = flipper_length_mm)) +
@@ -268,14 +268,14 @@ The key difference between what we did before and this, is that the colouring sc
 is continuous, rather than categorical, so we need _slightly_ different versions. 
-```r
+``` r
 penguins |>
   ggplot(aes(x = bill_depth_mm, y = bill_length_mm)) +
   geom_point(aes(colour = bill_length_mm)) +
   scale_colour_viridis_c()
 ```

-```warning
+``` warning
 Warning: Removed 2 rows containing missing values or values outside the scale range
 (`geom_point()`).
 ```
@@ -283,14 +283,14 @@ Warning: Removed 2 rows containing missing values or values outside the scale ra

-```r
+``` r
 penguins |>
   ggplot(aes(x = bill_depth_mm, y = bill_length_mm)) +
   geom_point(aes(colour = bill_length_mm)) +
   scale_colour_gradientn(colours = c("#6597aa", "#cc6882"))
 ```

-```warning
+``` warning
 Warning: Removed 2 rows containing missing values or values outside the scale range
 (`geom_point()`).
 ```
@@ -309,7 +309,7 @@ The `theme()` functions are there to help you get control over how a plot looks.
 There are lots of different themes to choose from, that form a great basis for all you need.

-```r
+``` r
 penguins |>
   ggplot(aes(x = bill_depth_mm, y = bill_length_mm)) +
   geom_point(aes(colour = bill_length_mm)) +
@@ -317,7 +317,7 @@ penguins |>
   theme_minimal()
 ```

-```warning
+``` warning
 Warning: Removed 2 rows containing missing values or values outside the scale range
 (`geom_point()`).
 ```
@@ -340,7 +340,7 @@ Its very handy to have a short-cut to it.

-```r
+``` r
 penguins |>
   ggplot(aes(x = bill_depth_mm, y = bill_length_mm)) +
   geom_point(aes(colour = bill_length_mm)) +
@@ -348,7 +348,7 @@ penguins |>
   theme_classic()
 ```

-```warning
+``` warning
 Warning: Removed 2 rows containing missing values or values outside the scale range
 (`geom_point()`).
 ```
@@ -371,7 +371,7 @@ be good to use if you ever dwelve into the world of [generative art](https://blo

-```r
+``` r
 penguins |>
   ggplot(aes(x = bill_depth_mm, y = bill_length_mm)) +
   geom_point(aes(colour = bill_length_mm)) +
@@ -379,7 +379,7 @@ penguins |>
   theme_void()
 ```

-```warning
+``` warning
 Warning: Removed 2 rows containing missing values or values outside the scale range
 (`geom_point()`). 
 ```
diff --git a/06-data-manipulation.md b/06-data-manipulation.md
index 10258008..a24e5862 100644
--- a/06-data-manipulation.md
+++ b/06-data-manipulation.md
@@ -32,12 +32,12 @@ Often, the data we have do not contain exactly what we need. We might need to ch

 In {tidyverse}, when we add new variables, we use the `mutate()` function. Just like the other {tidyverse} functions, mutate work specifically with data sets, and provides a nice shorthand for working directly with the columns in the data set.

-```r
+``` r
 penguins |>
   mutate(new_var = 1)
 ```

-```output
+``` output
 # A tibble: 344 × 9
   species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
@@ -59,7 +59,7 @@ The output of this can be hard to spot, depending on the size of the screen.

 Let us for convenience create a subsetted data set to work on so we can easily see what we are doing.

-```r
+``` r
 penguins_s <- penguins |>
   select(1:3, starts_with("bill"))
 ```
@@ -67,12 +67,12 @@ penguins_s <- penguins |>

 Lets try our command again on this new data.

-```r
+``` r
 penguins_s |>
   mutate(new_var = 1)
 ```

-```output
+``` output
 # A tibble: 344 × 5
   species island bill_length_mm bill_depth_mm new_var
@@ -95,12 +95,12 @@ This is what we told `mutate()` to do! We specified a new column by name, and ga

 This works because its easy to assigning a single value to all rows. What if we try to give it three values? What would we expect?

-```r
+``` r
 penguins_s |>
   mutate(var = 1:3)
 ```

-```error
+``` error
 Error in `mutate()`:
 ℹ In argument: `var = 1:3`.
 Caused by error:
@@ -111,12 +111,12 @@ Here, it's failing with a mysterious message. The error is telling us that input

 So now we know the premises for mutate, it takes inputs that are either of the same length as there are rows in the data set or length 1. 
-```r
+``` r
 penguins_s |>
   mutate(var = 1:344)
 ```

-```output
+``` output
 # A tibble: 344 × 5
   species island bill_length_mm bill_depth_mm var
@@ -136,12 +136,12 @@ penguins_s |>

 But generally, we create new columns based on other data in the data set. So let's do a more useful example. For instance, perhaps we want to use the ratio between the bill length and depth as a measurement for a model.

-```r
+``` r
 penguins_s |>
   mutate(bill_ratio = bill_length_mm / bill_depth_mm)
 ```

-```output
+``` output
 # A tibble: 344 × 5
   species island bill_length_mm bill_depth_mm bill_ratio
@@ -163,13 +163,13 @@ So, here we have asked for the ratio between bill length and depth to be calcula

 We can do almost anything within a `mutate()` to get the values as we want them, also use functions that exist in R to transform the data. For instance, perhaps we want to scale the variables of interest to have a mean of 0 and standard deviation of 1, which is quite common to improve statistical modelling. We can do that with the `scale()` function.

-```r
+``` r
 penguins_s |>
   mutate(bill_ratio = bill_length_mm / bill_depth_mm,
          bill_length_mm_z = scale(bill_length_mm))
 ```

-```output
+``` output
 # A tibble: 344 × 6
   species island bill_length_mm bill_depth_mm bill_ratio bill_length_mm_z[,1]
@@ -194,12 +194,12 @@ Create a column where bill length is transformed to cm. To transform mm to cm, y

 ## Solution

-```r
+``` r
 penguins_s |>
   mutate(bill_length_cm = bill_length_mm / 10)
 ```

-```output
+``` output
 # A tibble: 344 × 5
   species island bill_length_mm bill_depth_mm bill_length_cm
@@ -227,12 +227,12 @@ Create a column for body mass in kilos, rather than grams, in the main penguins

 ## Solution

-```r
+``` r
 penguins |>
   mutate(body_mass_kg = body_mass_g / 1000)
 ```

-```output
+``` output
 # A tibble: 344 × 9
   species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
@@ -264,7 +264,7 @@ The `if_else()` function takes expressions, much like `filter()`. 
The first value after the expression is the value assigned if the expression is `TRUE`, while the second is if the expression is `FALSE`

-```r
+``` r
 penguin_weight <- penguins |>
   select(year, body_mass_g)

@@ -274,7 +274,7 @@ penguin_weight |>
                         false = "normal"))
 ```

-```output
+``` output
 # A tibble: 344 × 3
   year body_mass_g size
@@ -296,7 +296,7 @@ Now we have a column with two values, `large` and `normal` based on whether the

 We can for instance use that in a plot.

-```r
+``` r
 penguin_weight |>
   mutate(size = if_else(condition = body_mass_g > 4500,
                         true = "large",
@@ -305,7 +305,7 @@ penguin_weight |>
   geom_jitter(mapping = aes(x = year, y = body_mass_g, colour = size))
 ```

-```warning
+``` warning
 Warning: Removed 2 rows containing missing values or values outside the scale range
 (`geom_point()`).
 ```
@@ -324,7 +324,7 @@ Oh, it's starting to sound complicated, but it does not have to be!

 `mutate()` is so-called lazy-evaluated. This sounds weird, but it means that each new column you make is made in the sequence you make them. So as long as you think about the order of your `mutate()` creations, you can do that in a single mutate call.

-```r
+``` r
 penguins_s |>
   mutate(
     bill_ratio = bill_depth_mm / bill_length_mm,
@@ -334,7 +334,7 @@ penguins_s |>
   )
 ```

-```output
+``` output
 # A tibble: 344 × 6
   species island bill_length_mm bill_depth_mm bill_ratio bill_type
@@ -356,7 +356,7 @@ Now you've created two variables. One for `bill_ratio`, and then another one con

 If you switched the order of these two, R would produce an error, because there would be no bill ratio to create the other column.

-```r
+``` r
 penguins_s |>
   mutate(
     bill_ratio = bill_depth_mm / bill_length_mm,
@@ -367,7 +367,7 @@ penguins_s |>
   )
 ```

-```output
+``` output
 # A tibble: 344 × 6
   species island bill_length_mm bill_depth_mm bill_ratio bill_type
@@ -387,7 +387,7 @@ penguins_s |>

 But what if we want to categorize based on more than one condition? Nested `if_else()`? 
-```r +``` r penguins_s |> mutate( bill_ratio = bill_depth_mm / bill_length_mm, @@ -398,7 +398,7 @@ penguins_s |> false = "stumped"))) ``` -```output +``` output # A tibble: 344 × 6 species island bill_length_mm bill_depth_mm bill_ratio bill_type @@ -420,7 +420,7 @@ Thankfully, {dplyr} has a smarter way of doing this, called `case_when()`. This On the left you have the logical expression, and on the right of the tilde (`~`) is the value to be assigned if that expression is `TRUE`. -```r +``` r penguins_s |> mutate( bill_ratio = bill_depth_mm / bill_length_mm, @@ -435,7 +435,7 @@ penguins_s |> geom_point() ``` -```warning +``` warning Warning: Removed 2 rows containing missing values or values outside the scale range (`geom_point()`). ``` @@ -448,7 +448,7 @@ That looks almost the same. The `NA`'s are gone! That's not right. We cannot cat `case_when()`, like `mutate()`, evaluates the expressions in sequence. This is why we can have two statements evaluating the same column with similar expressions (below 0.35 and then below 0.45). All values that are below 0.35 are also below 0.45. Since we first assign everything below 0.35, and only then test for below 0.45, the two statements do not collide. We can do the same for our last statement, saying that all values that are not `NA` should be given this category. -```r +``` r penguins |> mutate( bill_ratio = bill_depth_mm / bill_length_mm, @@ -463,7 +463,7 @@ penguins |> geom_point() ``` -```warning +``` warning Warning: Removed 2 rows containing missing values or values outside the scale range (`geom_point()`).
``` @@ -480,12 +480,12 @@ Create a column named `bill_ld_ratio_log` that is the natural logarithm (using t ## Solution -```r +``` r penguins |> mutate(bill_ld_ratio_log = log(bill_length_mm / bill_depth_mm)) ``` -```output +``` output # A tibble: 344 × 9 species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g @@ -514,7 +514,7 @@ Create a new column called `body_type`, where animals below 3 kg are `small`, an ## Solution -```r +``` r penguins |> mutate( body_type = case_when( @@ -527,7 +527,7 @@ penguins |> ) ``` -```output +``` output # A tibble: 344 × 10 species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g diff --git a/07-data-reshaping.md b/07-data-reshaping.md index 26302a05..c18bd358 100644 --- a/07-data-reshaping.md +++ b/07-data-reshaping.md @@ -63,12 +63,12 @@ As before, we need to start off by making sure we have the tidyverse package loa In tidyverse, there is a single function to create longer data sets, called `pivot_longer`. Those of you who might have some prior experience with tidyverse, or you might encounter it when googling for help, might have seen the `gather` function. This is an older function of similar capabilities which we will not cover here, as the `pivot_longer` function supersedes it. -```r +``` r penguins |> pivot_longer(contains("_")) ``` -```output +``` output # A tibble: 1,376 × 6 species island sex year name value @@ -91,7 +91,7 @@ So before, the data was wider, in that each of the columns with `_` had their ow Why would we want to do that? Well, perhaps we want to plot all the variables in a single ggplot call? Now that the measurement types are collected in these two ways, we can facet over the `name` column to create a sub-plot per measurement type! 
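As a minimal sketch of what the pivot does before any plotting, the reshaping can be tried on a tiny made-up table. The `toy` data frame and its values are hypothetical and only stand in for `penguins`.

```r
library(tidyr)

# Hypothetical toy data: two measurement columns whose names contain "_"
toy <- data.frame(
  id = c("a", "b"),
  bill_length_mm = c(39.1, 46.5),
  body_mass_g = c(3750, 5000)
)

# Every column matching contains("_") is stacked into a name/value pair
toy_long <- toy |>
  pivot_longer(contains("_"))

toy_long
```

Each original row now appears once per pivoted column, which is what lets a facet split the plot on `name`.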
-```r +``` r penguins |> pivot_longer(contains("_")) |> ggplot(aes(y = value, @@ -101,7 +101,7 @@ penguins |> facet_wrap(~name, scales = "free_y") ``` -```warning +``` warning Warning: Removed 8 rows containing non-finite outside the scale range (`stat_boxplot()`). ``` @@ -118,12 +118,12 @@ Pivot longer all columns ending with "mm" . ## Solution -```r +``` r penguins |> pivot_longer(ends_with("mm")) ``` -```output +``` output # A tibble: 1,032 × 7 species island body_mass_g sex year name value @@ -151,12 +151,12 @@ Pivot the penguins data so that all the bill measurements are in the same column ## Solution -```r +``` r penguins |> pivot_longer(starts_with("bill")) ``` -```output +``` output # A tibble: 688 × 8 species island flipper_length_mm body_mass_g sex year name value @@ -184,12 +184,12 @@ As mentioned, pivot_longer accepts tidy-selectors. Pivot longer all numerical co ## Solution -```r +``` r penguins |> pivot_longer(where(is.numeric)) ``` -```output +``` output # A tibble: 1,720 × 5 species island sex name value @@ -215,14 +215,14 @@ penguins |> While often you can get away with leaving the default naming of the two columns as is, especially if you are just doing something quick like making a plot, most times you will likely want to control the names of your two new columns. -```r +``` r penguins |> pivot_longer(contains("_"), names_to = "columns", values_to = "content") ``` -```output +``` output # A tibble: 1,376 × 6 species island sex year columns content @@ -244,14 +244,14 @@ Here, we change the "names" to "columns" and "values" to "content". The pivot de But we have even more power in the renaming of columns. Pivots actually have quite a lot of options, making it possible for us to create outputs looking just like we want. Notice how the names of the columns we pivoted follow a specific structure. First is the name of the body part, then the type of measurement, then the unit of the measurement. This clear logic we can use to our advantage. 
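The part/measure/unit structure described above can be sketched on a small made-up table first. The `toy` data frame is hypothetical; the `names_to` and `names_sep` arguments are used the same way as in the lesson.

```r
library(tidyr)

# Hypothetical toy data with column names of the form part_measure_unit
toy <- data.frame(
  id = 1:2,
  bill_length_mm = c(39.1, 46.5),
  bill_depth_mm = c(18.7, 15.0)
)

# Split each column name on "_" into three new columns
toy_parts <- toy |>
  pivot_longer(contains("_"),
               names_to = c("part", "measure", "unit"),
               names_sep = "_")

toy_parts
```

This only works cleanly when every pivoted column name splits into the same number of pieces as there are entries in `names_to`.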
-```r +``` r penguins |> pivot_longer(contains("_"), names_to = c("part", "measure" , "unit"), names_sep = "_") ``` -```output +``` output # A tibble: 1,376 × 8 species island sex year part measure unit value @@ -278,14 +278,14 @@ Pivot longer all the bill measurements, and alter the names in one go, so that t ## Solution -```r +``` r penguins |> pivot_longer(starts_with("bill"), names_to = c("part", "measure" , "unit"), names_sep = "_") ``` -```output +``` output # A tibble: 688 × 10 species island flipper_length_mm body_mass_g sex year part measure unit @@ -314,13 +314,13 @@ Pivot longer all the bill measurements, and use the `names_prefix` argument. Giv ## Solution -```r +``` r penguins |> pivot_longer(starts_with("bill"), names_prefix = "bill_") ``` -```output +``` output # A tibble: 688 × 8 species island flipper_length_mm body_mass_g sex year name value @@ -348,7 +348,7 @@ Pivot longer all the bill measurements, and use the `names_prefix`, `names_to` a ## Solution -```r +``` r penguins |> pivot_longer(starts_with("bill"), names_prefix = "bill_", @@ -356,7 +356,7 @@ penguins |> names_sep = "_") ``` -```output +``` output # A tibble: 688 × 9 species island flipper_length_mm body_mass_g sex year bill_measure unit @@ -384,13 +384,13 @@ When pivoting, it is common that quite some `NA` values appear in the values col We can remove these immediately by making the argument `values_drop_na` be `TRUE` -```r +``` r penguins |> pivot_longer(starts_with("bill"), values_drop_na = TRUE) ``` -```output +``` output # A tibble: 684 × 8 species island flipper_length_mm body_mass_g sex year name value @@ -412,7 +412,7 @@ This extra argument will ensure that all `NA` values in the `value` column are r We should put everything together and create a new object that is our long formatted penguin data set. 
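Before doing so, the effect of `values_drop_na` can be checked on a tiny made-up table. This is a sketch under the assumption that one observation has missing bill measurements; the `toy` data is hypothetical.

```r
library(tidyr)

# Hypothetical toy data: the second observation has no bill measurements
toy <- data.frame(
  id = 1:3,
  bill_length_mm = c(39.1, NA, 46.5),
  bill_depth_mm = c(18.7, NA, 15.0)
)

# Default behaviour keeps one row per original cell, including the NA cells
with_na <- toy |>
  pivot_longer(starts_with("bill"))

# values_drop_na = TRUE removes the rows whose value is NA
without_na <- toy |>
  pivot_longer(starts_with("bill"), values_drop_na = TRUE)

nrow(with_na)
nrow(without_na)
```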
-```r +``` r penguins_long <- penguins |> pivot_longer(contains("_"), names_to = c("part", "measure" , "unit"), @@ -421,7 +421,7 @@ penguins_long <- penguins |> penguins_long ``` -```output +``` output # A tibble: 1,368 × 8 species island sex year part measure unit value @@ -446,13 +446,13 @@ You will also experience that this skill can be convenient when creating data su Before we start using the penguins_longer dataset we made, let us make another simpler longer data set, for the first look a the pivor wider function. -```r +``` r penguins_long_simple <- penguins |> pivot_longer(contains("_")) penguins_long_simple ``` -```output +``` output # A tibble: 1,376 × 6 species island sex year name value @@ -474,13 +474,13 @@ penguins_long_simple If we want to make this wider again we can try the following: -```r +``` r penguins_long_simple |> pivot_wider(names_from = name, values_from = value) ``` -```warning +``` warning Warning: Values from `value` are not uniquely identified; output will contain list-cols. • Use `values_fn = list` to suppress this warning. • Use `values_fn = {summary_fun}` to summarise duplicates. @@ -491,7 +491,7 @@ Warning: Values from `value` are not uniquely identified; output will contain li dplyr::filter(n > 1L) ``` -```output +``` output # A tibble: 35 × 8 species island sex year bill_length_mm bill_depth_mm flipper_length_mm @@ -516,11 +516,11 @@ Lets look at the warning message our code gave us and see if we can figure it ou yikes! That's super annoying. Let's go back to our penguins data set and see if we can do something to help. -```r +``` r penguins ``` -```output +``` output # A tibble: 344 × 8 species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g @@ -543,14 +543,14 @@ We can remedy that by adding row numbers to the original data before we pivot. T By doing a mutate adding the row number to the data set, we should then have a clear variable identifying each observation. 
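A minimal sketch of why the row number helps, using a made-up table in which two observations are identical in every non-measurement column. The `toy` tibble is hypothetical; `sample` is the same identifier name the lesson uses.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: species alone cannot tell the two rows apart
toy <- tibble(
  species = c("Adelie", "Adelie"),
  bill_length_mm = c(39.1, 40.2),
  bill_depth_mm = c(18.7, 17.9)
)

# Adding a row number before pivoting gives each observation a unique key,
# so pivot_wider() can put every value back in its own row
toy_wide <- toy |>
  mutate(sample = row_number()) |>
  pivot_longer(contains("_")) |>
  pivot_wider(names_from = name, values_from = value)

toy_wide
```

Without the `sample` column, both rows would share the key `species = "Adelie"` and the widened columns would become list-columns.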
-```r +``` r penguins_long_simple <- penguins |> mutate(sample = row_number()) |> pivot_longer(contains("_")) penguins_long_simple ``` -```output +``` output # A tibble: 1,376 × 7 species island sex year sample name value @@ -577,13 +577,13 @@ Turn the penguins_long_simple dataset back to its original state ## Solution -```r +``` r penguins_long_simple |> pivot_wider(names_from = name, values_from = value) ``` -```output +``` output # A tibble: 344 × 9 species island sex year sample bill_length_mm bill_depth_mm @@ -611,7 +611,7 @@ And now it worked! Now, the remaining columns were able to uniquely identify whi We should re-create our penguins long data set, to make sure we don't have this problem again. -```r +``` r penguins_long <- penguins |> mutate(sample = row_number()) |> pivot_longer(contains("_"), @@ -621,7 +621,7 @@ penguins_long <- penguins |> penguins_long ``` -```output +``` output # A tibble: 1,368 × 9 species island sex year sample part measure unit value @@ -642,14 +642,14 @@ Much as the first example of pivot_longer, pivot_wider in its simplest form is r Like pivot_longer, pivot_wider has arguments that will let us get back to the original state, with much of the same syntax as with pivot_longer! -```r +``` r penguins_long |> pivot_wider(names_from = c("part", "measure", "unit"), names_sep = "_", values_from = value) ``` -```output +``` output # A tibble: 342 × 9 species island sex year sample bill_length_mm bill_depth_mm diff --git a/08-data-summaries.md b/08-data-summaries.md index 0e936b6f..eeaf0cd1 100644 --- a/08-data-summaries.md +++ b/08-data-summaries.md @@ -33,7 +33,7 @@ Let us start from the beginning with summaries, and work our way up to the more First, we must again prepare our workspace with our packages and data. -```r +``` r library(tidyverse) penguins <- palmerpenguins::penguins ``` @@ -41,12 +41,12 @@ penguins <- palmerpenguins::penguins We should start to feel quite familiar with our penguins by now. 
Let us start by finding the mean of the bill length -```r +``` r penguins |> summarise(bill_length_mean = mean(bill_length_mm)) ``` -```output +``` output # A tibble: 1 × 1 bill_length_mean @@ -61,12 +61,12 @@ Right now, we will ignore those. We can omit these by adding the `na.rm = TRUE` argument, which will remove all `NA`'s before calculating the mean. -```r +``` r penguins |> summarise(bill_length_mean = mean(bill_length_mm, na.rm = TRUE)) ``` -```output +``` output # A tibble: 1 × 1 bill_length_mean @@ -76,13 +76,13 @@ penguins |> An alternative way to remove missing values from a column is to pass the column to {tidyr}'s `drop_na()` function. -```r +``` r penguins |> drop_na(bill_length_mm) |> summarise(bill_length_mean = mean(bill_length_mm)) ``` -```output +``` output # A tibble: 1 × 1 bill_length_mean @@ -91,7 +91,7 @@ penguins |> -```r +``` r penguins |> drop_na(bill_length_mm) |> summarise(bill_length_mean = mean(bill_length_mm), @@ -99,7 +99,7 @@ penguins |> bill_length_max = max(bill_length_mm)) ``` -```output +``` output # A tibble: 1 × 3 bill_length_mean bill_length_min bill_length_max @@ -114,13 +114,13 @@ penguins |> ## Solution -```r +``` r penguins |> drop_na(body_mass_g) |> summarise(body_mass_kg_mean = mean(body_mass_g / 1000)) ``` -```output +``` output # A tibble: 1 × 1 body_mass_kg_mean @@ -137,7 +137,7 @@ penguins |> :::::::::::::::::::::::::::::::::::::::: solution ## Solution -```r +``` r penguins |> drop_na(body_mass_g) |> summarise( @@ -146,7 +146,7 @@ penguins |> ) ``` -```output +``` output # A tibble: 1 × 2 body_mass_kg_mean body_mass_kg_sd @@ -165,7 +165,7 @@ penguins |> ## Solution -```r +``` r penguins |> drop_na(body_mass_g, flipper_length_mm) |> summarise( @@ -176,7 +176,7 @@ penguins |> ) ``` -```output +``` output # A tibble: 1 × 4 body_mass_kg_mean body_mass_kg_sd flipper_length_cm_mean flipper_length_cm_sd @@ -202,13 +202,13 @@ Because, once we know how to summarize data, summarizing data by groups is as si Let us start with 
our first example of getting the mean of a single column. -```r +``` r penguins |> drop_na(body_mass_g) |> summarise(body_mass_g_mean = mean(body_mass_g)) ``` -```output +``` output # A tibble: 1 × 1 body_mass_g_mean @@ -219,14 +219,14 @@ Here, we are getting a single mean for the entire data set. In order to get, for instance the means of each of the species, we can group the data set by species before we summarize. -```r +``` r penguins |> drop_na(body_mass_g) |> group_by(species) |> summarise(body_mass_kg_mean = mean(body_mass_g / 1000)) ``` -```output +``` output # A tibble: 3 × 2 species body_mass_kg_mean @@ -239,7 +239,7 @@ And now we suddenly have three means! And they are tidily collected in each thei To this we can keep adding as we did before. -```r +``` r penguins |> drop_na(body_mass_g) |> group_by(species) |> @@ -250,7 +250,7 @@ penguins |> ) ``` -```output +``` output # A tibble: 3 × 4 species body_mass_kg_mean body_mass_kg_min body_mass_kg_max @@ -269,13 +269,13 @@ But what if you want a really quick count of all the records in different groups One way, would be to use the summarise function together with the `n()` function, which counts the number of rows in each group. -```r +``` r penguins |> group_by(species) |> summarise(n = n()) ``` -```output +``` output # A tibble: 3 × 2 species n @@ -289,13 +289,13 @@ But if all you want is the frequency table, we would suggest using the functions They are synonymous in what they do, so you can choose the one that feels more appropriate. 
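As a small sketch on made-up data, `count()` can also take the grouping column directly, which gives the same frequencies as grouping first and then calling `tally()`. The `toy` tibble is hypothetical.

```r
library(dplyr)

# Hypothetical toy data
toy <- tibble(species = c("Adelie", "Adelie", "Gentoo"))

# Group first, then tally the rows in each group
by_tally <- toy |>
  group_by(species) |>
  tally()

# count() with a column name groups, counts and ungroups in one step
by_count <- toy |>
  count(species)

by_tally
by_count
```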
-```r +``` r penguins |> group_by(species) |> tally() ``` -```output +``` output # A tibble: 3 × 2 species n @@ -304,13 +304,13 @@ penguins |> 3 Gentoo 124 ``` -```r +``` r penguins |> group_by(species) |> count() ``` -```output +``` output # A tibble: 3 × 2 # Groups: species [3] species n @@ -330,7 +330,7 @@ These are two really nice convenience functions for getting a quick frequency ta ## Solution -```r +``` r penguins |> drop_na(bill_length_mm) |> group_by(island) |> @@ -340,7 +340,7 @@ penguins |> ) ``` -```output +``` output # A tibble: 3 × 3 island bill_length_mm_mean bill_length_mm_sd @@ -361,7 +361,7 @@ penguins |> ## Solution -```r +``` r penguins |> drop_na(bill_length_mm) |> group_by(island, sex) |> @@ -371,12 +371,12 @@ penguins |> ) ``` -```output +``` output `summarise()` has grouped output by 'island'. You can override using the `.groups` argument. ``` -```output +``` output # A tibble: 9 × 4 # Groups: island [3] island sex bill_length_mm_mean bill_length_mm_sd @@ -404,13 +404,13 @@ But in many cases we might continue our merry data handling way and do lots more preserving of the grouping can give us some unexpected results. Let us explore that a little. -```r +``` r penguins |> group_by(species) |> count() ``` -```output +``` output # A tibble: 3 × 2 # Groups: species [3] species n @@ -425,13 +425,13 @@ In a way, the `summarize()` uses up one group while summarizing, as based on spe When we group by two columns, it actually has the same behavior. -```r +``` r penguins |> group_by(species, island) |> count() ``` -```output +``` output # A tibble: 5 × 3 # Groups: species, island [5] species island n @@ -449,14 +449,14 @@ Lets say we want a column now, that counts the total number of penguins observat That would be the sum of the "n" column. 
-```r +``` r penguins |> group_by(species, island) |> count() |> mutate(total = sum(n)) ``` -```output +``` output # A tibble: 5 × 4 # Groups: species, island [5] species island n total @@ -471,7 +471,7 @@ penguins |> But that is not what we are expecting! why? Because the data is still grouped by species, it is now taking the sum within each species, rather than the whole. To get the whole we need first to `ungroup()`, and then try again. -```r +``` r penguins |> group_by(species, island) |> count() |> @@ -479,7 +479,7 @@ penguins |> mutate(total = sum(n)) ``` -```output +``` output # A tibble: 5 × 4 species island n total @@ -499,7 +499,7 @@ then add another column that has the mean for all the data ## Solution -```r +``` r penguins |> drop_na(bill_length_mm) |> group_by(island, sex) |> @@ -511,12 +511,12 @@ penguins |> mutate(mean = mean(bill_length_mm_mean)) ``` -```output +``` output `summarise()` has grouped output by 'island'. You can override using the `.groups` argument. ``` -```output +``` output # A tibble: 9 × 5 island sex bill_length_mm_mean bill_length_mm_sd mean @@ -544,7 +544,7 @@ But, often, grouping before data manipulation can unlock great new possibilities Let us use the data we made where we summarised the body mass of penguins in kilograms, and let us group by species and sex. -```r +``` r penguins |> drop_na(body_mass_g) |> group_by(species, sex) |> @@ -555,12 +555,12 @@ penguins |> ) ``` -```output +``` output `summarise()` has grouped output by 'species'. You can override using the `.groups` argument. ``` -```output +``` output # A tibble: 8 × 5 # Groups: species [3] species sex body_mass_kg_mean body_mass_kg_min body_mass_kg_max @@ -581,7 +581,7 @@ We would need the species mean, in addition to the species sex means. We can add this, as the data is already grouped by sex, with a mutate. 
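Why the `mutate()` works per species can be sketched on made-up data: after `summarise()`, the last grouping level (`sex`) is dropped and the result stays grouped by `species`, as the `# Groups: species [3]` header in the output above indicates. The `toy` tibble and its values are hypothetical, and `.groups = "drop_last"` only spells out the default behaviour.

```r
library(dplyr)

# Hypothetical toy data
toy <- tibble(
  species = c("A", "A", "A", "B", "B"),
  sex     = c("f", "m", "m", "f", "m"),
  mass    = c(3, 4, 5, 6, 8)
)

toy_sum <- toy |>
  group_by(species, sex) |>
  summarise(mass_mean = mean(mass), .groups = "drop_last")

# Only species is left as a grouping variable
group_vars(toy_sum)

# mutate() therefore averages the sex-level means within each species
toy_sum <- toy_sum |>
  mutate(species_mean = mean(mass_mean))

toy_sum
```

Note that `species_mean` here is the mean of the group means, not the mean of all individual measurements.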
-```r +``` r penguins |> drop_na(body_mass_g) |> group_by(species, sex) |> @@ -595,12 +595,12 @@ penguins |> ) ``` -```output +``` output `summarise()` has grouped output by 'species'. You can override using the `.groups` argument. ``` -```output +``` output # A tibble: 8 × 6 # Groups: species [3] species sex body_mass_kg_mean body_mass_kg_min body_mass_kg_max species_mean @@ -621,7 +621,7 @@ So, in the same data set, we have everything we need to calculate the relative d -```r +``` r penguins |> drop_na(body_mass_g) |> group_by(species, sex) |> @@ -636,12 +636,12 @@ penguins |> ) ``` -```output +``` output `summarise()` has grouped output by 'species'. You can override using the `.groups` argument. ``` -```output +``` output # A tibble: 8 × 7 # Groups: species [3] species sex body_mass_kg_mean body_mass_kg_min body_mass_kg_max species_mean @@ -667,7 +667,7 @@ Calculate the difference in flipper length between the different species of peng ## Solution -```r +``` r penguins |> drop_na(flipper_length_mm) |> group_by(species) |> @@ -680,7 +680,7 @@ penguins |> ) ``` -```output +``` output # A tibble: 3 × 4 species flipper_mean species_mean flipper_species_diff @@ -700,7 +700,7 @@ Calculate the difference in flipper length between the different species of peng ## Solution -```r +``` r penguins |> drop_na(flipper_length_mm) |> group_by(species, sex) |> @@ -713,12 +713,12 @@ penguins |> ) ``` -```output +``` output `summarise()` has grouped output by 'species'. You can override using the `.groups` argument. ``` -```output +``` output # A tibble: 8 × 5 # Groups: species [3] species sex flipper_mean species_mean flipper_species_diff diff --git a/09-data-complex-pipelines.md b/09-data-complex-pipelines.md index da32505c..19a2d481 100644 --- a/09-data-complex-pipelines.md +++ b/09-data-complex-pipelines.md @@ -34,12 +34,12 @@ We've learned about summaries and grouped summaries. Can you think of a way we can do that using the things we've learned? 
-```r +``` r penguins |> pivot_longer(contains("_")) ``` -```output +``` output # A tibble: 1,376 × 6 species island sex year name value @@ -60,14 +60,14 @@ We've done this before, why is it a clue now? Now that we have learned grouping what if we now also group by the new name column to get summaries for each column as a row already here! -```r +``` r penguins |> pivot_longer(contains("_")) |> group_by(name) |> summarise(mean = mean(value, na.rm = TRUE)) ``` -```output +``` output # A tibble: 4 × 2 name mean @@ -79,7 +79,7 @@ penguins |> Now we are talking! Now we have the mean of each of our observational columns! Lets add other common summary statistics. -```r +``` r penguins |> pivot_longer(contains("_")) |> group_by(name) |> @@ -91,7 +91,7 @@ penguins |> ) ``` -```output +``` output # A tibble: 4 × 5 name mean sd min max @@ -104,7 +104,7 @@ penguins |> That's a pretty neat table! The repetition of `na.rm = TRUE` in all is a little tedious, though. Let us use an extra argument in the pivot longer to remove `NA`s in the value column -```r +``` r penguins |> pivot_longer(contains("_")) |> drop_na(value) |> @@ -117,7 +117,7 @@ penguins |> ) ``` -```output +``` output # A tibble: 4 × 5 name mean sd min max @@ -141,7 +141,7 @@ Try the `n()` function. ## Solution -```r +``` r penguins |> pivot_longer(contains("_")) |> drop_na(value) |> @@ -155,7 +155,7 @@ penguins |> ) ``` -```output +``` output # A tibble: 4 × 6 name mean sd min max n @@ -177,7 +177,7 @@ Try grouping by more variables, like species and island, is the output what you ## Solution -```r +``` r penguins |> pivot_longer(contains("_")) |> drop_na(value) |> @@ -191,12 +191,12 @@ penguins |> ) ``` -```output +``` output `summarise()` has grouped output by 'name', 'species'. You can override using the `.groups` argument. ``` -```output +``` output # A tibble: 20 × 8 # Groups: name, species [12] name species island mean sd min max n @@ -235,7 +235,7 @@ but for all numerical variables. 
Grouped only by the variable names. ## Solution -```r +``` r penguins |> pivot_longer(where(is.numeric)) |> drop_na(value) |> @@ -249,7 +249,7 @@ penguins |> ) ``` -```output +``` output # A tibble: 5 × 6 name mean sd min max n @@ -270,7 +270,7 @@ Now that we have the summaries, we can use them in plots too! But keep typing or So let us save the summary in its own object, and keep using that. -```r +``` r penguins_sum <- penguins |> pivot_longer(contains("_")) |> drop_na(value) |> @@ -285,7 +285,7 @@ penguins_sum <- penguins |> ungroup() ``` -```output +``` output `summarise()` has grouped output by 'name', 'species'. You can override using the `.groups` argument. ``` @@ -293,7 +293,7 @@ the `.groups` argument. We can for instance make a bar chart with the values from the summary statistics. -```r +``` r penguins_sum |> ggplot(aes(x = island, y = mean, @@ -308,7 +308,7 @@ oh, but the points are stacking on top of each other and are hard to see. T -```r +``` r penguins_sum |> ggplot(aes(x = island, y = mean, @@ -336,7 +336,7 @@ Use facet_wrap() ## Solution -```r +``` r penguins_sum |> ggplot(aes(x = island, y = sd, @@ -360,7 +360,7 @@ Why is this plot misleading? ## Solution -```r +``` r penguins_sum |> ggplot(aes(x = species, y = sd, @@ -383,7 +383,7 @@ To do that, we add the `geom_errorbar()` function to the ggplot calls. `geom_err In our case, it would be the mean - sd, for minimum, and the mean + sd for the maximum. -```r +``` r penguins_sum |> ggplot(aes(x = island, y = mean, @@ -402,7 +402,7 @@ Right, so now we have error bars, but they dont connect to the dots! Perhaps we can dodge those too? -```r +``` r penguins_sum |> ggplot(aes(x = island, y = mean, @@ -427,7 +427,7 @@ Try adjusting them by setting the width argument to 0.3 ## Solution -```r +``` r penguins_sum |> ggplot(aes(x = island, y = mean, @@ -453,7 +453,7 @@ But we can get even more creative! 
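The dodged points and error bars described above can be sketched on a small summary table. The `toy_sum` values are made up purely for illustration, and the shared `dodge` object is one possible way (an assumption, not the lesson's code) to keep points and bars aligned.

```r
library(ggplot2)

# Hypothetical summary table with made-up means and standard deviations
toy_sum <- data.frame(
  island  = c("Biscoe", "Biscoe", "Dream", "Dream"),
  species = c("Adelie", "Gentoo", "Adelie", "Chinstrap"),
  mean    = c(39.0, 47.5, 38.5, 48.8),
  sd      = c(2.5, 3.1, 2.5, 3.3)
)

# Use one position object for both layers so they are shifted identically
dodge <- position_dodge(width = 0.3)

p <- ggplot(toy_sum, aes(x = island, y = mean, colour = species)) +
  geom_point(position = dodge) +
  geom_errorbar(aes(ymin = mean - sd, ymax = mean + sd),
                width = 0.2, position = dodge)

p
```

Giving both geoms the same dodge width is what makes each error bar line up with its own point.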
Lets recreate our summary table, and add year as a grouping, so we can get an idea of how the measurements change over time. -```r +``` r penguins_sum <- penguins |> pivot_longer(contains("_")) |> drop_na(value) |> @@ -468,16 +468,16 @@ penguins_sum <- penguins |> ungroup() ``` -```output +``` output `summarise()` has grouped output by 'name', 'species', 'island'. You can override using the `.groups` argument. ``` -```r +``` r penguins_sum ``` -```output +``` output # A tibble: 60 × 9 name species island year mean sd min max n @@ -497,7 +497,7 @@ penguins_sum And then let us re-create our last plot with this new summary table. -```r +``` r penguins_sum |> ggplot(aes(x = island, y = mean, @@ -529,7 +529,7 @@ Try adjusting them by setting the width argument to 0.3 ## Solution -```r +``` r penguins_sum |> ggplot(aes(x = island, y = mean, @@ -555,7 +555,7 @@ Lets switch that up. -```r +``` r penguins_sum |> ggplot(aes(x = year, y = mean, @@ -581,7 +581,7 @@ And you can think of it like `rows ~ columns`. So here we are saying we want the `island` values as rows, and `name` values as columns in the plot grid. -```r +``` r penguins_sum |> ggplot(aes(x = year, y = mean, @@ -608,7 +608,7 @@ Was this the effect you expected? ## Solution -```r +``` r penguins_sum |> ggplot(aes(x = year, y = mean, @@ -635,7 +635,7 @@ Try switching up what is plotted as rows and columns in the facet. Does this hel :::::::::::::::::::::::::::::::::::::::: solution ## Solution -```r +``` r penguins_sum |> ggplot(aes(x = year, y = mean, @@ -669,7 +669,7 @@ Saving a ggplot object is just like saving a dataset object. We have to assign it a name at the beginning. -```r +``` r penguins_plot <- penguins_sum |> ggplot(aes(x = year, y = mean, @@ -690,7 +690,7 @@ Just like when you assign a data set it wont show in the console, when you assig To re-initiate the plot in the plot pane, write its name in the console and press enter. 
-```r +``` r penguins_plot ``` @@ -700,7 +700,7 @@ From there, we can keep adding more ggplot geoms or facets etc. In this first version, we will add a "theme". A theme is a change of the overall "look" of the plot. -```r +``` r penguins_plot + theme_classic() ``` @@ -709,7 +709,7 @@ penguins_plot + The classic theme is preferred by many journals, but for facet grid, it's not super nice, since we lose grid information. -```r +``` r penguins_plot + theme_light() ``` @@ -718,7 +718,7 @@ penguins_plot + Theme light could be a nice option, but the white text on light grey makes the panel text hard to read. -```r +``` r penguins_plot + theme_dark() ``` @@ -745,7 +745,7 @@ What themes did you find that you liked? We are going to have a go at `theme_linedraw` which has a simple but clear design. -```r +``` r penguins_plot + theme_linedraw() ``` @@ -756,7 +756,7 @@ Now that we have a theme, we can have a look at changing the colours of the poin We do this through something called "scales". -```r +``` r penguins_plot + theme_linedraw() + scale_colour_brewer(palette = "Dark2") @@ -770,7 +770,7 @@ The brewer palettes are a curated library of colour palettes to choose from in g You can have a peek at all possible brewer palettes by typing -```r +``` r RColorBrewer::display.brewer.all() ``` @@ -784,7 +784,7 @@ Try another brewer palette by replacing the palette name with another in the bre ## Solution -```r +``` r penguins_plot + theme_linedraw() + scale_colour_brewer(palette = "Accent") @@ -801,7 +801,7 @@ Apply the dark theme instead, and a pastel colour palette.
:::::::::::::::::::::::::::::::::::::::: solution ## Solution -```r +``` r penguins_plot + theme_dark() + scale_colour_brewer(palette = "Pastel2") diff --git a/10-data-manipulation-across.md b/10-data-manipulation-across.md index ae48ca6d..427f2caa 100644 --- a/10-data-manipulation-across.md +++ b/10-data-manipulation-across.md @@ -31,7 +31,7 @@ Let us also make a subsetted sample with just the bill measurements so we cab ea We can do that in the following way. -```r +``` r penguins_s <- penguins |> select(species, starts_with("bill")) @@ -41,7 +41,7 @@ penguins_s |> ) ``` -```output +``` output # A tibble: 344 × 4 species bill_length_mm bill_depth_mm bill_sum @@ -65,12 +65,12 @@ We have a special type of operations we can do to get that easily. We will use the function `sum` to calculate the sum of several variables when using this pipeline. -```r +``` r penguins_s |> mutate(bill_sum = sum(c_across(starts_with("bill")))) ``` -```output +``` output # A tibble: 344 × 4 species bill_length_mm bill_depth_mm bill_sum @@ -96,13 +96,13 @@ We can apply a function called `rowwise()` which is a special type of `group_by` Then, `c_across()` will calculate the mean of the columns just for that group (i.e. row in this case). -```r +``` r penguins_s |> rowwise() |> mutate(bill_sum = sum(c_across(starts_with("bill")))) ``` -```output +``` output # A tibble: 344 × 4 # Rowwise: species bill_length_mm bill_depth_mm bill_sum @@ -124,14 +124,14 @@ Now we can see that we get the row sum of all the bill columns for each row, and To stop the data set being rowwise, we can use the `ungroup()` function we learned before. 
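The difference between the plain and the rowwise calculation can be sketched on a two-row made-up table. The `toy` tibble and its values are hypothetical.

```r
library(dplyr)

# Hypothetical toy data
toy <- tibble(
  bill_length_mm = c(39.1, 46.5),
  bill_depth_mm  = c(18.7, 15.0)
)

# Without rowwise(): c_across() combines whole columns, so every row
# receives the same grand total
toy_total <- toy |>
  mutate(bill_sum = sum(c_across(starts_with("bill"))))

# With rowwise(): each row is its own group, so the sum is per row;
# ungroup() returns an ordinary tibble afterwards
toy_rows <- toy |>
  rowwise() |>
  mutate(bill_sum = sum(c_across(starts_with("bill")))) |>
  ungroup()

toy_total
toy_rows
```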
-```r +``` r penguins_s |> rowwise() |> mutate(bill_sum = sum(c_across(starts_with("bill")))) |> ungroup() ``` -```output +``` output # A tibble: 344 × 4 species bill_length_mm bill_depth_mm bill_sum @@ -156,7 +156,7 @@ Calculate the mean of all the columns with millimeter measurements, an call it ` ## Solution -```r +``` r penguins |> rowwise() |> mutate( @@ -164,7 +164,7 @@ penguins |> ) ``` -```output +``` output # A tibble: 344 × 9 # Rowwise: species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g @@ -196,7 +196,7 @@ Ignore `NA`s in the last calculation ## Solution -```r +``` r penguins |> rowwise() |> mutate( @@ -206,7 +206,7 @@ penguins |> mutate(mm_mean_species = mean(mm_mean, na.rm = TRUE)) ``` -```output +``` output # A tibble: 344 × 10 # Groups: species [3] species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g @@ -243,7 +243,7 @@ We've already used the `scale()` function once before, so we will do it again. In this simple example we might have done so: -```r +``` r penguins |> mutate( bill_depth_sc = scale(bill_depth_mm), @@ -252,7 +252,7 @@ penguins |> ) ``` -```output +``` output # A tibble: 344 × 11 species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g @@ -278,13 +278,13 @@ We are only human, we easily make mistakes. With {dplyr}'s `across()` we can combine our knowledge of tidy-selectors and mutate to create the entire transformation for these columns at once. -```r +``` r penguins |> mutate(across(.cols = ends_with("mm"), .fns = scale)) ``` -```output +``` output # A tibble: 344 × 8 species island bill_length_mm[,1] bill_depth_mm[,1] flipper_length_mm[,1] @@ -310,7 +310,7 @@ But oh no! The columns have been overwritten. Rather than creating new ones, we This might be your intention in some instances, or maybe you will just create a new data set with the scaled variables. 
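The `[,1]` in the column headers above appears because `scale()` returns a one-column matrix rather than a plain vector. If plain numeric columns are preferred, one possible approach (a sketch, not the lesson's own code) is to wrap `scale()` in a small anonymous function. The `toy` tibble is hypothetical.

```r
library(dplyr)

# Hypothetical toy data
toy <- tibble(
  bill_length_mm = c(39.1, 46.5, 50.0),
  bill_depth_mm  = c(18.7, 15.0, 19.5)
)

# scale() on its own gives back a matrix
is.matrix(scale(toy$bill_length_mm))

# Converting the result with as.numeric() keeps the columns as vectors
toy_sc <- toy |>
  mutate(across(.cols = ends_with("mm"),
                .fns = \(x) as.numeric(scale(x))))

toy_sc
```

The scaled values are the same either way; only the column type differs.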
-```r +``` r penguins_mm_sc <- penguins |> mutate(across(.cols = ends_with("mm"), .fns = scale)) @@ -319,7 +319,7 @@ penguins_mm_sc <- penguins |> but often, we'd like to keep the original but add the new variants. We can do that to within the across! -```r +``` r penguins |> mutate(across(.cols = ends_with("mm"), .fns = scale, @@ -327,7 +327,7 @@ penguins |> select(contains("mm")) ``` -```output +``` output # A tibble: 344 × 6 bill_length_mm bill_depth_mm flipper_length_mm bill_length_mm_sc[,1] @@ -360,14 +360,14 @@ Transform all the colmns with an underscore in their name so they are scaled, an ## Solution -```r +``` r penguins |> mutate(across(.cols = contains("_"), .fns = scale, .names = "sc_{.col}")) ``` -```output +``` output # A tibble: 344 × 12 species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g @@ -403,7 +403,7 @@ You can add a standard mutate within the same mutate as across ## Solution -```r +``` r penguins |> mutate( across(.cols = contains("_"), @@ -413,7 +413,7 @@ penguins |> ) ``` -```output +``` output # A tibble: 344 × 13 species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g diff --git a/config.yaml b/config.yaml deleted file mode 100644 index 1d293893..00000000 --- a/config.yaml +++ /dev/null @@ -1,84 +0,0 @@ -#------------------------------------------------------------ -# Values for this lesson. -#------------------------------------------------------------ - -# Which carpentry is this (swc, dc, lc, or cp)? -# swc: Software Carpentry -# dc: Data Carpentry -# lc: Library Carpentry -# cp: Carpentries (to use for instructor traning for instance) -carpentry: incubator - -# Overall title for pages. 
-title: R and the Tidyverse for working with data - -# Date the lesson was created (this is empty by default) -created: ~ - -# Comma-separated list of keywords for the lesson -keywords: software, data, lesson, The Carpentries, R, tidyverse - -# Life cycle stage of the lesson -# possible values: pre-alpha, alpha, beta, stable -life_cycle: alpha - -# License of the lesson -license: CC-BY 4.0 - -# Link to the source repository for this lesson -source: https://github.com/athanasiamo/r-tidyverse-4-datasets - -# Default branch of your lesson -branch: main - -# Who to contact if there are any issues -contact: team@carpentries.org - -# Navigation ------------------------------------------------ -# -# Use the following menu items to specify the order of -# individual pages in each dropdown section. Leave blank to -# include all pages in the folder. -# -# Example ------------- -# -# episodes: -# - introduction.md -# - first-steps.md -# -# learners: -# - setup.md -# -# instructors: -# - instructor-notes.md -# -# profiles: -# - one-learner.md -# - another-learner.md - -# Order of episodes in your lesson -episodes: - - 01-project-introduction.Rmd - - 02-data-visualisation.Rmd - - 03-data-subsetting.Rmd - - 04-data-sorting-pipes.Rmd - - 05-data-plotting-scales.Rmd - - 06-data-manipulation.Rmd - - 07-data-reshaping.Rmd - - 08-data-summaries.Rmd - - 09-data-complex-pipelines.Rmd - - 10-data-manipulation-across.Rmd - -# Information for Learners -learners: -- setup.md - -# Information for Instructors -instructors: -- instructor-notes.md - -# Learner Profiles -profiles: -- learner-profiles.md - - diff --git a/fig/02-data-visualisation-rendered-unnamed-chunk-10-1.png b/fig/02-data-visualisation-rendered-unnamed-chunk-10-1.png index 34ec75b0..10f8cec2 100644 Binary files a/fig/02-data-visualisation-rendered-unnamed-chunk-10-1.png and b/fig/02-data-visualisation-rendered-unnamed-chunk-10-1.png differ diff --git a/fig/02-data-visualisation-rendered-unnamed-chunk-15-1.png 
b/fig/02-data-visualisation-rendered-unnamed-chunk-15-1.png index 31e8aad7..a5aa0540 100644 Binary files a/fig/02-data-visualisation-rendered-unnamed-chunk-15-1.png and b/fig/02-data-visualisation-rendered-unnamed-chunk-15-1.png differ diff --git a/fig/02-data-visualisation-rendered-unnamed-chunk-16-1.png b/fig/02-data-visualisation-rendered-unnamed-chunk-16-1.png index 40dea10d..e641211d 100644 Binary files a/fig/02-data-visualisation-rendered-unnamed-chunk-16-1.png and b/fig/02-data-visualisation-rendered-unnamed-chunk-16-1.png differ diff --git a/fig/02-data-visualisation-rendered-unnamed-chunk-6-1.png b/fig/02-data-visualisation-rendered-unnamed-chunk-6-1.png index 555d8e02..c15e869a 100644 Binary files a/fig/02-data-visualisation-rendered-unnamed-chunk-6-1.png and b/fig/02-data-visualisation-rendered-unnamed-chunk-6-1.png differ diff --git a/fig/02-data-visualisation-rendered-unnamed-chunk-7-1.png b/fig/02-data-visualisation-rendered-unnamed-chunk-7-1.png index 590c1743..34c59f92 100644 Binary files a/fig/02-data-visualisation-rendered-unnamed-chunk-7-1.png and b/fig/02-data-visualisation-rendered-unnamed-chunk-7-1.png differ diff --git a/fig/02-data-visualisation-rendered-unnamed-chunk-8-1.png b/fig/02-data-visualisation-rendered-unnamed-chunk-8-1.png index 3a6acc03..d07d76f9 100644 Binary files a/fig/02-data-visualisation-rendered-unnamed-chunk-8-1.png and b/fig/02-data-visualisation-rendered-unnamed-chunk-8-1.png differ diff --git a/fig/02-data-visualisation-rendered-unnamed-chunk-9-1.png b/fig/02-data-visualisation-rendered-unnamed-chunk-9-1.png index a4848272..8fb3fb5f 100644 Binary files a/fig/02-data-visualisation-rendered-unnamed-chunk-9-1.png and b/fig/02-data-visualisation-rendered-unnamed-chunk-9-1.png differ diff --git a/fig/06-data-manipulation-rendered-unnamed-chunk-12-1.png b/fig/06-data-manipulation-rendered-unnamed-chunk-12-1.png index 98dea77b..55378543 100644 Binary files a/fig/06-data-manipulation-rendered-unnamed-chunk-12-1.png and 
b/fig/06-data-manipulation-rendered-unnamed-chunk-12-1.png differ diff --git a/md5sum.txt b/md5sum.txt index 9f2196cb..06918fff 100644 --- a/md5sum.txt +++ b/md5sum.txt @@ -1,20 +1,20 @@ "file" "checksum" "built" "date" -"CODE_OF_CONDUCT.md" "8d9e44dd5c39f241b5e8b47ecfc802d1" "site/built/CODE_OF_CONDUCT.md" "2024-05-21" -"LICENSE.md" "afaf427b4223952624dcb6d8ded53ec0" "site/built/LICENSE.md" "2024-05-21" -"config.yaml" "8a792282bff9e898778b401920b6f9a8" "site/built/config.yaml" "2024-05-21" -"index.md" "a02c9c785ed98ddd84fe3d34ddb12fcd" "site/built/index.md" "2024-05-21" -"links.md" "8184cf4149eafbf03ce8da8ff0778c14" "site/built/links.md" "2024-05-21" -"episodes/01-project-introduction.Rmd" "85a6bc9fb924dcaaad246c479be6ed42" "site/built/01-project-introduction.md" "2024-05-21" -"episodes/02-data-visualisation.Rmd" "f01af1c13c02f0248a05ea6893aa0ba8" "site/built/02-data-visualisation.md" "2024-05-21" -"episodes/03-data-subsetting.Rmd" "9380373f6f554a0f109a2e399c34e137" "site/built/03-data-subsetting.md" "2024-05-21" -"episodes/04-data-sorting-pipes.Rmd" "0642d7c251442344e27f7c057e2b714a" "site/built/04-data-sorting-pipes.md" "2024-05-21" -"episodes/05-data-plotting-scales.Rmd" "819480fb05a22b435f22d2b025f41cfa" "site/built/05-data-plotting-scales.md" "2024-05-21" -"episodes/06-data-manipulation.Rmd" "5fad3f4fbcaa6c605f2eb1928c8a55f7" "site/built/06-data-manipulation.md" "2024-05-21" -"episodes/07-data-reshaping.Rmd" "83a42d7216c1c0e502be56f57bc39abe" "site/built/07-data-reshaping.md" "2024-05-21" -"episodes/08-data-summaries.Rmd" "a37d3fecd97bb93441a8ac64c1d7d2aa" "site/built/08-data-summaries.md" "2024-05-21" -"episodes/09-data-complex-pipelines.Rmd" "b9c48d1e0c0000a46c8c1fd5e6c45a29" "site/built/09-data-complex-pipelines.md" "2024-05-21" -"episodes/10-data-manipulation-across.Rmd" "90d2a60c36c2d30825e5e0d6ad63c0d9" "site/built/10-data-manipulation-across.md" "2024-05-21" -"instructors/instructor-notes.md" "60b93493cf1da06dfd63255d73854461" 
"site/built/instructor-notes.md" "2024-05-21" -"learners/setup.md" "969ce71ddf0e8ed639bd94df7feb8858" "site/built/setup.md" "2024-05-21" -"profiles/learner-profiles.md" "60b93493cf1da06dfd63255d73854461" "site/built/learner-profiles.md" "2024-05-21" -"renv/profiles/lesson-requirements/renv.lock" "8fad97da09b4211feec2d78f61f9c88c" "site/built/renv.lock" "2024-05-21" +"CODE_OF_CONDUCT.md" "8d9e44dd5c39f241b5e8b47ecfc802d1" "site/built/CODE_OF_CONDUCT.md" "2024-06-04" +"LICENSE.md" "afaf427b4223952624dcb6d8ded53ec0" "site/built/LICENSE.md" "2024-06-04" +"config.yaml" "8a792282bff9e898778b401920b6f9a8" "site/built/config.yaml" "2024-06-04" +"index.md" "a02c9c785ed98ddd84fe3d34ddb12fcd" "site/built/index.md" "2024-06-04" +"links.md" "8184cf4149eafbf03ce8da8ff0778c14" "site/built/links.md" "2024-06-04" +"episodes/01-project-introduction.Rmd" "85a6bc9fb924dcaaad246c479be6ed42" "site/built/01-project-introduction.md" "2024-06-04" +"episodes/02-data-visualisation.Rmd" "f01af1c13c02f0248a05ea6893aa0ba8" "site/built/02-data-visualisation.md" "2024-06-04" +"episodes/03-data-subsetting.Rmd" "9380373f6f554a0f109a2e399c34e137" "site/built/03-data-subsetting.md" "2024-06-04" +"episodes/04-data-sorting-pipes.Rmd" "0642d7c251442344e27f7c057e2b714a" "site/built/04-data-sorting-pipes.md" "2024-06-04" +"episodes/05-data-plotting-scales.Rmd" "819480fb05a22b435f22d2b025f41cfa" "site/built/05-data-plotting-scales.md" "2024-06-04" +"episodes/06-data-manipulation.Rmd" "5fad3f4fbcaa6c605f2eb1928c8a55f7" "site/built/06-data-manipulation.md" "2024-06-04" +"episodes/07-data-reshaping.Rmd" "83a42d7216c1c0e502be56f57bc39abe" "site/built/07-data-reshaping.md" "2024-06-04" +"episodes/08-data-summaries.Rmd" "a37d3fecd97bb93441a8ac64c1d7d2aa" "site/built/08-data-summaries.md" "2024-06-04" +"episodes/09-data-complex-pipelines.Rmd" "b9c48d1e0c0000a46c8c1fd5e6c45a29" "site/built/09-data-complex-pipelines.md" "2024-06-04" +"episodes/10-data-manipulation-across.Rmd" "90d2a60c36c2d30825e5e0d6ad63c0d9" 
"site/built/10-data-manipulation-across.md" "2024-06-04" +"instructors/instructor-notes.md" "60b93493cf1da06dfd63255d73854461" "site/built/instructor-notes.md" "2024-06-04" +"learners/setup.md" "969ce71ddf0e8ed639bd94df7feb8858" "site/built/setup.md" "2024-06-04" +"profiles/learner-profiles.md" "60b93493cf1da06dfd63255d73854461" "site/built/learner-profiles.md" "2024-06-04" +"renv/profiles/lesson-requirements/renv.lock" "63aabdc32d22fc2ddd856fc7a8f56771" "site/built/renv.lock" "2024-06-04" diff --git a/renv.lock b/renv.lock deleted file mode 100644 index ac994119..00000000 --- a/renv.lock +++ /dev/null @@ -1,1532 +0,0 @@ -{ - "R": { - "Version": "4.4.0", - "Repositories": [ - { - "Name": "carpentries", - "URL": "https://carpentries.r-universe.dev" - }, - { - "Name": "carpentries_archive", - "URL": "https://carpentries.github.io/drat" - }, - { - "Name": "CRAN", - "URL": "https://cran.rstudio.com" - } - ] - }, - "Packages": { - "DBI": { - "Package": "DBI", - "Version": "1.2.2", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "methods" - ], - "Hash": "164809cd72e1d5160b4cb3aa57f510fe" - }, - "MASS": { - "Package": "MASS", - "Version": "7.3-60.0.1", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "grDevices", - "graphics", - "methods", - "stats", - "utils" - ], - "Hash": "b765b28387acc8ec9e9c1530713cb19c" - }, - "Matrix": { - "Package": "Matrix", - "Version": "1.6-5", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "grDevices", - "graphics", - "grid", - "lattice", - "methods", - "stats", - "utils" - ], - "Hash": "8c7115cd3a0e048bda2a7cd110549f7a" - }, - "R6": { - "Package": "R6", - "Version": "2.5.1", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R" - ], - "Hash": "470851b6d5d0ac559e9d01bb352b4021" - }, - "RColorBrewer": { - "Package": "RColorBrewer", - "Version": "1.1-3", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R" - 
], - "Hash": "45f0398006e83a5b10b72a90663d8d8c" - }, - "askpass": { - "Package": "askpass", - "Version": "1.2.0", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "sys" - ], - "Hash": "cad6cf7f1d5f6e906700b9d3e718c796" - }, - "backports": { - "Package": "backports", - "Version": "1.4.1", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R" - ], - "Hash": "c39fbec8a30d23e721980b8afb31984c" - }, - "base64enc": { - "Package": "base64enc", - "Version": "0.1-3", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R" - ], - "Hash": "543776ae6848fde2f48ff3816d0628bc" - }, - "bit": { - "Package": "bit", - "Version": "4.0.5", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R" - ], - "Hash": "d242abec29412ce988848d0294b208fd" - }, - "bit64": { - "Package": "bit64", - "Version": "4.0.5", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "bit", - "methods", - "stats", - "utils" - ], - "Hash": "9fe98599ca456d6552421db0d6772d8f" - }, - "blob": { - "Package": "blob", - "Version": "1.2.4", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "methods", - "rlang", - "vctrs" - ], - "Hash": "40415719b5a479b87949f3aa0aee737c" - }, - "broom": { - "Package": "broom", - "Version": "1.0.5", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "backports", - "dplyr", - "ellipsis", - "generics", - "glue", - "lifecycle", - "purrr", - "rlang", - "stringr", - "tibble", - "tidyr" - ], - "Hash": "fd25391c3c4f6ecf0fa95f1e6d15378c" - }, - "bslib": { - "Package": "bslib", - "Version": "0.7.0", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "base64enc", - "cachem", - "fastmap", - "grDevices", - "htmltools", - "jquerylib", - "jsonlite", - "lifecycle", - "memoise", - "mime", - "rlang", - "sass" - ], - "Hash": "8644cc53f43828f19133548195d7e59e" - }, - "cachem": { - "Package": "cachem", - "Version": "1.0.8", 
- "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "fastmap", - "rlang" - ], - "Hash": "c35768291560ce302c0a6589f92e837d" - }, - "callr": { - "Package": "callr", - "Version": "3.7.6", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "R6", - "processx", - "utils" - ], - "Hash": "d7e13f49c19103ece9e58ad2d83a7354" - }, - "cellranger": { - "Package": "cellranger", - "Version": "1.1.0", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "rematch", - "tibble" - ], - "Hash": "f61dbaec772ccd2e17705c1e872e9e7c" - }, - "cli": { - "Package": "cli", - "Version": "3.6.2", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "utils" - ], - "Hash": "1216ac65ac55ec0058a6f75d7ca0fd52" - }, - "clipr": { - "Package": "clipr", - "Version": "0.8.0", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "utils" - ], - "Hash": "3f038e5ac7f41d4ac41ce658c85e3042" - }, - "colorspace": { - "Package": "colorspace", - "Version": "2.1-0", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "grDevices", - "graphics", - "methods", - "stats" - ], - "Hash": "f20c47fd52fae58b4e377c37bb8c335b" - }, - "conflicted": { - "Package": "conflicted", - "Version": "1.2.0", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "cli", - "memoise", - "rlang" - ], - "Hash": "bb097fccb22d156624fd07cd2894ddb6" - }, - "cpp11": { - "Package": "cpp11", - "Version": "0.4.7", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R" - ], - "Hash": "5a295d7d963cc5035284dcdbaf334f4e" - }, - "crayon": { - "Package": "crayon", - "Version": "1.5.2", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "grDevices", - "methods", - "utils" - ], - "Hash": "e8a1e41acf02548751f45c718d55aa6a" - }, - "curl": { - "Package": "curl", - "Version": "5.2.1", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - 
"R" - ], - "Hash": "411ca2c03b1ce5f548345d2fc2685f7a" - }, - "data.table": { - "Package": "data.table", - "Version": "1.15.4", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "methods" - ], - "Hash": "8ee9ac56ef633d0c7cab8b2ca87d683e" - }, - "dbplyr": { - "Package": "dbplyr", - "Version": "2.5.0", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "DBI", - "R", - "R6", - "blob", - "cli", - "dplyr", - "glue", - "lifecycle", - "magrittr", - "methods", - "pillar", - "purrr", - "rlang", - "tibble", - "tidyr", - "tidyselect", - "utils", - "vctrs", - "withr" - ], - "Hash": "39b2e002522bfd258039ee4e889e0fd1" - }, - "digest": { - "Package": "digest", - "Version": "0.6.35", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "utils" - ], - "Hash": "698ece7ba5a4fa4559e3d537e7ec3d31" - }, - "dplyr": { - "Package": "dplyr", - "Version": "1.1.4", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "R6", - "cli", - "generics", - "glue", - "lifecycle", - "magrittr", - "methods", - "pillar", - "rlang", - "tibble", - "tidyselect", - "utils", - "vctrs" - ], - "Hash": "fedd9d00c2944ff00a0e2696ccf048ec" - }, - "dtplyr": { - "Package": "dtplyr", - "Version": "1.3.1", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "cli", - "data.table", - "dplyr", - "glue", - "lifecycle", - "rlang", - "tibble", - "tidyselect", - "vctrs" - ], - "Hash": "54ed3ea01b11e81a86544faaecfef8e2" - }, - "ellipsis": { - "Package": "ellipsis", - "Version": "0.3.2", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "rlang" - ], - "Hash": "bb0eec2fe32e88d9e2836c2f73ea2077" - }, - "evaluate": { - "Package": "evaluate", - "Version": "0.23", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "methods" - ], - "Hash": "daf4a1246be12c1fa8c7705a0935c1a0" - }, - "fansi": { - "Package": "fansi", - "Version": "1.0.6", - "Source": 
"Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "grDevices", - "utils" - ], - "Hash": "962174cf2aeb5b9eea581522286a911f" - }, - "farver": { - "Package": "farver", - "Version": "2.1.1", - "Source": "Repository", - "Repository": "CRAN", - "Hash": "8106d78941f34855c440ddb946b8f7a5" - }, - "fastmap": { - "Package": "fastmap", - "Version": "1.1.1", - "Source": "Repository", - "Repository": "CRAN", - "Hash": "f7736a18de97dea803bde0a2daaafb27" - }, - "fontawesome": { - "Package": "fontawesome", - "Version": "0.5.2", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "htmltools", - "rlang" - ], - "Hash": "c2efdd5f0bcd1ea861c2d4e2a883a67d" - }, - "forcats": { - "Package": "forcats", - "Version": "1.0.0", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "cli", - "glue", - "lifecycle", - "magrittr", - "rlang", - "tibble" - ], - "Hash": "1a0a9a3d5083d0d573c4214576f1e690" - }, - "fs": { - "Package": "fs", - "Version": "1.6.4", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "methods" - ], - "Hash": "15aeb8c27f5ea5161f9f6a641fafd93a" - }, - "gargle": { - "Package": "gargle", - "Version": "1.5.2", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "cli", - "fs", - "glue", - "httr", - "jsonlite", - "lifecycle", - "openssl", - "rappdirs", - "rlang", - "stats", - "utils", - "withr" - ], - "Hash": "fc0b272e5847c58cd5da9b20eedbd026" - }, - "generics": { - "Package": "generics", - "Version": "0.1.3", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "methods" - ], - "Hash": "15e9634c0fcd294799e9b2e929ed1b86" - }, - "ggplot2": { - "Package": "ggplot2", - "Version": "3.5.1", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "MASS", - "R", - "cli", - "glue", - "grDevices", - "grid", - "gtable", - "isoband", - "lifecycle", - "mgcv", - "rlang", - "scales", - "stats", - "tibble", - "vctrs", - "withr" - ], - 
"Hash": "44c6a2f8202d5b7e878ea274b1092426" - }, - "glue": { - "Package": "glue", - "Version": "1.7.0", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "methods" - ], - "Hash": "e0b3a53876554bd45879e596cdb10a52" - }, - "googledrive": { - "Package": "googledrive", - "Version": "2.1.1", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "cli", - "gargle", - "glue", - "httr", - "jsonlite", - "lifecycle", - "magrittr", - "pillar", - "purrr", - "rlang", - "tibble", - "utils", - "uuid", - "vctrs", - "withr" - ], - "Hash": "e99641edef03e2a5e87f0a0b1fcc97f4" - }, - "googlesheets4": { - "Package": "googlesheets4", - "Version": "1.1.1", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "cellranger", - "cli", - "curl", - "gargle", - "glue", - "googledrive", - "httr", - "ids", - "lifecycle", - "magrittr", - "methods", - "purrr", - "rematch2", - "rlang", - "tibble", - "utils", - "vctrs", - "withr" - ], - "Hash": "d6db1667059d027da730decdc214b959" - }, - "gtable": { - "Package": "gtable", - "Version": "0.3.5", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "cli", - "glue", - "grid", - "lifecycle", - "rlang" - ], - "Hash": "e18861963cbc65a27736e02b3cd3c4a0" - }, - "haven": { - "Package": "haven", - "Version": "2.5.4", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "cli", - "cpp11", - "forcats", - "hms", - "lifecycle", - "methods", - "readr", - "rlang", - "tibble", - "tidyselect", - "vctrs" - ], - "Hash": "9171f898db9d9c4c1b2c745adc2c1ef1" - }, - "highr": { - "Package": "highr", - "Version": "0.10", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "xfun" - ], - "Hash": "06230136b2d2b9ba5805e1963fa6e890" - }, - "hms": { - "Package": "hms", - "Version": "1.1.3", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "lifecycle", - "methods", - "pkgconfig", - "rlang", - "vctrs" - ], - 
"Hash": "b59377caa7ed00fa41808342002138f9" - }, - "htmltools": { - "Package": "htmltools", - "Version": "0.5.8.1", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "base64enc", - "digest", - "fastmap", - "grDevices", - "rlang", - "utils" - ], - "Hash": "81d371a9cc60640e74e4ab6ac46dcedc" - }, - "httr": { - "Package": "httr", - "Version": "1.4.7", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "R6", - "curl", - "jsonlite", - "mime", - "openssl" - ], - "Hash": "ac107251d9d9fd72f0ca8049988f1d7f" - }, - "ids": { - "Package": "ids", - "Version": "1.0.1", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "openssl", - "uuid" - ], - "Hash": "99df65cfef20e525ed38c3d2577f7190" - }, - "isoband": { - "Package": "isoband", - "Version": "0.2.7", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "grid", - "utils" - ], - "Hash": "0080607b4a1a7b28979aecef976d8bc2" - }, - "jquerylib": { - "Package": "jquerylib", - "Version": "0.1.4", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "htmltools" - ], - "Hash": "5aab57a3bd297eee1c1d862735972182" - }, - "jsonlite": { - "Package": "jsonlite", - "Version": "1.8.8", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "methods" - ], - "Hash": "e1b9c55281c5adc4dd113652d9e26768" - }, - "knitr": { - "Package": "knitr", - "Version": "1.46", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "evaluate", - "highr", - "methods", - "tools", - "xfun", - "yaml" - ], - "Hash": "6e008ab1d696a5283c79765fa7b56b47" - }, - "labeling": { - "Package": "labeling", - "Version": "0.4.3", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "graphics", - "stats" - ], - "Hash": "b64ec208ac5bc1852b285f665d6368b3" - }, - "lattice": { - "Package": "lattice", - "Version": "0.22-6", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "grDevices", - 
"graphics", - "grid", - "stats", - "utils" - ], - "Hash": "cc5ac1ba4c238c7ca9fa6a87ca11a7e2" - }, - "lifecycle": { - "Package": "lifecycle", - "Version": "1.0.4", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "cli", - "glue", - "rlang" - ], - "Hash": "b8552d117e1b808b09a832f589b79035" - }, - "lubridate": { - "Package": "lubridate", - "Version": "1.9.3", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "generics", - "methods", - "timechange" - ], - "Hash": "680ad542fbcf801442c83a6ac5a2126c" - }, - "magrittr": { - "Package": "magrittr", - "Version": "2.0.3", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R" - ], - "Hash": "7ce2733a9826b3aeb1775d56fd305472" - }, - "memoise": { - "Package": "memoise", - "Version": "2.0.1", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "cachem", - "rlang" - ], - "Hash": "e2817ccf4a065c5d9d7f2cfbe7c1d78c" - }, - "mgcv": { - "Package": "mgcv", - "Version": "1.9-1", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "Matrix", - "R", - "graphics", - "methods", - "nlme", - "splines", - "stats", - "utils" - ], - "Hash": "110ee9d83b496279960e162ac97764ce" - }, - "mime": { - "Package": "mime", - "Version": "0.12", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "tools" - ], - "Hash": "18e9c28c1d3ca1560ce30658b22ce104" - }, - "modelr": { - "Package": "modelr", - "Version": "0.1.11", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "broom", - "magrittr", - "purrr", - "rlang", - "tibble", - "tidyr", - "tidyselect", - "vctrs" - ], - "Hash": "4f50122dc256b1b6996a4703fecea821" - }, - "munsell": { - "Package": "munsell", - "Version": "0.5.1", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "colorspace", - "methods" - ], - "Hash": "4fd8900853b746af55b81fda99da7695" - }, - "nlme": { - "Package": "nlme", - "Version": "3.1-164", - "Source": 
"Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "graphics", - "lattice", - "stats", - "utils" - ], - "Hash": "a623a2239e642806158bc4dc3f51565d" - }, - "openssl": { - "Package": "openssl", - "Version": "2.1.2", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "askpass" - ], - "Hash": "ea2475b073243d9d338aa8f086ce973e" - }, - "palmerpenguins": { - "Package": "palmerpenguins", - "Version": "0.1.1", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R" - ], - "Hash": "6c6861efbc13c1d543749e9c7be4a592" - }, - "pillar": { - "Package": "pillar", - "Version": "1.9.0", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "cli", - "fansi", - "glue", - "lifecycle", - "rlang", - "utf8", - "utils", - "vctrs" - ], - "Hash": "15da5a8412f317beeee6175fbc76f4bb" - }, - "pkgconfig": { - "Package": "pkgconfig", - "Version": "2.0.3", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "utils" - ], - "Hash": "01f28d4278f15c76cddbea05899c5d6f" - }, - "prettyunits": { - "Package": "prettyunits", - "Version": "1.2.0", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R" - ], - "Hash": "6b01fc98b1e86c4f705ce9dcfd2f57c7" - }, - "processx": { - "Package": "processx", - "Version": "3.8.4", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "R6", - "ps", - "utils" - ], - "Hash": "0c90a7d71988856bad2a2a45dd871bb9" - }, - "progress": { - "Package": "progress", - "Version": "1.2.3", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "R6", - "crayon", - "hms", - "prettyunits" - ], - "Hash": "f4625e061cb2865f111b47ff163a5ca6" - }, - "ps": { - "Package": "ps", - "Version": "1.7.6", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "utils" - ], - "Hash": "dd2b9319ee0656c8acf45c7f40c59de7" - }, - "purrr": { - "Package": "purrr", - "Version": "1.0.2", - "Source": "Repository", - "Repository": 
"CRAN", - "Requirements": [ - "R", - "cli", - "lifecycle", - "magrittr", - "rlang", - "vctrs" - ], - "Hash": "1cba04a4e9414bdefc9dcaa99649a8dc" - }, - "ragg": { - "Package": "ragg", - "Version": "1.3.1", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "systemfonts", - "textshaping" - ], - "Hash": "539dcb63ce611fe389b4c8e2b546cf24" - }, - "rappdirs": { - "Package": "rappdirs", - "Version": "0.3.3", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R" - ], - "Hash": "5e3c5dc0b071b21fa128676560dbe94d" - }, - "readr": { - "Package": "readr", - "Version": "2.1.5", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "R6", - "cli", - "clipr", - "cpp11", - "crayon", - "hms", - "lifecycle", - "methods", - "rlang", - "tibble", - "tzdb", - "utils", - "vroom" - ], - "Hash": "9de96463d2117f6ac49980577939dfb3" - }, - "readxl": { - "Package": "readxl", - "Version": "1.4.3", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "cellranger", - "cpp11", - "progress", - "tibble", - "utils" - ], - "Hash": "8cf9c239b96df1bbb133b74aef77ad0a" - }, - "rematch": { - "Package": "rematch", - "Version": "2.0.0", - "Source": "Repository", - "Repository": "CRAN", - "Hash": "cbff1b666c6fa6d21202f07e2318d4f1" - }, - "rematch2": { - "Package": "rematch2", - "Version": "2.1.2", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "tibble" - ], - "Hash": "76c9e04c712a05848ae7a23d2f170a40" - }, - "renv": { - "Package": "renv", - "Version": "1.0.7", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "utils" - ], - "Hash": "397b7b2a265bc5a7a06852524dabae20" - }, - "reprex": { - "Package": "reprex", - "Version": "2.1.0", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "callr", - "cli", - "clipr", - "fs", - "glue", - "knitr", - "lifecycle", - "rlang", - "rmarkdown", - "rstudioapi", - "utils", - "withr" - ], - "Hash": 
"1425f91b4d5d9a8f25352c44a3d914ed" - }, - "rlang": { - "Package": "rlang", - "Version": "1.1.3", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "utils" - ], - "Hash": "42548638fae05fd9a9b5f3f437fbbbe2" - }, - "rmarkdown": { - "Package": "rmarkdown", - "Version": "2.26", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "bslib", - "evaluate", - "fontawesome", - "htmltools", - "jquerylib", - "jsonlite", - "knitr", - "methods", - "tinytex", - "tools", - "utils", - "xfun", - "yaml" - ], - "Hash": "9b148e7f95d33aac01f31282d49e4f44" - }, - "rstudioapi": { - "Package": "rstudioapi", - "Version": "0.16.0", - "Source": "Repository", - "Repository": "CRAN", - "Hash": "96710351d642b70e8f02ddeb237c46a7" - }, - "rvest": { - "Package": "rvest", - "Version": "1.0.4", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "cli", - "glue", - "httr", - "lifecycle", - "magrittr", - "rlang", - "selectr", - "tibble", - "xml2" - ], - "Hash": "0bcf0c6f274e90ea314b812a6d19a519" - }, - "sass": { - "Package": "sass", - "Version": "0.4.9", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R6", - "fs", - "htmltools", - "rappdirs", - "rlang" - ], - "Hash": "d53dbfddf695303ea4ad66f86e99b95d" - }, - "scales": { - "Package": "scales", - "Version": "1.3.0", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "R6", - "RColorBrewer", - "cli", - "farver", - "glue", - "labeling", - "lifecycle", - "munsell", - "rlang", - "viridisLite" - ], - "Hash": "c19df082ba346b0ffa6f833e92de34d1" - }, - "selectr": { - "Package": "selectr", - "Version": "0.4-2", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "R6", - "methods", - "stringr" - ], - "Hash": "3838071b66e0c566d55cc26bd6e27bf4" - }, - "stringi": { - "Package": "stringi", - "Version": "1.8.4", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "stats", - "tools", - 
"utils" - ], - "Hash": "39e1144fd75428983dc3f63aa53dfa91" - }, - "stringr": { - "Package": "stringr", - "Version": "1.5.1", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "cli", - "glue", - "lifecycle", - "magrittr", - "rlang", - "stringi", - "vctrs" - ], - "Hash": "960e2ae9e09656611e0b8214ad543207" - }, - "sys": { - "Package": "sys", - "Version": "3.4.2", - "Source": "Repository", - "Repository": "CRAN", - "Hash": "3a1be13d68d47a8cd0bfd74739ca1555" - }, - "systemfonts": { - "Package": "systemfonts", - "Version": "1.0.6", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "cpp11" - ], - "Hash": "6d538cff441f0f1f36db2209ac7495ac" - }, - "textshaping": { - "Package": "textshaping", - "Version": "0.3.7", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "cpp11", - "systemfonts" - ], - "Hash": "997aac9ad649e0ef3b97f96cddd5622b" - }, - "tibble": { - "Package": "tibble", - "Version": "3.2.1", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "fansi", - "lifecycle", - "magrittr", - "methods", - "pillar", - "pkgconfig", - "rlang", - "utils", - "vctrs" - ], - "Hash": "a84e2cc86d07289b3b6f5069df7a004c" - }, - "tidyr": { - "Package": "tidyr", - "Version": "1.3.1", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "cli", - "cpp11", - "dplyr", - "glue", - "lifecycle", - "magrittr", - "purrr", - "rlang", - "stringr", - "tibble", - "tidyselect", - "utils", - "vctrs" - ], - "Hash": "915fb7ce036c22a6a33b5a8adb712eb1" - }, - "tidyselect": { - "Package": "tidyselect", - "Version": "1.2.1", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "cli", - "glue", - "lifecycle", - "rlang", - "vctrs", - "withr" - ], - "Hash": "829f27b9c4919c16b593794a6344d6c0" - }, - "tidyverse": { - "Package": "tidyverse", - "Version": "2.0.0", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "broom", - 
"cli", - "conflicted", - "dbplyr", - "dplyr", - "dtplyr", - "forcats", - "ggplot2", - "googledrive", - "googlesheets4", - "haven", - "hms", - "httr", - "jsonlite", - "lubridate", - "magrittr", - "modelr", - "pillar", - "purrr", - "ragg", - "readr", - "readxl", - "reprex", - "rlang", - "rstudioapi", - "rvest", - "stringr", - "tibble", - "tidyr", - "xml2" - ], - "Hash": "c328568cd14ea89a83bd4ca7f54ae07e" - }, - "timechange": { - "Package": "timechange", - "Version": "0.3.0", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "cpp11" - ], - "Hash": "c5f3c201b931cd6474d17d8700ccb1c8" - }, - "tinytex": { - "Package": "tinytex", - "Version": "0.51", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "xfun" - ], - "Hash": "d44e2fcd2e4e076f0aac540208559d1d" - }, - "tzdb": { - "Package": "tzdb", - "Version": "0.4.0", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "cpp11" - ], - "Hash": "f561504ec2897f4d46f0c7657e488ae1" - }, - "utf8": { - "Package": "utf8", - "Version": "1.2.4", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R" - ], - "Hash": "62b65c52671e6665f803ff02954446e9" - }, - "uuid": { - "Package": "uuid", - "Version": "1.2-0", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R" - ], - "Hash": "303c19bfd970bece872f93a824e323d9" - }, - "vctrs": { - "Package": "vctrs", - "Version": "0.6.5", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "cli", - "glue", - "lifecycle", - "rlang" - ], - "Hash": "c03fa420630029418f7e6da3667aac4a" - }, - "viridisLite": { - "Package": "viridisLite", - "Version": "0.4.2", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R" - ], - "Hash": "c826c7c4241b6fc89ff55aaea3fa7491" - }, - "vroom": { - "Package": "vroom", - "Version": "1.6.5", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "bit64", - "cli", - "cpp11", - "crayon", - 
"glue", - "hms", - "lifecycle", - "methods", - "progress", - "rlang", - "stats", - "tibble", - "tidyselect", - "tzdb", - "vctrs", - "withr" - ], - "Hash": "390f9315bc0025be03012054103d227c" - }, - "withr": { - "Package": "withr", - "Version": "3.0.0", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "grDevices", - "graphics" - ], - "Hash": "d31b6c62c10dcf11ec530ca6b0dd5d35" - }, - "xfun": { - "Package": "xfun", - "Version": "0.43", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "grDevices", - "stats", - "tools" - ], - "Hash": "ab6371d8653ce5f2f9290f4ec7b42a8e" - }, - "xml2": { - "Package": "xml2", - "Version": "1.3.6", - "Source": "Repository", - "Repository": "RSPM", - "Requirements": [ - "R", - "cli", - "methods", - "rlang" - ], - "Hash": "1d0336142f4cd25d8d23cd3ba7a8fb61" - }, - "yaml": { - "Package": "yaml", - "Version": "2.3.8", - "Source": "Repository", - "Repository": "CRAN", - "Hash": "29240487a071f535f5e5d5a323b7afbd" - } - } -}