From e19ee2379ddbae2d71bb76572ea6d9b04849569f Mon Sep 17 00:00:00 2001 From: jules32 Date: Thu, 11 Jun 2020 21:11:06 -0700 Subject: [PATCH 1/4] open with viz basic --- .../index.Rmd | 94 +++++++------------ 1 file changed, 32 insertions(+), 62 deletions(-) diff --git a/content/post/drafts/2020-06-03-data-science-publishing/index.Rmd b/content/post/drafts/2020-06-03-data-science-publishing/index.Rmd index 396f2be..f940eb2 100644 --- a/content/post/drafts/2020-06-03-data-science-publishing/index.Rmd +++ b/content/post/drafts/2020-06-03-data-science-publishing/index.Rmd @@ -30,15 +30,13 @@ Last Week Julia Lowndes and I presented on a talk called: Abstract: -> The open source R programming language is a powerhouse for data analysis and statistics – and it also is fueling open publishing through RMarkdown and a large, engaged, and innovative community. We will show community-created examples of tutorials, blogs, websites, manuscripts, books, etc, and discuss how this is an entryway to open science, preprints, and open scientific publishing. We welcome other contributed examples to showcase how R can streamline open publishing, as well as examples showcasing other programming languages. +> The open source R programming language is a powerhouse for data analysis and statistics -- and it also is fueling open publishing through RMarkdown and a large, engaged, and innovative community. We will show community-created examples of tutorials, blogs, websites, manuscripts, books, etc, and discuss how this is an entryway to open science, preprints, and open scientific publishing. We welcome other contributed examples to showcase how R can streamline open publishing, as well as examples showcasing other programming languages. 
-The same workflow you use for data analysis -– rooted in reproducibility – empowers you make your work available to the world +The same workflow you use for data analysis -- rooted in reproducibility -- empowers you make your work available to the world ...in ways you never imagined -> I came to R for the data analysis, and was blown away by the publishing --- Julia Lowndes +> I came to R for the data analysis, and was blown away by the publishing -- Julia Lowndes # Using RMarkdown for scientific publishing: Fueling reproducibility in data science @@ -46,30 +44,28 @@ The same workflow you use for data analysis RMarkdown powerfully combines executable R code with simple text formatting and for efficient, automatable, reproducible research -Simple text formatting +Simple text formatting -+ +- -R Code +R Code -= += -Analyses and figures are in the same place as your reporting document: -saves time as you iterate! +Analyses and figures are in the same place as your reporting document: saves time as you iterate! Enables good practices for reproducibility & versioning -## RMarkdown’s familiar outputs for science: Word documents and PDFs +## RMarkdown's familiar outputs for science: Word documents and PDFs Rmarkdown renders to: -- Word -- PDF +- Word +- PDF -Imagine never copy-pasting a graph into your report again!!!! -RMarkdown can also manage citations, cross- referencing figures and section headers. +Imagine never copy-pasting a graph into your report again!!!! RMarkdown can also manage citations, cross- referencing figures and section headers. -# Using RMarkdown beyond your wildest dreams: Reimagining sharing and publishing online +# Using RMarkdown beyond your wildest dreams: Reimagining sharing and publishing online ## RMarkdown: RMarkdown creates HTML files that can be shared openly on the web @@ -77,62 +73,54 @@ Rendering rmarkdown to HTML: > We can store and distribute html files on GitHub, which also offers display options for publishing. 
Let's look at some real-world examples from science... -> Suddenly you can share a URL rather than attaching a file! -And that same URL will update rather than re-attaching a new version of the file! - +> Suddenly you can share a URL rather than attaching a file! And that same URL will update rather than re-attaching a new version of the file! ## Single-page html RMarkdown html files for open publishing; URL will display most recent version -Examples from the [Ocean Health Index](https://ohi-science.org/) +Examples from the [Ocean Health Index](https://ohi-science.org/) Many display options; floating table of contents, show/hide code Then we can think about organization & discoverability: How to organize multiple htmls? And how do we find them? -ohi-science.org/ohiprep_v2019/globalprep/prs_slr/v2019/slr_layer_prep_v2.html -Learn: rmarkdown.rstudio.com +ohi-science.org/ohiprep\_v2019/globalprep/prs\_slr/v2019/slr\_layer\_prep\_v2.html Learn: rmarkdown.rstudio.com ## Simple Websites Combine RMarkdown files as a website with a navigation bar between pages, requires only GitHub - -Useful for organizing, e.g. linking out to additional single-page htmls +Useful for organizing, e.g. linking out to additional single-page htmls You can also create templates and populate them automatically -ohi-science.org/ohi-global -ohi-science.org/esw +ohi-science.org/ohi-global ohi-science.org/esw\ Learn: jules32.github.io/rmarkdown-website-tutorial -## Blogdown websites +## Blogdown websites Create powerful websites with more complexity and blogging capabilities; requires more setup & deployment from a server -“If you want to learn to write, you read a lot, if you want to play music, you listen a lot. It’s hard to do this with data analysis.” - Hilary Parker & Roger Peng, RStudio::conf(2020) keynote +"If you want to learn to write, you read a lot, if you want to play music, you listen a lot. It's hard to do this with data analysis." 
- Hilary Parker & Roger Peng, RStudio::conf(2020) keynote So we write blogs and tutorials to share code, discuss, and learn together. -Power to organize, tag, search, navigage, etc. +Power to organize, tag, search, navigage, etc. ← Academic theme templates! -alison.rbind.io -Learn: alison.rbind.io/post 2017-06-12-up-and-running-with-blogdown +alison.rbind.io Learn: alison.rbind.io/post 2017-06-12-up-and-running-with-blogdown ## Bookdown books Organize and navigate html files as e-books -Really powerful for organizing reports and documents. +Really powerful for organizing reports and documents. -I wish I could have written my PhD thesis is Bookdown -Eg: github.com/ benmarwick/huskydown +I wish I could have written my PhD thesis is Bookdown Eg: github.com/ benmarwick/huskydown -r4ds.had.co.nz -Learn: bookdown.org/yihui/bookdown +r4ds.had.co.nz Learn: bookdown.org/yihui/bookdown ## Simple slides: create slides in a single rmarkdown file @@ -140,48 +128,30 @@ Imagine re-creating presentations with updated data. Text-based slide creation can be a powerful flow to think and outline. -Share presentations – and with a human-readable url! +Share presentations -- and with a human-readable url! 
-rstudio.com/slides/rstudio-pbc -Learn: rmarkdown.rstudio.com/lesson-11 +rstudio.com/slides/rstudio-pbc Learn: rmarkdown.rstudio.com/lesson-11 -## Xaringan Slides: Create slides in a single RMarkdown file +## Xaringan Slides: Create slides in a single RMarkdown file Incorporate powerful styling options from within R (without requiring knowledge of JavaScript, CSS, etc) -slides.yihui.org/xaringan -arm.rbind.io/slides/xaringan -Learn: above, and bookdown.org/yihui/rmarkdown/xaringan +slides.yihui.org/xaringan arm.rbind.io/slides/xaringan Learn: above, and bookdown.org/yihui/rmarkdown/xaringan ## Learnr tutorials: Interactive tutorials from a friendly website interface Reimagine teaching and how to blend lectures and hands-on coding for learners of all levels -allisonhorst.shinyapps.io/missingexplorer +allisonhorst.shinyapps.io/missingexplorer Learn: education.rstudio.com/blog/2020/05/learnr-for-remote - # Discussion time What examples or questions do you have? Other discussion topics -How does RMarkdown relate to/streamline the academic publishing process? -Analog: rOpenSci software review process -Friendly entryways to open science & publishing : you’re already doing it w/ code -Process affects the outcome: Easier to share at the end because you’re already sharing with yourself throughout -Not just R! Examples from other languages (Jupyter [note]books) -Open publishing in the wild -Education: allisonhorst.github.io, datavizm20.classes.andrewheiss.com, tinystats.github.io/teacups-giraffes-and-statistics, ida.numbat.space -Programs: openscapes.org -Accompanying science pubs: ohi-science.org/betterscienceinlesstime - -RMarkdown <> Word workflows: noamross.github.io/redoc -Nick’s experience writing his thesis in bookdown: how does it compare to latex? -Incorporating RMarkdown sub-documents (“knit child”): OHI suppl. 
methods -How to share documents using GitHub’s gh-pages or doc/: R for Excel Users - - +How does RMarkdown relate to/streamline the academic publishing process? Analog: rOpenSci software review process Friendly entryways to open science & publishing : you're already doing it w/ code Process affects the outcome: Easier to share at the end because you're already sharing with yourself throughout Not just R! Examples from other languages (Jupyter [note]books) Open publishing in the wild Education: allisonhorst.github.io, datavizm20.classes.andrewheiss.com, tinystats.github.io/teacups-giraffes-and-statistics, ida.numbat.space Programs: openscapes.org Accompanying science pubs: ohi-science.org/betterscienceinlesstime +RMarkdown \<\> Word workflows: noamross.github.io/redoc Nick's experience writing his thesis in bookdown: how does it compare to latex? Incorporating RMarkdown sub-documents ("knit child"): OHI suppl. methods How to share documents using GitHub's gh-pages or doc/: R for Excel Users From 4641d74ca15b62bb50a0a65944ab52976261aa41 Mon Sep 17 00:00:00 2001 From: jules32 Date: Thu, 11 Jun 2020 22:37:34 -0700 Subject: [PATCH 2/4] reflow a bit --- .../index.Rmd | 119 +++--------------- 1 file changed, 20 insertions(+), 99 deletions(-) diff --git a/content/post/drafts/2020-06-03-data-science-publishing/index.Rmd b/content/post/drafts/2020-06-03-data-science-publishing/index.Rmd index f940eb2..ba10012 100644 --- a/content/post/drafts/2020-06-03-data-science-publishing/index.Rmd +++ b/content/post/drafts/2020-06-03-data-science-publishing/index.Rmd @@ -1,6 +1,6 @@ --- title: Data Science as an Entryway to Open Publishing -author: Julia Lowndes, Nicholas Tierney +author: Nicholas Tierney and Julia Lowndes date: '2020-06-03' slug: data-science-publishing draft: true @@ -24,133 +24,54 @@ knitr::opts_chunk$set( ) ``` -Last Week Julia Lowndes and I presented on a talk called: +*Last month Julia Lowndes and I presented a virtual fireside chat at the [Open Publishing 
Fest](https://openpublishingfest.org/) called "Data Science as an Entryway to Open Publishing". The abstract of our fireside chat was: The open source R programming language is a powerhouse for data analysis and statistics -- and it also is fueling open publishing through RMarkdown and a large, engaged, and innovative community. We will show community-created examples of tutorials, blogs, websites, manuscripts, books, etc, and discuss how this is an entryway to open science, preprints, and open scientific publishing. We welcome other contributed examples to showcase how R can streamline open publishing, as well as examples showcasing other programming languages. This post is some reflections from the experience and summary of our [slides](https://zenodo.org/record/3873698#.XtbQo8Z7nOQ).* -"Data Science as an Entryway to Open Publishing", you can see the slides [here](https://zenodo.org/record/3873698#.XtbQo8Z7nOQ). - -Abstract: +--- -> The open source R programming language is a powerhouse for data analysis and statistics -- and it also is fueling open publishing through RMarkdown and a large, engaged, and innovative community. We will show community-created examples of tutorials, blogs, websites, manuscripts, books, etc, and discuss how this is an entryway to open science, preprints, and open scientific publishing. We welcome other contributed examples to showcase how R can streamline open publishing, as well as examples showcasing other programming languages. +One of the (many) things that gets us excited about R is that the same workflow you use for data analysis -- that is rooted in reproducibility -- empowers you make your work available to the world...in ways you never imagined. 
-The same workflow you use for data analysis -- rooted in reproducibility -- empowers you make your work available to the world +As Julia said in our presentation: -...in ways you never imagined +> I came to R for the data analysis, and was blown away by the publishing -> I came to R for the data analysis, and was blown away by the publishing -- Julia Lowndes +We then introduced RMarkdown, framed for scientific publishing and so much more. # Using RMarkdown for scientific publishing: Fueling reproducibility in data science -## Rmarkdown - -RMarkdown powerfully combines executable R code with simple text formatting and for efficient, automatable, reproducible research - -Simple text formatting - -- +## RMarkdown -R Code - -= - -Analyses and figures are in the same place as your reporting document: saves time as you iterate! - -Enables good practices for reproducibility & versioning +RMarkdown powerfully combines executable R code with simple text formatting for efficient, automatable, reproducible research. It combines simple text formatting with R code, which means analyses and figures are in the same place as your reporting document. This saves time as you iterate, and enables good practices for reproducibility & versioning. ## RMarkdown's familiar outputs for science: Word documents and PDFs -Rmarkdown renders to: +RMarkdown renders to Word and PDF — imagine never copy-pasting a graph into your report again!!!! RMarkdown can also manage citations, cross-referencing figures and section headers. -- Word -- PDF - -Imagine never copy-pasting a graph into your report again!!!! RMarkdown can also manage citations, cross- referencing figures and section headers. +But wait, you can also use RMarkdown behind your wildest dreams. 
# Using RMarkdown beyond your wildest dreams: Reimagining sharing and publishing online ## RMarkdown: RMarkdown creates HTML files that can be shared openly on the web -Rendering rmarkdown to HTML: - -> We can store and distribute html files on GitHub, which also offers display options for publishing. Let's look at some real-world examples from science... - -> Suddenly you can share a URL rather than attaching a file! And that same URL will update rather than re-attaching a new version of the file! - -## Single-page html - -RMarkdown html files for open publishing; URL will display most recent version - -Examples from the [Ocean Health Index](https://ohi-science.org/) - -Many display options; floating table of contents, show/hide code - -Then we can think about organization & discoverability: How to organize multiple htmls? And how do we find them? - -ohi-science.org/ohiprep\_v2019/globalprep/prs\_slr/v2019/slr\_layer\_prep\_v2.html Learn: rmarkdown.rstudio.com - -## Simple Websites - -Combine RMarkdown files as a website with a navigation bar between pages, requires only GitHub - -Useful for organizing, e.g. linking out to additional single-page htmls - -You can also create templates and populate them automatically - -ohi-science.org/ohi-global ohi-science.org/esw\ -Learn: jules32.github.io/rmarkdown-website-tutorial - -## Blogdown websites - -Create powerful websites with more complexity and blogging capabilities; requires more setup & deployment from a server - -"If you want to learn to write, you read a lot, if you want to play music, you listen a lot. It's hard to do this with data analysis." - Hilary Parker & Roger Peng, RStudio::conf(2020) keynote - -So we write blogs and tutorials to share code, discuss, and learn together. - -Power to organize, tag, search, navigage, etc. - -← Academic theme templates! 
- -alison.rbind.io Learn: alison.rbind.io/post 2017-06-12-up-and-running-with-blogdown - -## Bookdown books - -Organize and navigate html files as e-books - -Really powerful for organizing reports and documents. - -I wish I could have written my PhD thesis is Bookdown Eg: github.com/ benmarwick/huskydown - -r4ds.had.co.nz Learn: bookdown.org/yihui/bookdown - -## Simple slides: create slides in a single rmarkdown file - -Imagine re-creating presentations with updated data. - -Text-based slide creation can be a powerful flow to think and outline. - -Share presentations -- and with a human-readable url! - -rstudio.com/slides/rstudio-pbc Learn: rmarkdown.rstudio.com/lesson-11 +We can render RMarkdown to HTML. And we can store and distribute HTML files on GitHub, which also offers display options for publishing. Suddenly you can share a URL rather than attaching a file — and that same URL will update rather than re-attaching a new version of the file! -## Xaringan Slides: Create slides in a single RMarkdown file +We shared some real-world examples from science, including examples from the [Ocean Health Index](https://ohi-science.org/), [Alison Hill's academic](https://alison.rbind.io/post 2017-06-12-up-and-running-with-blogdown), [Ben Marwick's PhD thesis template formatting for the University of Washington ](https://github.com/benmarwick/huskydown), [Allison Horst's missing explorer lesson](https://allisonhorst.shinyapps.io/missingexplorer) -Incorporate powerful styling options from within R (without requiring knowledge of JavaScript, CSS, etc) +These examples started with **single-page HTMLs**, with the ability to display floating table of contents and toggle between showing hiding code. And also **simple websites** that combine RMarkdown files as a website with a navigation bar between pages, and requires only GitHub to display. 
-slides.yihui.org/xaringan arm.rbind.io/slides/xaringan Learn: above, and bookdown.org/yihui/rmarkdown/xaringan +We also discussed **blogdown** that creates powerful websites with more complexity and blogging capabilities. This has been so important for creating blogs and tutorials to share code, discuss, and learn together. Why this important is nicely represented in this quote from the 2020 RStudio conference: -## Learnr tutorials: Interactive tutorials from a friendly website interface +> "If you want to learn to write, you read a lot, if you want to play music, you listen a lot. It's hard to do this with data analysis." - Hilary Parker & Roger Peng, RStudio::conf(2020) keynote -Reimagine teaching and how to blend lectures and hands-on coding for learners of all levels +In addition to websites, RMarkdown can create **bookdown books** that organize and navigate html files as e-books. This is really powerful for organizing reports and documents. -allisonhorst.shinyapps.io/missingexplorer +We can also create **simple slides** from a single RMarkdown file. Imagine making a presentation and then being able to re-create presentations with updated data. Further, we can create **xaringan slides** that enable you to incorporate powerful styling options from within R (without requiring knowledge of JavaScript, CSS, etc). -Learn: education.rstudio.com/blog/2020/05/learnr-for-remote +**learnr** provides the power of interactive tutorials from a friendly website interface. This is really exciting to think about reimagining teaching and how to blend lectures and hands-on coding for learners of all levels. # Discussion time -What examples or questions do you have? +After going through the slides, we discussed with attendees...TODO complete -Other discussion topics How does RMarkdown relate to/streamline the academic publishing process? 
Analog: rOpenSci software review process Friendly entryways to open science & publishing : you're already doing it w/ code Process affects the outcome: Easier to share at the end because you're already sharing with yourself throughout Not just R! Examples from other languages (Jupyter [note]books) Open publishing in the wild Education: allisonhorst.github.io, datavizm20.classes.andrewheiss.com, tinystats.github.io/teacups-giraffes-and-statistics, ida.numbat.space Programs: openscapes.org Accompanying science pubs: ohi-science.org/betterscienceinlesstime From 4d6e4709a03567f0a3d96e91d704dde36406990c Mon Sep 17 00:00:00 2001 From: Nicholas Tierney Date: Tue, 16 Jun 2020 23:49:49 +1000 Subject: [PATCH 3/4] Add a bit more on discussion topics from the chat --- .../index.Rmd | 39 +++++++++++++++++-- 1 file changed, 35 insertions(+), 4 deletions(-) diff --git a/content/post/drafts/2020-06-03-data-science-publishing/index.Rmd b/content/post/drafts/2020-06-03-data-science-publishing/index.Rmd index ba10012..5ea2ec4 100644 --- a/content/post/drafts/2020-06-03-data-science-publishing/index.Rmd +++ b/content/post/drafts/2020-06-03-data-science-publishing/index.Rmd @@ -1,6 +1,6 @@ --- title: Data Science as an Entryway to Open Publishing -author: Nicholas Tierney and Julia Lowndes +author: Julia Lowndes and Nicholas Tierney date: '2020-06-03' slug: data-science-publishing draft: true @@ -24,7 +24,7 @@ knitr::opts_chunk$set( ) ``` -*Last month Julia Lowndes and I presented a virtual fireside chat at the [Open Publishing Fest](https://openpublishingfest.org/) called "Data Science as an Entryway to Open Publishing". The abstract of our fireside chat was: The open source R programming language is a powerhouse for data analysis and statistics -- and it also is fueling open publishing through RMarkdown and a large, engaged, and innovative community. 
We will show community-created examples of tutorials, blogs, websites, manuscripts, books, etc, and discuss how this is an entryway to open science, preprints, and open scientific publishing. We welcome other contributed examples to showcase how R can streamline open publishing, as well as examples showcasing other programming languages. This post is some reflections from the experience and summary of our [slides](https://zenodo.org/record/3873698#.XtbQo8Z7nOQ).* +_Last month Julia Lowndes and I presented a virtual fireside chat at the [Open Publishing Fest](https://openpublishingfest.org/) called "Data Science as an Entryway to Open Publishing". The abstract of our fireside chat was: The open source R programming language is a powerhouse for data analysis and statistics -- and it also is fueling open publishing through RMarkdown and a large, engaged, and innovative community. We will show community-created examples of tutorials, blogs, websites, manuscripts, books, etc, and discuss how this is an entryway to open science, preprints, and open scientific publishing. We welcome other contributed examples to showcase how R can streamline open publishing, as well as examples showcasing other programming languages. This post is some reflections from the experience and summary of our [slides](https://zenodo.org/record/3873698#.XtbQo8Z7nOQ)._ --- @@ -44,7 +44,7 @@ RMarkdown powerfully combines executable R code with simple text formatting for ## RMarkdown's familiar outputs for science: Word documents and PDFs -RMarkdown renders to Word and PDF — imagine never copy-pasting a graph into your report again!!!! RMarkdown can also manage citations, cross-referencing figures and section headers. +RMarkdown renders to Word and PDF — imagine never copy-pasting a graph into your report again!!! RMarkdown can also manage citations, cross-referencing figures and section headers. But wait, you can also use RMarkdown behind your wildest dreams. 
@@ -70,7 +70,38 @@ We can also create **simple slides** from a single RMarkdown file. Imagine makin # Discussion time -After going through the slides, we discussed with attendees...TODO complete +After going through the slides, we discussed with attendees the benefits of a HTML-focussed workflow. One of the benefits of this is that by avoiding page breaks, all your figures and tables can usually be placed right where they are mentioned. Although this might seem like a small detail, avoiding page breaks actually saved you a huge amount of time and hassle. [Adam Sparks]() discussed that one of the benefits to HTML is the ease of sharing these in a team internally - they do not have to be published online, and can be opened in any browser. The fact that they often look really snappy and polished, and can include interactive elements like maps is also a huge selling point. + +We also discussed how [`pagedown`]() provides a fresh approach to generating PDF on the web, and were lucky to have the creator of [paged.js]() (which powers `pagedown`) [Adam Hyde]() in our call. + +We also discussed some alternative formats for publishing, such as JATS, a journal publishing standard for XML. We hadn't heard of it, (and apparently that was a good thing). + +[Alison Hill]() made an excellent point: + +> As a former scientist, I felt woefully ill-equipped when first working with HTML output. I know you all have thought a lot about this- how can we increase HTML comfort levels and fluency for new/early scientists? + +In response to this we discussed some of the downfalls of HTML, namely the fact that CSS is usually required to answer questions like: + +* "How do I change the font size" +* "How do I change the font colour", or +* "how do I create two columns of text" + +While CSS is uniquitous, and everywhere on the web, so it can be easy to change appearances of text. 
It is an additional learning point for learners, and something that can tip the balance and turn people away and back to systems they know. + +Some suggestions on addressing this were (I think?): + +* Showing students existing Rmarkdown HTML templates +* Providing simple CSS templates within an rmarkdown file for people to use +* Building better tools that guide people to create their own CSS + +We had another interesting question from [John Chodacki]() + +> What about readers/consumers? But it seems like all the cool features of auto-updating tables, etc. can bring confusion for the reader ... unable to rely on stable info. Do you agree? Are there innovative ways to mitigate? + +This problem can arise when you generate a graphic, or a model during data analysis, and iterate on it and improve it. Later one, you might want to compare your current graphic to your first one, or your first model to the current one. However, doing that is actually pretty hard, and involves some strong version control skills. We discussed an approach that [Miles McBain]() broached a few years ago called [journalr](https://ghcdn.rawgit.org/MilesMcBain/journalr/master/Journalling_tool_proposal.html) ([repo](https://github.com/MilesMcBain/journalr)). + +Ultimately, this is a hard problem to solve, and mimics a real life pen and paper notebook. [Roger Peng]() discussed an approach to this in [NSSDeviations]() (around episode 74 I think?), involving a manual approach + How does RMarkdown relate to/streamline the academic publishing process? Analog: rOpenSci software review process Friendly entryways to open science & publishing : you're already doing it w/ code Process affects the outcome: Easier to share at the end because you're already sharing with yourself throughout Not just R! 
Examples from other languages (Jupyter [note]books) Open publishing in the wild Education: allisonhorst.github.io, datavizm20.classes.andrewheiss.com, tinystats.github.io/teacups-giraffes-and-statistics, ida.numbat.space Programs: openscapes.org Accompanying science pubs: ohi-science.org/betterscienceinlesstime From ee2b5144e9d67c15c8a6aee829942bf5038ded93 Mon Sep 17 00:00:00 2001 From: jules32 Date: Wed, 1 Jul 2020 06:01:08 -0700 Subject: [PATCH 4/4] reviewing NT's edits, tidying for publishing --- .../index.Rmd | 57 +++++++++---------- 1 file changed, 26 insertions(+), 31 deletions(-) diff --git a/content/post/drafts/2020-06-03-data-science-publishing/index.Rmd b/content/post/drafts/2020-06-03-data-science-publishing/index.Rmd index 5ea2ec4..ddc76de 100644 --- a/content/post/drafts/2020-06-03-data-science-publishing/index.Rmd +++ b/content/post/drafts/2020-06-03-data-science-publishing/index.Rmd @@ -24,15 +24,15 @@ knitr::opts_chunk$set( ) ``` -_Last month Julia Lowndes and I presented a virtual fireside chat at the [Open Publishing Fest](https://openpublishingfest.org/) called "Data Science as an Entryway to Open Publishing". The abstract of our fireside chat was: The open source R programming language is a powerhouse for data analysis and statistics -- and it also is fueling open publishing through RMarkdown and a large, engaged, and innovative community. We will show community-created examples of tutorials, blogs, websites, manuscripts, books, etc, and discuss how this is an entryway to open science, preprints, and open scientific publishing. We welcome other contributed examples to showcase how R can streamline open publishing, as well as examples showcasing other programming languages. 
This post is some reflections from the experience and summary of our [slides](https://zenodo.org/record/3873698#.XtbQo8Z7nOQ)._ +*In May we presented a virtual fireside chat at the [Open Publishing Fest](https://openpublishingfest.org/) called "Data Science as an Entryway to Open Publishing". The premise was that the open source R programming language is a powerhouse for data analysis and statistics -- and it also is fueling open publishing through RMarkdown and a large, engaged, and innovative community. We briefly showed community-created examples of tutorials, blogs, websites, manuscripts, books, etc., and discussed how they are an entryway to open science, preprints, and open scientific publishing. This post shares some reflections on the experience and a summary of our [slides](https://zenodo.org/record/3873698#.XtbQo8Z7nOQ).* ---- +------------------------------------------------------------------------ One of the (many) things that gets us excited about R is that the same workflow you use for data analysis -- that is rooted in reproducibility -- empowers you make your work available to the world...in ways you never imagined. As Julia said in our presentation: -> I came to R for the data analysis, and was blown away by the publishing +> I came to R for the data analysis, and was blown away by the publishing We then introduced RMarkdown, framed for scientific publishing and so much more. @@ -44,7 +44,7 @@ RMarkdown powerfully combines executable R code with simple text formatting for ## RMarkdown's familiar outputs for science: Word documents and PDFs -RMarkdown renders to Word and PDF — imagine never copy-pasting a graph into your report again!!! RMarkdown can also manage citations, cross-referencing figures and section headers. +RMarkdown renders to Word and PDF --- imagine never copy-pasting a graph into your report again!!! RMarkdown can also manage citations, cross-referencing figures and section headers. 
But wait, you can also use RMarkdown behind your wildest dreams. @@ -52,58 +52,53 @@ But wait, you can also use RMarkdown behind your wildest dreams. ## RMarkdown: RMarkdown creates HTML files that can be shared openly on the web -We can render RMarkdown to HTML. And we can store and distribute HTML files on GitHub, which also offers display options for publishing. Suddenly you can share a URL rather than attaching a file — and that same URL will update rather than re-attaching a new version of the file! +We can render RMarkdown to HTML. And we can store and distribute HTML files on GitHub, which also offers display options for publishing. Suddenly you can share a URL rather than attaching a file --- and that same URL will update rather than re-attaching a new version of the file! -We shared some real-world examples from science, including examples from the [Ocean Health Index](https://ohi-science.org/), [Alison Hill's academic](https://alison.rbind.io/post 2017-06-12-up-and-running-with-blogdown), [Ben Marwick's PhD thesis template formatting for the University of Washington ](https://github.com/benmarwick/huskydown), [Allison Horst's missing explorer lesson](https://allisonhorst.shinyapps.io/missingexplorer) +We shared some real-world examples from science, including examples from the [Ocean Health Index](https://ohi-science.org/), [Alison Hill's academic](https://alison.rbind.io/post%202017-06-12-up-and-running-with-blogdown), [Ben Marwick's PhD thesis template formatting for the University of Washington](https://github.com/benmarwick/huskydown), [Allison Horst's missing explorer lesson](https://allisonhorst.shinyapps.io/missingexplorer) -These examples started with **single-page HTMLs**, with the ability to display floating table of contents and toggle between showing hiding code. And also **simple websites** that combine RMarkdown files as a website with a navigation bar between pages, and requires only GitHub to display. 
+These examples started with **single-page HTMLs**, with the ability to display a floating table of contents and toggle between showing and hiding code. There were also **simple websites** that combine RMarkdown files into a website with a navigation bar between pages, and require only GitHub to display. -We also discussed **blogdown** that creates powerful websites with more complexity and blogging capabilities. This has been so important for creating blogs and tutorials to share code, discuss, and learn together. Why this important is nicely represented in this quote from the 2020 RStudio conference: +We also discussed **blogdown**, which creates powerful websites with more complexity and blogging capabilities. This has been so important for creating blogs and tutorials to share code, discuss, and learn together. Why this is important is nicely represented in this quote from the 2020 RStudio conference: -> "If you want to learn to write, you read a lot, if you want to play music, you listen a lot. It's hard to do this with data analysis." - Hilary Parker & Roger Peng, RStudio::conf(2020) keynote +> "If you want to learn to write, you read a lot, if you want to play music, you listen a lot. It's hard to do this with data analysis." - [Hilary Parker & Roger Peng, RStudio::conf(2020) keynote]() In addition to websites, RMarkdown can create **bookdown books** that organize and navigate html files as e-books. This is really powerful for organizing reports and documents. -We can also create **simple slides** from a single RMarkdown file. Imagine making a presentation and then being able to re-create presentations with updated data. Further, we can create **xaringan slides** that enable you to incorporate powerful styling options from within R (without requiring knowledge of JavaScript, CSS, etc). +We can also create **simple slides** from a single RMarkdown file -- imagine making a presentation and then being able to re-create presentations with updated data!
Further, we can create **xaringan slides** that enable you to incorporate powerful styling options from within R (without requiring knowledge of JavaScript, CSS, etc). -**learnr** provides the power of interactive tutorials from a friendly website interface. This is really exciting to think about reimagining teaching and how to blend lectures and hands-on coding for learners of all levels. +**learnr** provides the power of interactive tutorials from a friendly website interface. It is really exciting to think about reimagining teaching and how to blend lectures and hands-on coding for learners of all levels. # Discussion time -After going through the slides, we discussed with attendees the benefits of a HTML-focussed workflow. One of the benefits of this is that by avoiding page breaks, all your figures and tables can usually be placed right where they are mentioned. Although this might seem like a small detail, avoiding page breaks actually saved you a huge amount of time and hassle. [Adam Sparks]() discussed that one of the benefits to HTML is the ease of sharing these in a team internally - they do not have to be published online, and can be opened in any browser. The fact that they often look really snappy and polished, and can include interactive elements like maps is also a huge selling point. +After going through the slides, we discussed with attendees the benefits of an HTML-focussed workflow. One of the benefits is that by avoiding page breaks, all your figures and tables can usually be placed right where they are mentioned. Although this might seem like a small detail, avoiding page breaks actually saves you a huge amount of time and hassle. Adam Sparks pointed out that another benefit of HTML files is how easy they are to share within a team - they do not have to be published online, and can be opened in any browser.
The fact that they often look really snappy and polished, and can include interactive elements like maps is also a huge selling point. -We also discussed how [`pagedown`]() provides a fresh approach to generating PDF on the web, and were lucky to have the creator of [paged.js]() (which powers `pagedown`) [Adam Hyde]() in our call. +We also discussed how `pagedown` provides a fresh approach to generating PDFs on the web, and were lucky to have the creator of [paged.js]() (which powers `pagedown`) Adam Hyde at our chat. We also discussed some alternative formats for publishing, such as JATS, a journal publishing standard for XML. We hadn't heard of it, (and apparently that was a good thing). -[Alison Hill]() made an excellent point: +Alison Hill made an excellent point: > As a former scientist, I felt woefully ill-equipped when first working with HTML output. I know you all have thought a lot about this- how can we increase HTML comfort levels and fluency for new/early scientists? In response to this we discussed some of the downfalls of HTML, namely the fact that CSS is usually required to answer questions like: -* "How do I change the font size" -* "How do I change the font colour", or -* "how do I create two columns of text" - -While CSS is uniquitous, and everywhere on the web, so it can be easy to change appearances of text. It is an additional learning point for learners, and something that can tip the balance and turn people away and back to systems they know. 
- -Some suggestions on addressing this were (I think?): - -* Showing students existing Rmarkdown HTML templates -* Providing simple CSS templates within an rmarkdown file for people to use -* Building better tools that guide people to create their own CSS +- "How do I change the font size" +- "How do I change the font colour", or +- "How do I create two columns of text" -We had another interesting question from [John Chodacki]() +While CSS is ubiquitous on the web, and it can be easy to change the appearance of text, it is an additional learning point for learners, and something that can tip the balance and turn people away and back to systems they know. -> What about readers/consumers? But it seems like all the cool features of auto-updating tables, etc. can bring confusion for the reader ... unable to rely on stable info. Do you agree? Are there innovative ways to mitigate? +Some suggestions on addressing this were: -This problem can arise when you generate a graphic, or a model during data analysis, and iterate on it and improve it. Later one, you might want to compare your current graphic to your first one, or your first model to the current one. However, doing that is actually pretty hard, and involves some strong version control skills. We discussed an approach that [Miles McBain]() broached a few years ago called [journalr](https://ghcdn.rawgit.org/MilesMcBain/journalr/master/Journalling_tool_proposal.html) ([repo](https://github.com/MilesMcBain/journalr)). +- Showing students existing RMarkdown HTML templates +- Providing simple CSS templates within an RMarkdown file for people to use +- Building better tools that guide people to create their own CSS -Ultimately, this is a hard problem to solve, and mimics a real life pen and paper notebook.
[Roger Peng]() discussed an approach to this in [NSSDeviations]() (around episode 74 I think?), involving a manual approach +We had another interesting question from John Chodacki: +> What about readers/consumers? But it seems like all the cool features of auto-updating tables, etc. can bring confusion for the reader ... unable to rely on stable info. Do you agree? Are there innovative ways to mitigate? +This problem can arise when you generate a graphic or a model during data analysis, then iterate on it and improve it. Later on, you might want to compare your current graphic to your first one, or your first model to the current one. However, doing that is actually pretty hard, and involves some strong version control skills. We discussed an approach that Miles McBain broached a few years ago called [journalr](https://ghcdn.rawgit.org/MilesMcBain/journalr/master/Journalling_tool_proposal.html) ([repo](https://github.com/MilesMcBain/journalr)). -How does RMarkdown relate to/streamline the academic publishing process? Analog: rOpenSci software review process Friendly entryways to open science & publishing : you're already doing it w/ code Process affects the outcome: Easier to share at the end because you're already sharing with yourself throughout Not just R! Examples from other languages (Jupyter [note]books) Open publishing in the wild Education: allisonhorst.github.io, datavizm20.classes.andrewheiss.com, tinystats.github.io/teacups-giraffes-and-statistics, ida.numbat.space Programs: openscapes.org Accompanying science pubs: ohi-science.org/betterscienceinlesstime +Ultimately, this is a hard problem to solve, and mimics a real-life pen-and-paper notebook. Roger Peng discussed an approach to this in the NSSDeviations podcast (we believe in [Episode 74](http://nssdeviations.com/74-i-draw-the-line-at-fans)), involving a manual approach.
-RMarkdown \<\> Word workflows: noamross.github.io/redoc Nick's experience writing his thesis in bookdown: how does it compare to latex? Incorporating RMarkdown sub-documents ("knit child"): OHI suppl. methods How to share documents using GitHub's gh-pages or doc/: R for Excel Users