diff --git a/materials/sections/author-credit.Rmd b/materials/sections/author-credit.Rmd index b5ae7b2f..c4b163d5 100644 --- a/materials/sections/author-credit.Rmd +++ b/materials/sections/author-credit.Rmd @@ -1,10 +1,10 @@ ### Authorship and Credit -Navigating issues of intellectual property and credit can be a challenge, particularly for early career researchers. Open communication is critical to avoiding misunderstandings and conflicts. Talk to your coauthors and collaborators about authorship, credit, and data sharing early and often. This is particularly important when working with new collaborators and across lab groups or disciplines which may have divergent views on authorship and data sharing. If you feel uncomfortable talking about issues surrounding credit or intellectual property, seek the advice or assistance of a mentor to support you in having these important conversations. +Navigating issues of intellectual property and credit can be a challenge, particularly for early career researchers. Open communication is critical to avoiding misunderstandings and conflicts. Talk to your coauthors and collaborators about authorship, credit, and data sharing **early and often**. This is particularly important when working with new collaborators and across lab groups or disciplines which may have divergent views on authorship and data sharing. If you feel uncomfortable talking about issues surrounding credit or intellectual property, seek the advice or assistance of a mentor to support you in having these important conversations. -The “Publication” section of the Ecological Society of America’s Code of Ethics is a useful starting point for discussions about co-authorship, as are the International Committee of Medical Journal Editors guidelines for authorship and contribution. You should also check guidelines published by the journal(s) to which you anticipate submitting your work. 
+The “Publication” section of the [Ecological Society of America’s Code of Ethics](https://www.esa.org/about/code-of-ethics/) is a useful starting point for discussions about co-authorship, as are the [International Committee of Medical Journal Editors guidelines](http://www.icmje.org/recommendations/browse/roles-and-responsibilities/defining-the-role-of-authors-and-contributors.html) for authorship and contribution. You should also check guidelines published by the journal(s) to which you anticipate submitting your work. -For collaborative research projects, develop an authorship agreement for your group early in the project and refer to it for each product. This authorship agreement from the Arctic Data Center provides a useful template. Some key questions to consider as you are working with your team to develop the agreement: +For collaborative research projects, develop an [authorship agreement](http://training.arcticdata.io/2020-10-arctic/files/template-authorship-policy-ADC-training.docx) for your group early in the project and refer to it for each product. This authorship agreement from the Arctic Data Center provides a useful template. Some key questions to consider as you are working with your team to develop the agreement: - What roles do we anticipate contributors will play? e.g., the CASRAI CRediT project identifies 14 distinct roles: - Conceptualization @@ -21,7 +21,7 @@ For collaborative research projects, develop an authorship agreement for your gr - Visualization - Writing – original draft - Writing – review & editing -- What are our criteria for authorship? (See the ICMJE guidelines for potential criteria) +- What are our criteria for authorship? (See the [ICMJE guidelines](http://www.icmje.org/recommendations/browse/roles-and-responsibilities/defining-the-role-of-authors-and-contributors.html) for potential criteria) - Will we extend the opportunity for authorship to all group members on every paper or product? 
- Do we want to have an opt in or opt out policy? (In an opt out policy, all group members are considered authors from the outset and must request removal from the paper if they don’t want think they meet the criteria for authorship) - Who has the authority to make decisions about authorship? Lead author? PI? Group? @@ -34,4 +34,4 @@ For collaborative research projects, develop an authorship agreement for your gr ### Data Policies -Learn more about data management planning and other data management skills via DataOne and check out this example data policy from the Arctic Data Center. +Learn more about [data management planning](https://dataoneorg.github.io/Education/bp_step/plan/) and other [data management skills](https://dataoneorg.github.io/Education/) via DataOne and check out this [example data policy](http://training.arcticdata.io/2020-10-arctic/files/template-data-policy-ADC-training.docx) from the Arctic Data Center. diff --git a/materials/sections/meeting-facilitation.Rmd b/materials/sections/meeting-facilitation.Rmd index 6b226d0a..31b5fee5 100644 --- a/materials/sections/meeting-facilitation.Rmd +++ b/materials/sections/meeting-facilitation.Rmd @@ -2,7 +2,7 @@ As humans, we are constantly coming together in groups for some common purpose. At our core, we are social animals, and our exceptional abilities to connect and collaborate may be among the main things that distinguish us as a species. So if meetings are just opportunities for groups of humans to connect and collaborate, why do so many of us find them so painful? Probably we are going about this the wrong way! -![](images/phdcomics_101514s_meetings.gif) +![phd comics meetings - http://phdcomics.com/comics/archive.php?comicid=1752](images/phdcomics_101514s_meetings.gif) Luckily there are simple things you can do to make your meetings more enjoyable and effective for everyone. They come down to thoughtful meeting design and effective facilitation. 
In general: @@ -19,11 +19,12 @@ Luckily there are simple things you can do to make your meetings more enjoyable - Progress (during and after the meeting) is reviewed, tracked, and communicated. - Everyone is invited to review progress / success and feed learning back in to improve future meetings and ongoing work of the team. + #### Designing for long term success What are we trying to achieve when we design and facilitate a team science meeting? -First, we are working to build and sustain a coalition of the willing – a group of people united in shared purpose, willing to contribute. +First, we are working to build and sustain a [coalition of the willing](https://blogs.scientificamerican.com/observations/the-secret-sauce-for-environmental-problem-solving/) – a group of people united in shared purpose, willing to contribute. Second, we want to foster creativity. When faced with complex problems without obvious solutions, we need to invite new ideas from all quarters and create the conditions where innovation and unconventional thinking are welcomed. We cannot know ahead of time where good ideas will come from. @@ -33,15 +34,16 @@ Finally, in most team science endeavors, we can and should also be designing for #### The value of planning -Good meeting design starts with understanding your purpose and objectives. Once you understand why you need to meet (your overarching goal) and what you want to accomplish (the specific outcomes you are driving toward), you can turn to how you will accomplish your purpose (i.e. the agenda design) and who will play what roles. A good rule of thumb is to allow 2-3x as much time to plan a meeting as its duration. +Good meeting design starts with understanding your purpose and objectives. Once you understand **why** you need to meet (your overarching goal) and **what** you want to accomplish (the specific outcomes you are driving toward), you can turn to how you will accomplish your purpose (i.e. 
the agenda design) and **who** will play what roles. A good rule of thumb is to allow 2-3x as much time to plan a meeting as its duration. -![](images/why-what-how-who.jpg) +![why, what, how, who](images/why-what-how-who.jpg) If you are at the start of a collaborative project, your initial meeting(s) should seek to establish (1) refined questions, goals and objectives; (2) an engaged team with a clear sense of the overall vision and next steps; (3) a clear work plan with timelines, roles and responsibilities; and (4) a plan for communication and coordination of work between meetings (and ideally, the dates for your next meeting). Take a team approach to both planning and running meetings. #### Meeting facilitation + When it’s your turn to facilitate a meeting, give some thought to where facilitation will likely be needed. Where do you anticipate potential logjams or sticking points? Where have you seen the group get stuck in the past? Are there personalities or relationships in the group that may require mediation, e.g., individuals that tend to dominate conversation or take the group off course? How will you encourage connection and creativity with this particular group? Who can you count on to help you ensure the meeting runs smoothly? #### Sharing the load of effective meeting facilitation @@ -64,7 +66,7 @@ As you get to know your team members, you can start to match people to these dif Team science benefits from the diverse perspectives and ideas of researchers coming together to generate novel ideas. The flow of new ideas is supported by rich, inclusive dialogue. In contrast to debate or discussion, dialogue allows groups to recognize the limits on their own and others’ individual perspectives and to strive for more coherent thought. Dialogue becomes a container for collective thinking and exploration – a process that can take teams in directions not imagined or planned. 
-In discussion, different views are presented and defended, and this may provide a useful analysis of the whole situation. In dialogue, all views are treated as equally valid, and different views are presented as a means toward discovering a new view. Participants listen to understand one another, not to win. In a skillful discussion, decisions are made. In a dialogue, complex issues are explored and shared meaning is created. Both are critical to the collaborative process, and the more artfully a group can move between these two forms of discourse (and out of less productive debate and polite discussion) according to what is needed, the more effective the group will be. +In *discussion*, different views are presented and defended, and this may provide a useful analysis of the whole situation. In *dialogue*, all views are treated as equally valid, and different views are presented as a means toward discovering a new view. Participants listen to understand one another, not to win. In a *skillful discussion*, decisions are made. In a *dialogue*, complex issues are explored and shared meaning is created. Both are critical to the collaborative process, and the more artfully a group can move between these two forms of discourse (and out of less productive *debate* and *polite discussion*) according to what is needed, the more effective the group will be. ![](images/senge-placeholder.jpg) @@ -89,9 +91,9 @@ Some facilitators will offer tips for dealing with “difficult” personalities #### The Four Player Model A useful tool for reading the dynamics of the room comes from David Kantor, a family psychologist who has studied interpersonal communication for many years. Kantor analyzed the speech acts (statements or questions) in thousands of conversations among family members and other groups. He concluded that in any conversation, there are four primary speech acts that people will make: Move, Follow, Oppose, or Bystand. 
-When someone moves, they are taking the lead, making a suggestion, proposal or positing an idea. If someone chooses to follow, they affirm or agree with that idea. Someone else may choose to oppose the idea by offering criticism or counter-evidence or withholding support. Others may bystand by neither following nor opposing, but instead adding information that comes from a different point of view. +When someone **moves**, they are taking the lead, making a suggestion, proposal or positing an idea. If someone chooses to **follow**, they affirm or agree with that idea. Someone else may choose to **oppose** the idea by offering criticism or counter-evidence or withholding support. Others may **bystand** by neither following nor opposing, but instead adding information that comes from a different point of view. -![](images/ideas_form_Nancy_Margulies_World_Cafe.jpg) + As the conversation unfolds, other moves will be made, and the participants may play different roles, though most of us feel more comfortable with some types of speech acts than others. All four types are valuable and necessary for moving a group forward. Conversations that don’t involve a balance of all four can indicate a counterproductive group dynamic. @@ -117,22 +119,24 @@ Don’t Panic: Half the battle is recognizing that you are entering the groan zo Be creative and empathetic when you design your agenda. Think about your participants and what is going to help all of them participate fully and creatively. Here are a few techniques and microstructures we have found useful in the divergent and emergent stages of a process. -![](images/world_cafe_guidelines_Nancy_Margulies_World_Cafe.jpg) + + - Round robins, e.g. 
to get starting positions out on the table and hear from everyone -- 1,2,4,all to allow everyone’s participation and elevate themes and key insights (format goes from individual to small group to whole group discussion) -- Sticky note brainstorming + clustering -- Rotating stations -- World Cafe conversations -- Panel discussion or User Experience Fishbowl to explicitly draw out and contrast different expert perspectives +- [1,2,4,all](http://www.liberatingstructures.com/1-1-2-4-all/) to allow everyone’s participation and elevate themes and key insights (format goes from individual to small group to whole group discussion) +- [Sticky note brainstorming](https://www.designkit.org/methods/28) + [clustering](https://www.designkit.org/methods/30) +- [Rotating stations](http://www.liberatingstructures.com/11-shift-share/) +- [World Cafe conversations](http://www.liberatingstructures.com/17-conversation-cafe/) +- Panel discussion or [User Experience Fishbowl](http://www.liberatingstructures.com/18-users-experience-fishbowl/) to explicitly draw out and contrast different expert perspectives - Parallel breakout groups (be sure to allow time for report out and suggest a structure so that process can be efficient and informative) - Scribing to capture participant’s viewpoints - Encouraging and drawing out people, mirroring and validating what they say - See other ideas in the Creative Problem Solving (CPS): Divergent Tools Quick Reference from Omni Skills linked below in Resources -![](images/Sticky_notes_on_the_wall_of_the_Wikimedia_Foundation_office,_2010-10-26.jpg) -Image from Ragesoss on Wikimedia Commons Creative Commons – Attribution 3.0 +![Image from Ragesoss on Wikimedia Commons Creative Commons – Attribution 3.0 ](images/Sticky_notes_on_the_wall_of_the_Wikimedia_Foundation_office,_2010-10-26.jpg) + + #### Making thinking visible @@ -149,24 +153,25 @@ Consider using: When capturing notes, try to use people’s own words; if necessary ask them to distill long or complex 
points into a headline you can capture. Invite them to offer corrections if you didn’t capture what they meant. + #### Techniques to support convergent thinking (i.e., getting to agreement) Coming to a clear decision can sometimes be the hardest work of a collaboration. Use tools and frameworks to help your group converge in its thinking. -- Clarify your decisionmaking process up front. Are you trying to reach full consensus within this group or does majority rule? Or is this a decision that will be made by an individual or small group, based on the larger group’s input? If you are seeking consensus but can’t reach it, what’s your fallback plan? Clarifying how you will make decisions can help you avoid one of the worst pitfalls of collaboration - never making them. -- Co-create conceptual frameworks to organize ideas and draw connections. Constructing a visual model together helps to reveal the different ways individuals think about the problem, which can build a shared understanding of it and unlock new ways of addressing it. Joint model building supports metacognition, raising awareness of individual assumptions, gaps, and biases. -- Make specific proposals. Get the ideas on paper. Give each idea a headline and write a short abstract that distills the problem, why it’s important, and your proposed solution. Review each other’s ideas. Identify areas of confusion or uncertainty. Clarify language. -- Specify your criteria for convergence. A simple two-axis plot can help a group decide how to focus its effort. For example, as shown in the figure below, you might assess each idea for its feasibility and potential impact, and then focus on those scoring high in both. Defining what you mean by feasibility and impact before your start scoring is essential and can be another source of collaborative learning. +- **Clarify your decisionmaking process up front**. Are you trying to reach full consensus within this group or does majority rule? 
Or is this a decision that will be made by an individual or small group, based on the larger group’s input? If you are seeking consensus but can’t reach it, what’s your fallback plan? Clarifying how you will make decisions can help you avoid one of the worst pitfalls of collaboration - never making them. +- **Co-create conceptual frameworks** to organize ideas and draw connections. Constructing a visual model together helps to reveal the different ways individuals think about the problem, which can build a shared understanding of it and unlock new ways of addressing it. Joint model building supports metacognition, raising awareness of individual assumptions, gaps, and biases. +- **Make specific proposals**. Get the ideas on paper. Give each idea a headline and write a short abstract that distills the problem, why it’s important, and your proposed solution. Review each other’s ideas. Identify areas of confusion or uncertainty. Clarify language. +- **Specify your criteria for convergence**. A simple two-axis plot can help a group decide how to focus its effort. For example, as shown in the figure below, you might assess each idea for its feasibility and potential impact, and then focus on those scoring high in both. Defining what you mean by feasibility and impact before your start scoring is essential and can be another source of collaborative learning. ![](images/feasibility_v_impact_matrix.jpg) -- Assess support for proposals, e.g. by using a gradient of agreement +- **Assess support for proposals**, e.g. 
by using a gradient of agreement + +![Kaner Gradient of Agreement-Adapted (TRG, 2017) and (Hughes, 2017)](images/Kaner-Gradient-of-Agreement-Adapted-TRG-2017-and-Hughes-2017.png) -![](images/Kaner-Gradient-of-Agreement-Adapted-TRG-2017-and-Hughes-2017.png) -Kaner Gradient of Agreement-Adapted (TRG, 2017) and (Hughes, 2017) -- Prioritize by identifying your Top five or using Dot voting -- Be realistic about constraints / barriers using Gut check +- **Prioritize** by identifying your [Top five](https://www.designkit.org/methods/15) or using [Dot voting](https://en.wikipedia.org/wiki/Dot-voting) +- **Be realistic about constraints / barriers** using [Gut check](https://www.designkit.org/methods/42) - See other ideas in the Creative Problem Solving (CPS): Convergent Tools Quick Reference from Omni Skills linked below in Resources diff --git a/materials/sections/tools-skills.Rmd b/materials/sections/tools-skills.Rmd index 1613b16f..0e88a5d0 100644 --- a/materials/sections/tools-skills.Rmd +++ b/materials/sections/tools-skills.Rmd @@ -4,22 +4,21 @@ There are many reasons why it is essential to make your science reproducible. We will refer you to the references at the end of this section to learn more about open and reproducible science and how the necessity of openness is a cornerstone of the integrity and efficacy of the scientific research process. You will learn more about the tools and techniques you can use to make your data analysis in the Data Science section of this book, here we are introducing an overview of skills and tools that will enable you to collaborate in a reproducible manner. -In this section, we are focusing on providing an overview of why making your work reproducible will empower you to iterate quickly, integrate new information more easily, scale your analysis to larger datasets, and better collaborate by receiving feedback and contributions from others, as well as enable your future self to reuse and build on your own work. 
+In this section, we are focusing on providing an overview of why making your work reproducible will empower you to iterate quickly, integrate new information more easily, scale your analysis to larger data sets, and better collaborate by receiving feedback and contributions from others, as well as enable your future self to reuse and build on your own work. ### Collaborative tools **Goal**: create a **_space for you to collaborate_** and **_centralize the information_** of your projects -During your postdoctoral project(s), you will be working most likely work with collaborators who will be part of multiple organizations. And since you are coming from different institutions, you will therefore most likely using different tools that others might not have access to. The overarching goal is to _centralize the information to make it easy for you and others to find everything in one place quickly. _Here are a few tools we think will be essential to foster your collaboration as a distributed team: +During your postdoctoral project(s), you will most likely work with collaborators from multiple organizations. Since you are coming from different institutions, you will most likely be using different tools that others might not have access to. The overarching goal is to centralize the information to make it easy for you and others to find everything in one place quickly. Here are a few tools we think will be essential to foster your collaboration as a distributed team: * **Document sharing**: we recommend setting up a shared drive (e.g., Box, Google Drive, ...) where you can centralize the documents that your group will gather and create. This is meant to be a solution that everyone on the team (independently of their institution) can access. Each solution has its pros and cons, and the choice will mostly depend on what you and your collaborators are used to. 
From our experience at NCEAS, Google Drive has a large user base and can be set up with very open sharing settings. This can be useful if you are collaborating across institutions. See [here](https://docs.google.com/document/d/1gemd7IB5OKVPnrTIaXECYKeMoNJ6MGUNSh1ehB8Frsk/edit) for NCEAS' recommendations on how to use `Google Drive` in a collaborative setup. * **Team Communication**: The tools you will be using will depend on whether you are collaborating synchronously or not (see Chapter on Virtual collaboration for more). \ For asynchronous communication, we recommend setting up a **mailing list** (e.g. `Google Groups`) that allows you to reach all working group participants at once. We recommend adding a specific tag to the subject line of the mailing list to enable team members to set up filters for your project in their inbox. For faster communication, while working at the same time, **chat rooms** have proven to be a very efficient way to exchange ideas quickly and answer questions collaboratively. As an example of potential tools, `Slack` has become a very popular tool as it is pretty easy to learn (see [here](https://slack.com/resources/slack-101/what-is-slack) for an introduction). \ -_Meetings, in-person or virtual, will also be an important way to communicate and work with your collaborators. Please refer to the Collaboration section of this book to learn more about how to make them efficient. \ -See the Virtual Collaboration section to learn more about virtual meetings and how to run them efficiently. -* **Coding together**: We recommend using version control tools such as `git` and `GitHub` (or similar such as GitLab) to share code. This is a great way to share and document your work for your collaborators, including your future self. Those tools are designed to track changes and who has implemented them. It creates a history of changes that you can navigate back to retrieve previous versions. 
Note that those tools also have features to enable communication and track features and bugs and discuss changes (see [here](https://guides.github.com/features/issues/) for Github issues as an example). In the Data Science training part of this onboarding material, you will have a comprehensive introduction to those tools and how to best use them. +Meetings, in-person or virtual, will also be an important way to communicate and work with your collaborators. Please refer to the Collaboration section of this book to learn more about how to run them efficiently. \ +* **Coding together**: We recommend using version control tools such as `git` and `GitHub` (or similar such as `GitLab`) to share code. This is a great way to share and document your work for your collaborators, including your future self. Those tools are designed to track changes and who has implemented them. It creates a history of changes that you can navigate back to retrieve previous versions. Note that those tools also have features to enable communication and track features and bugs, as well as discuss modifications to your code (see [here](https://guides.github.com/features/issues/) for GitHub issues, as an example). In the Data Science part of this onboarding material, you will have a comprehensive introduction to those tools and how to best use them. **Few criteria to ask yourself to help to pick the right tool:** @@ -30,19 +29,19 @@ See the Virtual Collaboration section to learn more about virtual meetings and h ### Data Management -You are starting at the right place since you will be working with NEON data. NEON data sets are well documented, archived, and made publicly available in a curated data repository (https://data.neonscience.org/). However, you might have to combine this data with other types of data that might be less documented and harder to discover or obtain. +You are starting at the right place as you will be working with NEON data! 
NEON data sets are well documented, archived, and made publicly available in a curated data repository (https://data.neonscience.org/). However, you might have to combine this data with other types of data that could be less documented and harder to discover or obtain. -We thus strongly recommend planning ahead and develop a** Data Management Plan** as you are starting your project. This will help you to plan for: +We thus strongly recommend planning ahead and developing a **Data Management Plan** as you start your project. This will help you to plan for: -* What are the various steps you need to go through before you can start your analysis? -* How much data are you going to collect, aggregate? +* What are the various steps needed before you can start your analysis? +* How much data will need to be collected and aggregated together? * Who is going to do what? * Estimate how long it will take to organize and process the data (tip: double your estimate and you will still be underestimating; [Hofstadter's law](https://en.m.wikipedia.org/wiki/Hofstadter%27s_law)) -* Are there any legal constraints associated with acquiring, using, and sharing project data? E.g. survey data -* At the end of your project, where do you plan to archive your products (data produced, codes, ...)? -You will learn more about data repository and the archiving your products later in the Book +* Are there any legal constraints associated with acquiring, using, and sharing project data? E.g. survey data involving personal information +* At the end of your project, where would your products (data produced, code, ...) be archived? +You will learn more about data repositories and archiving your products later in this book -```{r data-life-cycle, out.width='80%', fig.align="center", fig.cap="Credits: DataONE data management; Friedrich Recknagel and William K. Michener. 
“Ecological Informatics”, 2017)", echo=FALSE} +```{r data-life-cycle, out.width='90%', fig.align="center", fig.cap='Credits: DataONE data management; Friedrich Recknagel and William K. Michener. "Ecological Informatics", 2017', echo=FALSE} knitr::include_graphics(here::here("images","tools-skills-datalifecycle.png")) ``` @@ -51,18 +50,18 @@ DataONE has developed several great [documents](https://dataoneorg.github.io/Edu A few more thoughts related to managing data in a collaborative setup: 1. Centralize the management of your data \ -Try to avoid having data sets spread among laptops or other personal computers; this makes it difficult for other team members to redo a particular analysis and can become a nightmare to know which version of the data was used for a specific analysis. We recommend asking your institution if there are servers or cloud services available to you and use those tools to centralize your data management and making sure that all your collaborators access the same version of the data sets using the same path. +Try to avoid having data sets spread among laptops or other personal computers; this makes it difficult for other team members to redo a particular analysis and can become a nightmare to know which version of the data was used for a specific analysis. We recommend asking your institution if there are servers or cloud services available to you and use those tools to centralize your data management. This will also make sure that all your collaborators will be able to access the same version of the data using the same path. 2. Develop naming conventions \ For files and folder: - 1. Avoid spaces (use underscores or dashes) - 2. Avoid punctuation or special characters - 3. Try to leverage alphabetical order (e.g. start with dates: 2020-05-08) - 4. Use descriptive naming (metadata) - 5. Use folders to structure/organize content - 6. Keep it simple + a. Avoid spaces (use underscores or dashes) + b. Avoid punctuation or special characters + c. 
Try to leverage alphabetical order (e.g. start with dates: 2020-05-08) + d. Use descriptive naming (metadata) + e. Use folders to structure/organize content + f. Keep it simple 3. Make it programmatically useful - 7. Useful to select files (Wildcard *, regular expression) - 8. But don’t forget Humans need to read file names too + a. Useful to select files (Wildcard *, regular expression) + b. But don’t forget humans need to read file names too **Example:** @@ -75,28 +74,28 @@ Which filename would be the most useful? Answer: `2020-05-08_light-sensor-1_plot-1.csv` because the date will sort the files in chronological order by default and the consistent usage of `-` and `_` will let you break the filename into useful information. -The most important is to be **_consistent_** among collaborators and over time. To know more about this topic, here is a good [reference](https://speakerdeck.com/jennybc/how-to-name-files) on this topic from Jenny Bryan (RStudio). +The most important thing is to be **_consistent_** among collaborators and over time. To learn more about this topic, here is a good [reference](https://speakerdeck.com/jennybc/how-to-name-files) from Jenny Bryan (RStudio). ### Scientific programming for reproducible research -To make your data-riven research reproducible, it is important to develop analytical workflows that will be relying on programming to accomplish the different tasks necessary to go from the raw data to the results (figures, new data, publications, ...) of your analysis. Scripting languages, even better open ones such as R and Python, are well-suited for scientists to develop reproducible scientific workflows. Those scripting languages provide a large ecosystem of libraries (also referred to as packages or modules) that are ready to be leverage to conduct analysis and modeling. The Data Science Section of this onboarding document will introduce you to how to use R and other tools to develop such a workflow. 
+To make your data-driven research reproducible, it is important to develop scientific workflows that rely on programming to accomplish the tasks necessary to go from the raw data to the results (figures, new data, publications, ...) of your analysis. Scripting languages, especially open ones such as R and Python, are well-suited for scientists to develop reproducible scientific workflows. Those scripting languages provide a large ecosystem of libraries (also referred to as packages or modules) that are ready to be leveraged to conduct analysis and modeling. The Data Science Section of this onboarding document will introduce you to how to use `R` and other tools to develop such a workflow. -```{r tidy-workflow, out.width='60%', fig.align="center", fig.cap="Workflow example using the `tidyverse`. Note the program box around the workflow and the iterative nature of the analytical process described. _Source: R for Data Science _",echo=FALSE} +```{r tidy-workflow, out.width='80%', fig.align="center", fig.cap="Workflow example using the `tidyverse`. Note the program box around the workflow and the iterative nature of the analytical process described. _Source: R for Data Science _",echo=FALSE} knitr::include_graphics(here::here("images","tidy-workflow.png")) ``` -We recommend shying away from spreadsheets as an analytical tool, as well as Graphical User Interfaces (GUI) where you need to click on buttons to do your analysis. Although convenient for data exploration, GUI will limit the reproducibility and the scaling you can accomplish in your analysis as human intervention is needed at every step. Spreadsheets can be useful to store tabular data, but it is recommended to script their analysis, as copy-pasting and references to cells are prone to mistake ([see Reinhart and Rogof example](http://www.peri.umass.edu/fileadmin/pdf/working_papers/working_papers_301-350/WP322.pdf), hard to track, and difficult to scale to larger data.
In addition, auto-formatting (number, date, character, ...) can introduce modifications to your data ([One in five genetics papers contains errors thanks to Microsoft Excel](https://www.sciencemag.org/news/2016/08/one-five-genetics-papers-contains-errors-thanks-microsoft-excel). +We recommend shying away from spreadsheets as an analytical tool, as well as Graphical User Interfaces (GUIs) where you need to click on buttons to do your analysis. Although convenient for data exploration, GUIs limit the reproducibility and the scaling you can accomplish in your analysis, as human intervention is needed at every step. Spreadsheets can be useful to store tabular data, but it is recommended to script their analysis, as copy-pasting and references to cells are prone to mistakes (see the [Reinhart and Rogoff example](http://www.peri.umass.edu/fileadmin/pdf/working_papers/working_papers_301-350/WP322.pdf)). It is also difficult to track changes and to scale your analysis. In addition, auto-formatting (number, date, character, ...) can silently introduce modifications to your data ([One in five genetics papers contains errors thanks to Microsoft Excel](https://www.sciencemag.org/news/2016/08/one-five-genetics-papers-contains-errors-thanks-microsoft-excel)). #### Scripting languages -Compared to other programming languages (such as C), scripting languages are not required to be compiled to be executable. One consequence is that, generally, scripts will execute more slowly than a compiled executable program, because they need an interpreter. However, their more natural language oriented syntax makes scripting easier to learn and use. +Compared to other programming languages (such as `C`, `Fortran`, ...), scripting languages are not required to be compiled to be executable. One consequence is that, generally, scripts will execute more slowly than a compiled executable program, because they need an interpreter.
However, the more natural-language-oriented syntax of scripts makes them easier to learn and use. In addition, numerous libraries are available to streamline scientific analysis. **Don’t start coding without planning!** -It is important to stress that scientists write scripts to help them to investigate scientific question(s). Therefore it is important that scripting does not drive our analysis and thinking. We strongly recommend you take the time to plan the steps you need to accomplish to conduct your analysis. Developing a scientific workflow of your analysis will then allow you to develop a pseudo-code that will help you to narrow down the tasks that need to be accomplished to move forward your analysis. +It is important to stress that scientists write scripts to help them investigate scientific question(s). Therefore scripting should not drive our analysis and thinking. We strongly recommend you take the time to plan ahead all the steps you need to conduct your analysis. Developing such a scientific workflow will help you narrow down the tasks that are needed to move your analysis forward. **Structure of a script** @@ -117,17 +116,17 @@ From the top to the bottom of your script: This will allow you to inform your collaborators (but also your future self!) about the tasks your script accomplishes * Use variables and constants instead of repeating values in different places of the code. This will let you update those values more easily * Choose descriptive names for your variables and functions, not generic ones. If you store a list of files, do not use `x` for the variable name, use instead `files`. Even better use `input_files` if you are listing the files you are importing. -* Be consistent in terms of style (`input_files`, `inputFiles`, …) used to name variables and functions. Just pick one and stick to it! +* Be consistent in terms of style (`input_files`, `inputFiles`, ...) used to name variables and functions. Just pick one and stick to it!
* `keep it simple, stupid` ([KISS](https://en.wikipedia.org/wiki/KISS_principle)). Do not create overly complicated or nested statements. Break your tasks into several simple lines of code instead of embedding a lot of executions in one (complicated) line. It will save you time while debugging and make your code more readable to others -* Go modular! Break down tasks into small code fragments such as functions. It will make your code reusable for you and others (if well documented :) ). Keep functions simple; they should only implement one or few (related) tasks -* Don’t Repeat Yourself ([DRY](https://en.wikipedia.org/wiki/Don%27t_repeat_yourself)). If you start copy/pasting part of your code changing a few parameters => write a function and call it several times with different parameters. Add flow control such as loops and conditions. It will be easier to debug, change and maintain +* Go modular! Break down tasks into small code fragments such as functions or code chunks. It will make your code reusable for you and others (if well documented). Keep functions simple; they should only implement one or a few (related) tasks +* `Don’t Repeat Yourself` ([DRY](https://en.wikipedia.org/wiki/Don%27t_repeat_yourself)). If you start copy/pasting part of your code changing a few parameters => write a function and call it several times with different parameters. Add flow control such as loops and conditions. It will be easier to debug, change and maintain * Test your code. Test your code against values you would expect or values computed with other software. Try edge cases, such as NA, negative values, … * Iterate with small steps, implementing a few changes at a time to your code. Test, fix, and move forward! -We hope this overview section has raised your interest to learn more about practices and tools for reproducible and collaborative research.
+We hope this overview section has raised your interest in learning more about data science practices and tools for reproducible and collaborative research. The Data Science section of this onboarding material will introduce you to using the `R` programming language to develop reproducible scientific research. -### References +### Further reading Here are a few selected publications to help you to learn more about these topics. diff --git a/materials/sections/virtual-collaboration.Rmd b/materials/sections/virtual-collaboration.Rmd index 6d3693dd..40d8243c 100644 --- a/materials/sections/virtual-collaboration.Rmd +++ b/materials/sections/virtual-collaboration.Rmd @@ -107,9 +107,8 @@ It is hard to remotely recreate in-person experiences, including hallway convers ## Further reading +This chapter builds on [Openscapes](https://openscapes.org) resources, including Cabunoc Mayes et al. 2020: [3 lessons from remote meetings we’re taking back to the office](https://opensource.com/article/20/6/remote-meetings)* - -* [3 lessons from remote meetings we’re taking back to the office](https://opensource.com/article/20/6/remote-meetings) - Cabunoc Mayes et al. 2020 * [Best Practices for Virtual Participation in Meetings: Experiences from Synthesis Centers](https://www.jstor.org/stable/10.2307/bullecosociamer.98.1.57) – Hampton et al. 2017.[ ](https://www.jstor.org/stable/10.2307/bullecosociamer.98.1.57) * [POP everything!
Strategic planning in 30 seconds or less](https://suzannehawkes.com/2010/04/09/pop-everything/) – Hawkes 2010 * [The art of gathering](https://www.priyaparker.com/) - Parker 2018, see also [Unlocking Us podcast episode](https://brenebrown.com/podcast/brene-with-priya-parker-on-the-art-of-gathering/ ) diff --git a/materials/sections/working-inclusively.Rmd b/materials/sections/working-inclusively.Rmd index 5f190f63..ae5cba42 100644 --- a/materials/sections/working-inclusively.Rmd +++ b/materials/sections/working-inclusively.Rmd @@ -4,9 +4,9 @@ We hope NEON postdocs will join us in making a lifelong commitment to helping fo Navigating a career path in science and research can be very challenging for individuals, especially when this is done alone, without a strong support network, and/or as someone from an underrepresented group. We thus recommend that you implement a few simple strategies that can help overcome both normal and extraordinary obstacles that you may encounter as you continue along this path: -- *Consider* ways to recognize bias (including personal bias) in science and the workplace -- *Enroll* in JEDI trainings or workshops (e.g., one provided by ADVANCE GEO) to gain skills and perspectives that are widely relevant to any career path -- *Review collections of resources* about JEDI in academic research (e.g., resources compiled by LTER and NCEAS), with a view towards actions that individuals and their extended communities can take to develop the strengths, skills, and relationships necessary to create fully inclusive learning and work environments. 
Categories of information in the LTER and NCEAS collections that may be especially relevant to NEON postdocs may include: +- *Consider* ways to [recognize](https://serc.carleton.edu/advancegeo/resources/bias.html) bias (including personal bias) in science and the workplace +- *Enroll* in JEDI trainings or workshops (e.g., [one provided by ADVANCE GEO](https://serc.carleton.edu/advancegeo/workshops/index.html)) to gain skills and perspectives that are widely relevant to any career path +- *Review collections of resources* about JEDI in academic research (e.g., resources compiled by LTER and NCEAS), with a view towards actions that individuals and their extended communities can take to develop the strengths, skills, and relationships necessary to create fully inclusive learning and work environments. Categories of information in the [LTER](https://lternet.edu/network-organization/diversity-resources/) and [NCEAS](https://www.nceas.ucsb.edu/diversity-inclusion) collections that may be especially relevant to NEON postdocs include: - LTER - Understanding the problem in academia and ecology - Creating an inclusive environment @@ -21,11 +21,9 @@ Navigating a career path in science and research can be very challenging for ind - Anti-racism resources - Gender equity - Resources for parents -- *Ask* your advisor and/or your institution about its Code of Conduct for the workspace, including related to meetings, conferences, socializing, and more (e.g., see NCEAS’ Code of Conduct for its working group meetings). If one is not yet in place, request it to be produced (or co-produced with your input). -- *Set SMART professional goals for yourself* on 1, 2, and 5 year timescales, and be sure these goals address how justice, equity, diversity and inclusion can and will manifest through your work.
-- *Form a personal board* including at least 3-4 people in your life who you trust and admire, who can provide diverse perspectives and experiences that can enlighten you, and who have demonstrated that they’re willing to provide advice when you need it. This board will prove valuable in many ways, including that you can seek their guidance and feedback related to JEDI issues. +- *Ask* your advisor and/or your institution about its Code of Conduct for the workspace, including related to meetings, conferences, socializing, and more (e.g., see NCEAS’ [Code of Conduct](https://live-ncea-ucsb-edu-v01.pantheonsite.io/sites/default/files/2020-02/NCEAS_Code-of-Conduct_2019.pdf) for its working group meetings). If one is not yet in place, request it to be produced (or co-produced with your input). +- *Set [SMART](https://en.wikipedia.org/wiki/SMART_criteria) professional goals for yourself* on 1, 2, and 5 year timescales, and be sure these goals address how justice, equity, diversity and inclusion can and will manifest through your work. +- *Form a [personal board](https://www.sciencemag.org/careers/2020/03/personal-board-directors-can-help-you-navigate-challenges-your-career)* including at least 3-4 people in your life who you trust and admire, who can provide diverse perspectives and experiences that can enlighten you, and who have demonstrated that they’re willing to provide advice when you need it. This board will prove valuable in many ways, including that you can seek their guidance and feedback related to JEDI issues. - *Find new ways to offer support and mentorship opportunities to others*, including any colleagues who may not yet have a strong network of professional supporters and advisors. In addition to ‘paying it forward,’ practicing mentorship will help you develop leadership skills. -![](images/team.jpg) - -Open access image from https://pxhere.com/en/photo/1434697 +![Open access image from https://pxhere.com/en/photo/1434697](images/team.jpg)