This repository represents work to produce figures for the paper "Revealing Earth Science code and data-use practices using the Throughput Graph Database" (in revision), intended for a GSA publication. The repository contains data obtained from Throughput, and from manual analysis of code repositories indexed within Throughput. All data transformations for each figure are included in separate functions within the figures folder. The file gsa_paper_figures.R loads data and produces figures as scalable vector graphic (SVG) files within the svgplots folder.
This is an open project, and contributions are welcome from anyone. All contributors to this project are bound by a code of conduct. Please review and follow this code of conduct as part of your contribution.
Issues and bug reports are always welcome. Code clean-up and feature additions can be submitted as pull requests from either project forks or branches.
All products of the Throughput Annotation Project are licensed under an MIT License unless otherwise noted.
To generate figures, clone the repository locally and run the script in gsa_paper_figures.R. The script uses the pacman package to manage package loading and installation, so you may need to install pacman before running it. You can do this from the R console:
install.packages("pacman")
Once you've done this you can either open the gsa_paper_figures.R file and run the code, or you can call source("gsa_paper_figures.R") from the console.
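The install-then-run steps above can be combined into one call. The following is a minimal sketch, not part of the repository: the helper name run_figures is hypothetical, and it assumes the R working directory is the root of the cloned repository.

```r
# Hypothetical helper (not part of this repository): installs pacman only
# if it is missing, then runs the figure script.
run_figures <- function(script = "gsa_paper_figures.R") {
  # Fail early with a clear message if the script is not where we expect it.
  # Assumption: the working directory is the root of the cloned repository.
  if (!file.exists(script)) {
    stop("Cannot find '", script, "'. Is the working directory the repository root?")
  }

  # requireNamespace() returns FALSE (rather than erroring) when the
  # package is not installed, so pacman is installed only when needed.
  if (!requireNamespace("pacman", quietly = TRUE)) {
    install.packages("pacman")
  }

  # Run the script that loads the data and writes the SVG figures.
  source(script)
}
```

Calling run_figures() from the R console should then behave like the manual steps. In a non-interactive session install.packages() may need a CRAN mirror to be set beforehand.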
Once the script has finished running, you will see three figures in the svgplots folder that can be viewed or edited using image editing tools such as Inkscape.
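To confirm that the script produced the expected output, the contents of the svgplots folder can be checked from the R console. This is a small sketch with hypothetical helper names (list_svg_outputs, has_expected_figures). It assumes the figures are written with an .svg file extension and that the working directory is the repository root.

```r
# Hypothetical helpers (not part of this repository) for checking output.

# List every SVG file in the output folder.
# Assumption: figures are saved with an ".svg" extension.
list_svg_outputs <- function(dir = "svgplots") {
  list.files(dir, pattern = "\\.svg$", full.names = TRUE, ignore.case = TRUE)
}

# TRUE when the folder holds the expected number of SVG figures
# (three, according to this README).
has_expected_figures <- function(dir = "svgplots", expected = 3) {
  length(list_svg_outputs(dir)) == expected
}
```

After running the script, has_expected_figures() should return TRUE if all three figures were written. list_svg_outputs() shows which files are present.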
This project was developed using R with elements by MW, ML, SD and SG. SG compiled all code elements into a single repository structure and applied a common coding style to the files. It was executed using R v4.1.2 on Ubuntu 20.04.3 LTS.
Data required for this project was obtained from the Throughput Annotation Database using the Throughput API. Additional data was produced by manually assigning data use typologies to a subset of the total set of code repositories associated with Geology and Paleontology data archives within the Registry of Research Data Repositories (https://re3data.org).
This project generates three SVG figures that are intended to reproduce the figures in the paper itself.
- Does the code run independent of the creator's home computer?
- Are the figures generated similar to those within the paper?