docs: 📝 add start of the creating resources guide #810
base: main
…prout into docs/guide-for-managing-resources
…prout into docs/guide-for-managing-resources
…m/seedcase-project/seedcase-sprout into docs/guide-for-managing-resources
…m/seedcase-project/seedcase-sprout into docs/guide-for-managing-resources
…prout into docs/guide-for-managing-resources
Nice 👍 Just some suggestions.
(bc I made the PR, I can't request changes or approve, only comment - so to show that I actually request changes, I have manually moved this PR to in progress on the board)
Co-authored-by: Signe Kirk Brødbæk <40836345+signekb@users.noreply.github.com>
…prout into docs/guide-for-managing-resources
Co-authored-by: Signe Kirk Brødbæk <40836345+signekb@users.noreply.github.com>
…m/seedcase-project/seedcase-sprout into docs/guide-for-managing-resources
@lwjohnst86 Is this ready for review or are there more changes on the way?
@signekb yup!
I’ll start reviewing this now 🚀
Very nice seeing this coming together. I have some suggestions and thoughts :)
(since I created this PR, I can only “Comment”)
docs/guide/resources.qmd
Outdated
In this case, we don't want to add anything else, so we'll write the
text to the `README.md` file:
Couldn’t this be a nice place to show how this would be done, actually?
I'm not sure what you mean. Could you expand?
…1099)

## Description

To fit #810 and the current [naming scheme](https://sprout.seedcase-project.org/docs/design/architecture/naming#actions).

This PR needs a quick review.

## Checklist

- [X] Added or updated tests
- [X] Ran `just run-all`

## Description

This PR adds a `path_readme()` function, which was referred to in a TODO item in #810. This function will probably have to be rewritten a bit when we fully switch to the “local-first” approach, but I thought it made sense to add it now, so we don’t forget about it.

This PR needs a quick review.

## Checklist

- [X] Added or updated tests
- [X] Ran `just run-all`

---------

Co-authored-by: Luke W. Johnston <lwjohnst86@users.noreply.github.com>
Co-authored-by: Signe Kirk Brødbæk <40836345+signekb@users.noreply.github.com>
…prout into docs/guide-for-managing-resources
@signekb annoyingly, I wasn't notified of your comments (or maybe I missed them, not sure). I can't re-assign you to review, just letting you know I've updated things!
@lwjohnst86 Nice! Just some more comments (i.e., requested changes) :)
1. Create the properties for the resource, using the original raw data
as a starting point and edit as needed.
2. Create a folder to store the (processed) data resource in your
package, as well as having a folder for the (tidy) raw data.
- package, as well as having a folder for the (tidy) raw data.
+ package, as well as having a folder for the (tidy) batch data.
Remove mentions of “raw data” within a resource.
@@ -22,6 +22,22 @@ that you have a record of the steps taken to clean and transform the
data.
:::

Putting your raw data into a data package makes it easier for yourself
and others to use later one. So the steps you'll take to get this raw
- and others to use later one. So the steps you'll take to get this raw
+ and others to use later one. So the steps you'll take to get your
Removing mentions of “raw data” within a resource.
Let's start with extracting the resource properties from the raw data.
While this function tries to infer the data types in the raw data, it might not get it right. So, be sure to check the properties after using this function. It can also not infer things that are not in the data itself, like a description of what the data contains or the unit of the data.
You'll start by creating the resource's properties. Before you can have
data stored in a data package, it needs metadata (called properties) on
- data stored in a data package, it needs metadata (called properties) on
+ data stored in a data package, it needs properties (i.e., metadata) on
I think it makes more sense to have them this way around so we consistently refer to properties and not metadata.
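On the inference caveat in the diff context above, a small standard-library sketch may help show why inferred types need checking. This is not Sprout's implementation; `infer_type()`, `infer_fields()`, and the sample columns are hypothetical and only illustrate the idea:

```python
import csv
import io


def infer_type(values):
    """Guess a field type from string values: integer, number, or string."""
    non_empty = [v for v in values if v]
    if not non_empty:
        return "string"
    # Try the strictest type first, then fall back to looser ones.
    for type_name, cast in (("integer", int), ("number", float)):
        try:
            for v in non_empty:
                cast(v)
        except ValueError:
            continue
        return type_name
    return "string"


def infer_fields(csv_text):
    """Return a list of {"name", "type"} guesses, one per CSV column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    return [
        {"name": name, "type": infer_type([row[name] for row in rows])}
        for name in (reader.fieldnames or [])
    ]


# Hypothetical data: `postcode` has leading zeros, so it should be a string.
sample = "id,postcode,weight_kg,site\n1,01234,70.5,A\n2,08001,82,B\n"
fields = infer_fields(sample)
```

Here `postcode` is guessed as an integer even though its leading zeros matter, and nothing in the data says that `weight_kg` is measured in kilograms. Both have to be corrected or added by hand after inference.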

We've already create a package (using the steps from the [package
guide](packages.qmd)), with the path set as the variable `package_path`:
We assume you've already create a package (either by using the steps
- We assume you've already create a package (either by using the steps
+ We assume you've already created a package (either by using the steps
guide](packages.qmd)), with the path set as the variable `package_path`:
We assume you've already create a package (either by using the steps
from the [package guide](packages.qmd) or started making one for your
own data), with the path set as the variable `package_path`:
- own data), with the path set as the variable `package_path`:
+ own data), with the path to the data package set as the variable `package_path`:
this package, using the helper `path_resources()` function to give the
correct path to the resources folder. The default behaviour of
`path_resources()` is to use the current working directory, but for this
guide you'll have to use the `path` argument to point to where the
Is this actually “we’ll” here? Bc if they’re following along locally, they should be able to use the cwd, right?
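To make the cwd question concrete, here is a minimal sketch of a path helper that defaults to the current working directory but accepts an explicit `path`. It is not the real `path_resources()`; the name `resources_dir()` and the `resources` folder name are assumptions for illustration only:

```python
from pathlib import Path
from typing import Optional


def resources_dir(path: Optional[Path] = None) -> Path:
    """Return the resources folder of a package.

    Falls back to the current working directory when no `path` is given.
    The folder name "resources" is an assumption, not taken from Sprout.
    """
    base = Path.cwd() if path is None else Path(path)
    return base / "resources"


# Someone following along inside the package folder can rely on the default:
from_cwd = resources_dir()

# A guide rendered from another location has to pass the package path:
from_arg = resources_dir(Path("/tmp/my-package"))
```

With a helper shaped like this, only the rendered guide needs the explicit argument; a reader working inside their package folder could omit it.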
Next step is to set up the resource properties so that it gets checked
and saved into the `datapackage.json` file. You can use the
- Next step is to set up the resource properties so that it gets checked
- and saved into the `datapackage.json` file. You can use the
+ The next step is to add the resource properties to the `datapackage.json` file. Before they are added, they will be checked to confirm that they are in the correct shape and that no required fields are missing. You can use the
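A minimal sketch of the "check, then add" flow this suggestion describes, using only the standard library. It is not Sprout's implementation: `check_resource_properties()`, `add_resource_properties()`, and the list of required fields are assumptions made for illustration:

```python
import json
from pathlib import Path

# Assumption: a resource needs at least a name and a path to its data.
REQUIRED_FIELDS = ("name", "path")


def check_resource_properties(properties: dict) -> list:
    """Return the required fields that are missing or empty."""
    return [field for field in REQUIRED_FIELDS if not properties.get(field)]


def add_resource_properties(package_file: Path, properties: dict) -> dict:
    """Check the properties, then append them to `datapackage.json`."""
    missing = check_resource_properties(properties)
    if missing:
        # Nothing is written when the check fails.
        raise ValueError(f"Missing required fields: {', '.join(missing)}")
    package = json.loads(package_file.read_text(encoding="utf-8"))
    package.setdefault("resources", []).append(properties)
    package_file.write_text(json.dumps(package, indent=2), encoding="utf-8")
    return package
```

The point of the ordering is that a failed check leaves `datapackage.json` untouched, which is what the suggested wording promises to the reader.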
the first one in the package, so we can use `path_resource(1)`.
Next step is to set up the resource properties so that it gets checked
and saved into the `datapackage.json` file. You can use the
`path_properties()` helper function to always give you the correct
- `path_properties()` helper function to always give you the correct
+ `path_properties()` helper function to give you the
I feel like “always” is promising too much — what if they give it the wrong path, for instance.
resource_properties = sp.create_resource_properties(
properties=resource_properties,
path=package_path / sp.path_resource(1)
# TODO: This function needs to be updated to write to data package.
- # TODO: This function needs to be updated to write to data package.
+ # TODO: This function needs to be updated to write to datapackage.json
pprint(sp.read_properties(package_path / sp.path_properties()))
```

## Storing a backup of the raw data
raw -> batch
Description
This adds the guide on creating and managing data resources. It is not complete, as I don't yet know how some things will work from an implementation perspective, but this is a good starting place for us.
Closes #759
This PR needs an in-depth review.