Skip to content

Commit

Permalink
Merge pull request #40 from andersy005/add-feedstock-validation
Browse files Browse the repository at this point in the history
CI: add feedstock validation
  • Loading branch information
jbusecke authored May 8, 2024
2 parents dda5675 + e446eb3 commit ab2f76c
Show file tree
Hide file tree
Showing 2 changed files with 28 additions and 1 deletion.
27 changes: 27 additions & 0 deletions .github/workflows/validate-catalog.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
name: Catalog

on:
pull_request:
branches:
push:
branches:

concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true

jobs:
validate:
runs-on: ubuntu-latest
defaults:
run:
shell: bash -l {0}
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v4
with:
python-version: '3.10'
- name: validate feedstock entry
uses: leap-stc/data-catalog-actions/leap-catalog@main
with:
single-feedstock: "./feedstock/catalog.yaml"
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -79,7 +79,7 @@ To proceed with this step you will need assistance a memeber of the [LEAP Data a
- To deploy a recipe to Google Dataflow you have to trigger the "Deploy Recipes to Google Dataflow" with a single `recipe_id` as input.

- Once your recipe is run from a github workflow we assume that it is deployed to Google Dataflow and activate the final [copy stage](https://github.com/leap-stc/LEAP_template_feedstock/blob/55ee23ce0bc90f764d18bc34c58adccb5b38fc89/feedstock/recipe.py#L63). This happens automatically, but you have to make sure to edit the `'feedstock/catalog.yaml'` `url` entries for each `recipe_id`. This location will be the 'final' location of the data, and this is what gets passed to the the catalog in the next step!
- Once your recipe is run from a github workflow we assume that it is deployed to Google Dataflow and activate the final [copy stage](https://github.com/leap-stc/LEAP_template_feedstock/blob/55ee23ce0bc90f764d18bc34c58adccb5b38fc89/feedstock/recipe.py#L63). This happens automatically, but you have to make sure to edit the `'feedstock/catalog.yaml'` `url` entries for each `recipe_id`. This location will be the 'final' location of the data, and this is what gets passed to the the catalog in the next step!

>[!NOTE]
>By default the `'prune'` option is set to true. To build the final dataset you need to change that value [here](https://github.com/leap-stc/LEAP_template_feedstock/blob/55ee23ce0bc90f764d18bc34c58adccb5b38fc89/configs/config_dataflow.py#L7). **Particularly for large datasets make sure that you have finalized the entries in `'feedstock/catalog.yaml'`**, since the full build of the dataset can be slow and expensive - you want to avoid doing that again 😁
Expand Down

0 comments on commit ab2f76c

Please sign in to comment.