Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New Source Freshness test -- how to exclude static tables that are sourced from a database #490

Closed
1 of 6 tasks
tuday2 opened this issue Sep 9, 2024 · 8 comments
Closed
1 of 6 tasks
Labels
bug Something isn't working triage

Comments

@tuday2
Copy link

tuday2 commented Sep 9, 2024

Describe the bug

I recently updated to 14.x and while I use source freshness for most of my tables, I have a table that is sourced from a database that has static data. How do I exclude this table from the source freshness test? I've tried to add it to my dbt_project_evaluator_exceptions seed however it doesn't seem to work.

Steps to reproduce

N/A

Expected results

N/A

Actual results

N/A

Screenshots and log output

N/A

System information

  - package: dbt-labs/dbt_utils
    version: 1.3.0
  - package: dbt-labs/dbt_project_evaluator
    version: 0.14.1

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • trino/starburst
  • other (specify: ____________)

The output of dbt --version:

Versionless (Cloud)
@tuday2 tuday2 added bug Something isn't working triage labels Sep 9, 2024
@b-per
Copy link
Collaborator

b-per commented Sep 9, 2024

Hi!

Using the dbt_project_evaluator_exceptions seed should work like for other tests.

  • could you confirm that the seed has been loaded before running the source freshness test?
  • if so, could you please share
    1. the output of your fct_sources_without_freshness table after the test failure?
    2. the seed configuration for that test

@tuday2
Copy link
Author

tuday2 commented Sep 9, 2024

I am using the seed for 3 other tests that are working correctly so I am certain its loading in the correct order:

Seed:

fct_name,column_name,id_to_exclude,comment
fct_root_models,child,dim_date,Date dimensions generated with dbt_utils date_spine
fct_root_models,child,dim_date_iso,Date dimensions generated with dbt_utils date_spine
fct_multiple_sources_joined,child,stg_%_unioned,Models called _unioned can union multiple sources
fct_sources_without_freshness,child,stg_XXX_historical_calls,Exclude static table that has legacy calls for XXX

(I also tried using the source name "legacy.XXX_historical_calls" in place of the staging table name and that gave the same error)

Output from FCT_SOURCES_WITHOUT_FRESHNESS table (1 row):
legacy.XXX_HISTORICAL_CALLS

@b-per
Copy link
Collaborator

b-per commented Sep 9, 2024

What about

fct_name,column_name,id_to_exclude,comment
...
fct_sources_without_freshness,resource_name,legacy.XXX_HISTORICAL_CALLS,Exclude static table that has legacy calls for XXX

The column name in the fct is not child for this table, it is resource_name

@tuday2
Copy link
Author

tuday2 commented Sep 9, 2024

That did it! thank you.

I had to use the exact case -- which normally I don't since I'm using Snowflake, however this is a reference to a source so that makes sense.

Is "resource_name" in the documentation somewhere? All the examples show "child" -- it may help to add a few more examples in the docs.

fct_sources_without_freshness,resource_name,legacy.XXX_HISTORICAL_CALLS,Exclude static table that has legacy calls for XXX

@b-per
Copy link
Collaborator

b-per commented Sep 9, 2024

resource_name itself is not documented as it would depend on each table, but we have this in the docs

column_name: the column name from fct_name we will be looking at to define exceptions

I might add another example in the docs where the column is not called child

@b-per b-per closed this as completed Sep 9, 2024
@tuday2
Copy link
Author

tuday2 commented Sep 9, 2024

What do you mean it would depend on each table? i literally put "resource_name" in there

@b-per
Copy link
Collaborator

b-per commented Sep 9, 2024

I mean that to add exceptions for the table fct_sources_without_freshness, there is only one filter, resource_name.

But any column returned from any of the fct_ tables can be used to set exceptions.

Taking fct_exposures_dependent_on_private_models for example, we return the columns exposure_name, parent_resource_name, parent_access and parent_resource_type

Any of those columns can be used in the seed if we wanted to exclude specific rows.

Does it make sense?

@tuday2
Copy link
Author

tuday2 commented Sep 9, 2024

Wow ok -- I never made the correlation between that and the name of the column in the evaluator table -- all makes sense now.

I honestly thought child was some default for a fact table --- just never clicked. I see now the name of the column in the evaluator tables is what you reference.

All good -- thanks again for the quick help!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working triage
Projects
None yet
Development

No branches or pull requests

2 participants