-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add new tables #27
Add new tables #27
Conversation
b223e53
to
b4d29b3
Compare
Makefile
Outdated
@@ -25,7 +25,8 @@ lint-bandit: ## Run bandit | |||
@echo "\n${BLUE}Running bandit...${NC}\n" | |||
@${POETRY_RUN} bandit -r ${PROJ} | |||
|
|||
lint-base: lint-flake8 lint-bandit ## Just run the linters without autolinting | |||
#lint-base: lint-flake8 lint-bandit ## Just run the linters without autolinting | |||
lint-base: lint-flake8 # TODO: Can we drop bandit? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@edmondop is there a compelling reason to include bandit in this project? I found it was giving irrelevant warnings and since this project doesn't deal in any non-public data, it's unclear to me why we care that much about security.
There are some test failures. I'll be on vacation for a week, so won't get around to finishing this until around December 20. |
@wjones127 - alright cool, have a nice vacation!! Let's finish this up and get it merged when you're back! |
@@ -35,7 +40,7 @@ def save_expected(case: TestCaseInfo, as_latest=False) -> None: | |||
# Need to ensure directory exists first | |||
os.makedirs(case.expected_root(version)) | |||
|
|||
df.toPandas().to_parquet(case.expected_path(version)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It turned out toPandas().to_parquet()
causes weird things to happen to timestamps, so better to stick to Spark here. Without this, I was able to eliminate the pandas dependency as well.
def create_nested_types(case: TestCaseInfo, spark: SparkSession): | ||
schema = types.StructType([ | ||
types.StructField( | ||
'pk', types.IntegerType() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
chispa doesn't support ignoring sort order when comparing tables that contain map types, so we have to add a pk
column that tests can sort on. I've added a note about this to the readme.
Description
Will rebase after #25 is merged.How was this patch tested?
Does this require an update to the documentation?