Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip backfill for empty table #642

Closed
wants to merge 4 commits into from
Closed

Conversation

agedemenli
Copy link
Contributor

@agedemenli agedemenli commented Jan 30, 2025

No need to backfill an empty table. Backfill fails for tables without identity columns, but it shouldn't be necessary for empty tables. This causes meaningless errors like in the example given in the issue description.

Fixes: #636

@agedemenli agedemenli force-pushed the skip-backfill-empty-table branch from b4e737d to 4a2ac67 Compare January 31, 2025 11:49
@agedemenli agedemenli marked this pull request as ready for review January 31, 2025 14:27
@agedemenli agedemenli force-pushed the skip-backfill-empty-table branch from 43a2aac to efc03c8 Compare January 31, 2025 15:53
Copy link
Collaborator

@andrew-farries andrew-farries left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks good but I wonder if the ability to skip backfills on empty tables is worth losing the validation check for though?

Optimising for the empty-table case might not be the way to go; when a migration like this is applied to production the table typically won't be empty; and even if it is the same change can be made using a raw SQL migration to avoid the backfill anyway.

We risk allowing writing migrations that will work locally in a dev environment because there is no data in some tables, but then those same migrations might fail against production, without any validation to catch the problem.

Do you have a view here @exekias ?

// The table's column count reflects the drop of the new column
TableMustHaveColumnCount(t, db, schema, "users", 1)

// Insert a row to make sure we get a not possible error because there's no identity column
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Insert a row to make sure we get a not possible error because there's no identity column
// Insert a row to make sure we get a `backfill.NotPossible` error when the migration is restarted, because there's no identity column

// The table's column count reflects the drop of the new column
TableMustHaveColumnCount(t, db, schema, "users", 1)

// Insert a row to make sure we get a not possible error because there's no identity column
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Insert a row to make sure we get a not possible error because there's no identity column
// Insert a row to make sure we get a `backfill.NotPossible` error when the migration is restarted, because there's no identity column

@@ -150,7 +150,25 @@ func TestAddColumn(t *testing.T) {
},
},
},
wantStartErr: backfill.NotPossibleError{Table: "users"},
afterRollback: func(t *testing.T, db *sql.DB, schema string) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the test should have an afterStart part too - even if it's very simple.

},
},
},
afterRollback: func(t *testing.T, db *sql.DB, schema string) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the test should have an afterStart part too - even if it's very simple.

@agedemenli
Copy link
Contributor Author

@andrew-farries Agree on your concerns. I have tried to add the same empty table check to the validate phase, however, it doesn't look good in terms of readability and structure.

I'm ok with both merging this PR as is, or closing it.

@exekias
Copy link
Member

exekias commented Feb 3, 2025

Yeah, it seems risky to count on the number of rows calculation as it is. At the same time, from the bug description, I wonder if we can make assumptions on the table data to state whether a migration is valid. The same migration history could be replayed to a different database, and because of timing, fail this time because the table has rows. In general, we should be able to raise this validation error before touching the database or knowing about its state.

I would be inclined to leave this as a known issue (close the PR) rather than going for a fix that could harm the general use case, WDYT?

@andrew-farries
Copy link
Collaborator

I think we have consensus that this PR can be closed @agedemenli

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

No backfill needed for empty table
3 participants