
(encoding of confusion: unreferenced state group deletion either doesn't make sense, or can explode size of state_groups_state) #18219


Draft
wants to merge 2 commits into base: develop


reivilibre
Contributor

This is intended as a demonstrative example for #18217 and is not a PR to merge.


The docstring

Deletes no longer referenced state groups and de-deltas any state
groups that reference them.

does not make sense; or rather, it's contradictory: if a state group is no longer referenced, then no state groups reference it, and so there should be no de-delta-ing to be done.

BUT if we edit it slightly, maybe the docstring starts to make sense?

Deletes state groups that are no longer referenced by events and de-deltas any state
groups that reference them.

Frankly, I don't actually know if this is the intended meaning, but it sounds plausible.

If we go by that assumption, then we can write a test to demonstrate a situation in which state group deletion causes more rows to be added to state_groups_state than deleted.
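To make the arithmetic concrete, here is a toy in-memory model (not Synapse's actual storage code; `ToyStateStore`, `demo` and the event IDs are all made up for illustration). It assumes each state group stores either its full state or a delta against a previous group, and that deleting a group de-deltas every group that pointed at it. One large unreferenced-by-events parent with a handful of small children is enough to make the row count go up:

```python
from typing import Dict, Optional, Tuple

StateKey = Tuple[str, str]   # (event type, state key)
State = Dict[StateKey, str]  # maps a state key to an event ID


class ToyStateStore:
    """Hypothetical stand-in for state_groups / state_group_edges / state_groups_state."""

    def __init__(self) -> None:
        self.prev: Dict[int, Optional[int]] = {}  # state group -> previous group
        self.delta: Dict[int, State] = {}         # state group -> stored rows

    def add(self, sg: int, prev: Optional[int], delta: State) -> None:
        self.prev[sg] = prev
        self.delta[sg] = dict(delta)

    def resolve(self, sg: int) -> State:
        """Walk the delta chain back to its root and rebuild the full state."""
        chain = []
        cur: Optional[int] = sg
        while cur is not None:
            chain.append(cur)
            cur = self.prev[cur]
        state: State = {}
        for group in reversed(chain):
            state.update(self.delta[group])
        return state

    def row_count(self) -> int:
        """Number of rows the toy state_groups_state table would hold."""
        return sum(len(rows) for rows in self.delta.values())

    def delete(self, sg: int) -> None:
        """Delete a group, de-delta-ing every group that references it."""
        children = [g for g, p in self.prev.items() if p == sg]
        for child in children:
            full = self.resolve(child)  # resolve before touching the edge
            self.delta[child] = full
            self.prev[child] = None
        del self.prev[sg]
        del self.delta[sg]


def demo(base_size: int = 100, children: int = 5) -> Tuple[int, int]:
    store = ToyStateStore()
    # Group 1: full state, not referenced by any event.
    store.add(1, None, {
        ("m.room.member", f"@u{i}:example.org"): f"$member{i}"
        for i in range(base_size)
    })
    # Groups 2..: one-row deltas on top of group 1, referenced by events.
    for c in range(children):
        store.add(2 + c, 1, {("m.room.topic", ""): f"$topic{c}"})
    before = store.row_count()
    store.delete(1)
    return before, store.row_count()
```

With the defaults, `demo()` goes from 105 rows to 505: deleting the 100-row parent forces each of the five children to carry its own 101-row copy of the state.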

@erikjohnston
Member

Ah, I think the thinko here is that it assumes that all state groups before it have already been deleted. That would make sense given that we generally purge all history before a certain time; however, it doesn't take into account the fact that we naturally have unreferenced-by-events state groups between two referenced state groups.

Two options: don't delete unreferenced-by-events state groups that either

  1. are referenced by other state groups; or
  2. reference other state groups.
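
As a rough illustration of how the two options differ, the sketch below filters candidate groups under each rule. It is only a toy: `deletable`, the `prev` mapping (state group to the group it deltas against) and `referenced_by_events` are hypothetical names, not anything from the Synapse codebase.

```python
from typing import Dict, Optional, Set


def deletable(
    prev: Dict[int, Optional[int]],
    referenced_by_events: Set[int],
    option: int,
) -> Set[int]:
    """Return the unreferenced-by-events groups that may be deleted.

    option 1: keep groups that are referenced by other state groups.
    option 2: keep groups that themselves reference another state group.
    """
    unreferenced = set(prev) - referenced_by_events
    if option == 1:
        has_children = {p for p in prev.values() if p is not None}
        return unreferenced - has_children
    if option == 2:
        return {g for g in unreferenced if prev[g] is None}
    raise ValueError("option must be 1 or 2")


# Chain 1 <- 2 <- 3 where only 3 is referenced by an event, plus a lone group 4.
example_prev: Dict[int, Optional[int]] = {1: None, 2: 1, 3: 2, 4: None}
example_referenced = {3}
```

On this example, option 1 only allows deleting group 4, while option 2 allows deleting groups 1 and 4 (deleting group 1 would still require de-delta-ing group 2).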

The motivating reason why we delete unreferenced-by-events state groups is the case where you have long chains of state groups S_1, S_2, …, S_n where all but S_n are unreferenced-by-events. The number of state events in these chains can get large (especially when using the state compressor), so deleting everything and putting the full state in S_n can remove a lot of rows.
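
A back-of-the-envelope sketch of that chain case, under the simplifying assumption that S_1 holds the full state and every later group is a small delta (the helper names and event IDs are invented, and this ignores how the state compressor actually lays out chains):

```python
from typing import Dict, List, Tuple

StateKey = Tuple[str, str]   # (event type, state key)
Delta = Dict[StateKey, str]  # rows stored for one state group


def build_chain(n: int, members: int = 50) -> List[Delta]:
    """S_1 holds `members` rows; S_2..S_n each change the room topic."""
    first = {
        ("m.room.member", f"@u{i}:example.org"): f"$member{i}"
        for i in range(members)
    }
    rest = [{("m.room.topic", ""): f"$topic{i}"} for i in range(2, n + 1)]
    return [first] + rest


def chain_rows(chain: List[Delta]) -> int:
    """Rows stored while every group in the chain is kept."""
    return sum(len(delta) for delta in chain)


def collapse(chain: List[Delta]) -> Delta:
    """Full state of S_n, i.e. what remains if S_1..S_(n-1) are deleted."""
    full: Delta = {}
    for delta in chain:
        full.update(delta)
    return full
```

For a chain of 1000 groups this gives 1049 stored rows before deletion and 51 after putting the full state in S_n, which is the saving the deletion was designed for.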
