Commit
feat: don't store duplicate events in the notice queue (#1372)
When a new event is emitted, check if there is already an exact duplicate (the same observer, same event type, and same event snapshot) in the storage from a deferral in a previous run. If a duplicate does exist, then don't store a new notice and snapshot.

There are performance implications:

* For every event emitted, there's a full iteration through the notices queue. If there are no deferred events, then the notices queue will be empty, so in the majority of cases this should only be one additional storage query (to get the empty result). If the deferral queue is very large, then this may be noticeable, although the point of the change is to reduce the length of the queue, because there are already issues when it's large.
* For each notice in the queue (again: normally none) the snapshot is loaded (a storage query; currently all snapshots are quite small in size) for any events that have matching observers and kinds. If the queue had a lot of events of the same type, it's likely the observer would match, and if the snapshots were different (for example, many `secret-changed` events but for different secrets) then there is a cost to doing the comparison.
* In cases where the queue currently builds up with a lot of duplicates, there will be a significant performance improvement, because only one of the (notice+snapshot) events will be processed each run.

There is also a (deliberate) behaviour change that does impact event ordering. For example, consider this sequence of events:

1. `config-changed` deferred
2. `secret-changed` deferred
3. `config-changed`

Currently, this would result in:

1. `ConfigChangedEvent`
2. `ConfigChangedEvent`, `SecretChangedEvent`
3. `ConfigChangedEvent`, `SecretChangedEvent`, `ConfigChangedEvent`

With this change, this would result in:

1. `ConfigChangedEvent`
2. `ConfigChangedEvent`, `SecretChangedEvent`
3. `ConfigChangedEvent`, `SecretChangedEvent`

More generally, there could currently be duplicate (notice+snapshot) events intermixed throughout the queue, and each run they will be re-emitted in the order in which they originally occurred. With this change, they will be re-emitted in the order in which they originally occurred, *except if they have already emitted this run*. The particularly noticeable change is that the Juju event that triggered the run may not be the last event (if it was a duplicate of one in the queue).

We could potentially do this differently - for example, updating the sequence so that when a duplicate occurs it moves the event to the end of the queue (by dropping and adding the notice+snapshot, or explicitly setting the `sequence` field for SQL and by just reordering the list for Juju). This would add complexity and have a performance penalty, however, and it seems more correct to have the original order.

For unit tests:

* Harness: this change is incompatible with some Harness use - specifically, if the test code emits the same event more than once, where it's deferred at least once, there is no `reemit()` call like there would be in production, and the test expects the handler to be called more than once. For this reason, the skipping is disabled for Harness.
* Scenario: Scenario is more explicit with deferred events, so if you want to have had the 'skipping' behaviour occur before the event you are `ctx.run`ing then you need to manage that in the list of deferred events passed into the State. We could add a consistency check to alert if there are duplicates in that list (this would be easier to do when the Scenario code is in this repo). However, the Scenario behaviour does still change: if the event is deferred in the `ctx.run` and there's already a match in the state's deferred list, then the new (deferred) event does not get added, and the output state doesn't change (which is what we want).
  We get this behaviour automatically because Scenario mimics the runtime behaviour more closely, actually running the framework emitting/reemitting. So: Scenario tests are both safer, and can be used to match the new behaviour.

This can be tested manually with a charm that optionally defers events, e.g. with the code below.

<details>
<summary>charm.py and charmcraft.yaml content</summary>

```python
import logging

import ops

logger = logging.getLogger(__name__)


class NoticeQueueCharm(ops.CharmBase):
    def __init__(self, framework: ops.Framework):
        super().__init__(framework)
        framework.observe(self.on.config_changed, self._on_config_changed)
        framework.observe(self.on.secret_changed, self._on_secret_changed)

    def _on_config_changed(self, event):
        logger.info("Running config-changed")
        if self.config.get("secretopt"):
            secret = self.model.get_secret(id=self.config["secretopt"])
            # Get the content so we are 'subscribed' to the updates.
            secret.get_content()
        if self.config.get("secretopt2"):
            secret = self.model.get_secret(id=self.config["secretopt2"])
            # Get the content so we are 'subscribed' to the updates.
            secret.get_content()
        # Defer whenever the `opt` config value starts with "defer".
        if self.config.get("opt", "").startswith("defer"):
            event.defer()

    def _on_secret_changed(self, event):
        logger.info("Running secret-changed")
        if self.config.get("opt", "").startswith("defer"):
            event.defer()


if __name__ == "__main__":
    ops.main(NoticeQueueCharm)
```

```yaml
config:
  options:
    opt:
      description: dummy option to trigger config-changed
    secretopt:
      type: secret
      description: a user secret
    secretopt2:
      type: secret
      description: a user secret
```

If you want to see the queue while you're doing this, you can use code like this:

```python
# Place inside an event handler; this reads the framework's private storage.
store = self.framework._storage
for event_path, observer_path, method_name in store.notices(None):
    handle = ops.Handle.from_path(event_path)
    snapshot_data = store.load_snapshot(handle.path)
    logger.info(
        "event_path: %s, observer_path: %s, method_name: %s, snapshot data: %r",
        event_path,
        observer_path,
        method_name,
        snapshot_data,
    )
```

</details>

If you set `opt` to anything not starting with "defer" then you should get a `config-changed` event every time. If you set it to something starting with "defer", then the handler will run exactly once each time you set the config (remember to change the value, or Juju will skip the event). With ops@main, each config change instead produces one `config-changed` event for every time you have changed the config so far (i.e. the queue will build up).

You can also check it with an event that has a snapshot, such as `secret-changed`. If the config is set to defer, every time you change the content of the first secret, you'll get one `secret-changed` event (but with ops@main, each time you'll get multiple, depending on how many times you've done it). If you also change the second secret, you'll get two `secret-changed` events, one for each secret (because the snapshots differ).

You can intermix the different events and should always have exactly zero or one (event type+snapshot) in the queue. If you change the `opt` value back to something not starting with "defer", then you should see all the events complete and have an empty queue.
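As a rough illustration of the zero-or-one rule and the ordering change described above, here is a toy in-memory model of the queue. It is a sketch, not the ops implementation: `ToyNoticeQueue`, its `run` method, and the single `defer` flag that applies to every handler in a run are all hypothetical simplifications, and real notices are stored by event path rather than as tuples.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNoticeQueue:
    """Hypothetical stand-in for the notice queue (not part of ops)."""

    # Each notice is (observer, event kind, snapshot), oldest first.
    notices: list = field(default_factory=list)

    def run(self, observer: str, kind: str, snapshot: dict, defer: bool) -> list:
        """Simulate one run and return the event kinds handled, in order.

        `defer` models every handler deferring (True) or completing (False).
        """
        handled = []
        remaining = []
        # First re-emit everything deferred by earlier runs, oldest first.
        for notice in self.notices:
            handled.append(notice[1])
            if defer:
                remaining.append(notice)
        self.notices = remaining

        # Then emit the new event, unless an exact duplicate (same observer,
        # kind, and snapshot) is still waiting in the queue.
        new = (observer, kind, snapshot)
        if new not in self.notices:
            handled.append(kind)
            if defer:
                self.notices.append(new)
        return handled


if __name__ == "__main__":
    queue = ToyNoticeQueue()
    print(queue.run("charm", "config-changed", {}, defer=True))
    print(queue.run("charm", "secret-changed", {"id": "secret:1"}, defer=True))
    # The third event duplicates the first, so it is not stored or run again.
    print(queue.run("charm", "config-changed", {}, defer=True))
    print(len(queue.notices))
```

In this model the third run handles `config-changed` then `secret-changed` and leaves two notices queued, matching the "with this change" sequence in the ordering example. A `secret-changed` for a different secret has a different snapshot, so it is queued separately.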
<details>
<summary>Scenario test that shows the behaviour</summary>

```python
import ops
from ops import testing


class MyCharm(ops.CharmBase):
    def __init__(self, framework):
        super().__init__(framework)
        framework.observe(self.on.secret_changed, self._on_sc)
        framework.observe(self.on.update_status, self._on_us)

    def _on_us(self, event):
        print("update-status", event)

    def _on_sc(self, event):
        print("secret-changed", event)
        event.defer()


ctx = testing.Context(MyCharm, meta={"name": "foo"})
secret = testing.Secret({"foo": "bar"})
# Two events already deferred by earlier runs.
devent1 = ctx.on.update_status().deferred(MyCharm._on_us)
devent2 = ctx.on.secret_changed(secret).deferred(MyCharm._on_sc)
state_in = testing.State(secrets={secret}, deferred=[devent1, devent2])
# The new secret-changed event matches devent2.
state_out = ctx.run(ctx.on.secret_changed(secret), state_in)
assert state_out.unit_status == testing.UnknownStatus()
print(state_out.deferred)
```

Note that this requires fixing a small bug in Secret snapshotting ([PR](tonyandrewmeyer/ops-scenario#7)). That's unrelated to this change - it also occurs if you use deferred secrets in main.

</details>

Fixes #935

---------

Co-authored-by: Dima Tisnek <dima.tisnek@canonical.com>
Co-authored-by: Ben Hoyt <benhoyt@gmail.com>
Co-authored-by: Dima Tisnek <dimaqq@gmail.com>