Introducing break the glass as a principle #40
Signed-off-by: Graeme Hay <grmhay@gmail.com>
@grmhay thanks for this PR! To get more discussion on this, you may want to:
My gut response is that perhaps principle 3 could somehow address that agents should be able to pull the manifests from the source WHENEVER NEEDED (not just when a CI job runs, or, as you said, limited to the uptime of your source of truth). We removed the "break glass" glossary item temporarily, because:
What about adding something like "whenever needed" to principle 3?
3. **Pulled Automatically**
- Software agents automatically pull the desired state declarations from the source.
+ Software agents automatically pull the desired state declarations from the source <whenever needed>.
Then link "whenever needed" to a glossary item about source uptime, which could then link to your "Intermediate State Store" item and perhaps some version of the former "break glass" glossary item?
My two cents: I don't think "break glass" is something that should be a principle. This is something that can be a "best practice", an "operating model", or a "white paper". Break glass is too specific for these principles, which are meant to be open ended.
Ah really interesting idea @grmhay! You make some really good points. Couple things
Also cross-linking older discussion open-gitops/project#86
Just revisiting this. @grmhay Would you want to close this and open a "best practice" or "white paper" PR?
Hi
White paper is probably the best label, as I think best practice might be a bit presumptuous of the reader’s situation.
Graeme
On Sat, Feb 19, 2022 at 4:08 PM Christian Hernandez wrote:
> Just revisiting this. @grmhay Would you want to close this and open a "best practice" or "white paper" PR?
Any updates or progress with this?
Asked again in this Slack thread.
## Intermediate State Store
A system for storing a copy of the declarations that are mastered in the State Store. This system is intended to bridge the gap between the availability of the State Store and the availability users expect for making configuration changes to the Software System. The Intermediate State Store offers an availability equal to, or near enough to, what users expect when updating configuration in the Software System.
Where an Intermediate State Store is used, Reconciliation happens twice: first between the State Store and the Intermediate State Store, and then again between the Intermediate State Store and the Software System.
I think this part has gotten (and will become) much more relevant with people adopting OCI also for things other than container images. Or as @monadic put it:
GitOps is a transaction system with a Git backend and OCI cache
Are we moving into leader election / consensus territory?
Where do you see leader election or consensus playing a role here? There is exactly one SSoT state store and N intermediate state stores. In simple setups you'll have exactly one intermediate state store. In more complex scenarios you might have more than one (e.g. one per environment). Synchronization always happens unidirectionally, from the state store into the intermediate state stores. The intermediate state stores are independent of each other and don't need to be synchronized laterally.
                           +======+         +==============+
                     +---->| iss1 |<--------| gitops agent |
+=============+      |     +======+         +==============+
| state store |------+
+=============+      |     +======+         +==============+
                     +---->| iss2 |<--------| gitops agent |
                     |     +======+         +==============+
                     |
                     |     +======+         +==============+
                     +---->| iss3 |<--------| gitops agent |
                           +======+         +==============+
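To make the one-way flow in the diagram concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: `Store`, `sync_one_way`, and `agent_pull` are hypothetical names, and the stores are plain in-memory dictionaries rather than Git or OCI backends.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """Hypothetical in-memory stand-in for a state store or an intermediate state store."""
    name: str
    declarations: dict = field(default_factory=dict)


def sync_one_way(source: Store, intermediates: list) -> None:
    """Copy declarations from the single source of truth into each intermediate store.

    Data only flows source -> intermediate; intermediates never read from each other.
    """
    for iss in intermediates:
        # Each intermediate gets its own copy, so the stores stay independent.
        iss.declarations = dict(source.declarations)


def agent_pull(iss: Store) -> dict:
    """A GitOps agent pulls desired state from its intermediate store only."""
    return dict(iss.declarations)


state_store = Store("state store", {"app.yaml": "replicas: 3"})
intermediates = [Store("iss1"), Store("iss2"), Store("iss3")]
sync_one_way(state_store, intermediates)
desired = [agent_pull(iss) for iss in intermediates]
```

Because nothing is ever written back to `state_store` or sideways between intermediates, no agreement protocol between the intermediates is needed in this sketch.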
IMHO it's important to spell out the requirements the intermediate state store has to satisfy for this whole setup to still satisfy the GitOps principles:
- It must version all the artifacts
- Artifacts, once stored, must be immutable
- It must retain a version history
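One way these three requirements could be met is with an append-only, content-addressed store. The Python sketch below is only an illustration under that assumption: `ImmutableVersionedStore` is a hypothetical class, and using the SHA-256 digest of the content as the version id is my choice here, not something this PR specifies.

```python
import hashlib


class ImmutableVersionedStore:
    """Hypothetical intermediate state store: append-only, content-addressed versions."""

    def __init__(self) -> None:
        self._artifacts = {}  # version id -> content; entries are never overwritten
        self._history = []    # version ids in the order they were first stored

    def put(self, content: bytes) -> str:
        """Store an artifact and return its version id (SHA-256 digest of the content)."""
        version = hashlib.sha256(content).hexdigest()
        if version not in self._artifacts:
            self._artifacts[version] = content
            self._history.append(version)
        return version

    def get(self, version: str) -> bytes:
        """Return the content stored under a version id."""
        return self._artifacts[version]

    def history(self) -> list:
        """Return a copy of the version history, oldest first."""
        return list(self._history)


store = ImmutableVersionedStore()
v1 = store.put(b"replicas: 2")
v2 = store.put(b"replicas: 3")
```

Since the version id is derived from the content, a given id can only ever refer to one artifact, which gives immutability without a separate locking mechanism.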
## Feedback

Open GitOps follows [control-theory](https://en.wikipedia.org/wiki/Control_theory) and operates in a closed loop. In control theory, feedback represents how previous attempts to apply a desired state have affected the actual state. For example, if the desired state requires more resources than exist in a system, the software agent may make attempts to add resources, to automatically roll back to a previous version, or to send alerts to human operators.

## Break the Glass
The process of editing the Intermediate State Store directly in the event that a configuration update needs to be made to the Software System but the State Store is unavailable.
I'd suppose an example of why the state store is not available might be helpful to understand the potential situation better. Also a note that this should really only be a very rare exception and that proper authorization must be in place (e.g. multi-party authorization).
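As a rough illustration of what a multi-party authorization check might look like, here is a small Python sketch. The function name `authorize_break_glass` and the default threshold of two independent approvers are assumptions made for this example only.

```python
def authorize_break_glass(requester: str, approvers: set, required: int = 2) -> bool:
    """Return True only if enough distinct people other than the requester approved."""
    # The requester's own approval does not count toward the threshold.
    independent = set(approvers) - {requester}
    return len(independent) >= required


self_approved = authorize_break_glass("alice", {"alice", "bob"})
peer_approved = authorize_break_glass("alice", {"bob", "carol"})
```

Here `self_approved` is `False` because only one approver is independent of the requester, while `peer_approved` is `True`.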
Can the state store be anything? S3? Cassandra?
There has been recent - and excellent! - discussion in the reviews/comments on this PR. We have a discussion item for this here: open-gitops/project#86. Could one of you please summarize the above conversation and move it into the discussion linked here? That way we can keep this conversation alive even though I'll now close this PR. BTW, we'll be using those discussion topics as the basis for this KubeCon EU OpenGitOps project meeting in Amsterdam.
We (representing Morgan Stanley) believe the community is leaving one issue for each implementer to overcome: the situation where the source of truth for desired state (e.g. github.com or a git-equivalent that an enterprise may run, recognizing there are central and decentralized approaches for storing desired state) is less available than your users' expected SLA for making configuration changes.
Put succinctly, if (in our example) GitHub is unavailable and you want to make changes to your System State, there should be one approach and a set of tooling to allow reconciliation after the fact. A further example exists in disconnected systems (e.g. Kubernetes on a ship) where the system may be disconnected from the store where the desired state resides. How then would the system state be updated in an emergency and then reconciled with the desired state?
Leaving this to each implementer both harms adoption of GitOps and is inefficient, as I believe we share a common challenge that we can solve once within the project.
The first step, as this project has so well established, is a glossary of terms to allow us to describe the problem and a draft principle to add. I have included these in this PR.