HCP Cluster Resource Deletion Cascading Subscription Delete #920
Please rebase pull request.
Add "externalID" and "internalID" parameters so the returned document is a minimum valid OperationDocument for writing.
The operation item must now be created in the database prior to calling ExposeOperation. ExposeOperation does all its processing in a database update callback. This is because there is an increasing number of cases where we create an implicit async operation with no visible status endpoint. Calling ExposeOperation makes an implicit async operation explicit, with a status endpoint for ARM to poll. Hence the rename. The tradeoff is that explicit asynchronous operations now require two database operations (create and update), but it helps make the RP logic cleaner. This could possibly be mitigated in the future by using Cosmos DB's transactional batch operations, but it's gonna take some serious refactoring to get there.
CancelActiveOperation marks the status of any active operation on the resource as canceled.
Will be reusing DeleteResource for subscription deletion. Add database bookkeeping for the resource and any child resources. This includes creating implicit operations for each resource being deleted. The caller may then expose the returned operation ID.
By my read of the Subscription Lifecycle API Reference [1], we should favor 200 OK over 201 Created when creating or updating a subscription. [1] https://github.com/cloud-and-ai-microsoft/resource-provider-contract/blob/master/v1.0/subscription-lifecycle-api-reference.md#response
Called when a subscription is deleted. The method is idempotent in case of multiple subscription PUT requests.
Don't count on OperationID being set in OperationDocuments. Implicit async operations will not have this field set. Get the subscription ID from ExternalID instead.
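To illustrate the last commit message, here is a minimal sketch of reading the subscription ID from ExternalID rather than OperationID. The pared-down `OperationDocument`, the string-typed `ExternalID` (the PR uses `*arm.ResourceID`), and both helper names are assumptions made for this example, not the repository's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// OperationDocument is a hypothetical, reduced stand-in for the real type.
// ExternalID is a plain string here; the PR uses *arm.ResourceID.
type OperationDocument struct {
	OperationID string // empty for implicit operations
	ExternalID  string // ARM resource ID of the resource being operated on
}

// subscriptionIDFromResourceID (hypothetical helper) returns the segment that
// follows the leading "subscriptions" segment of an ARM resource ID.
func subscriptionIDFromResourceID(resourceID string) (string, error) {
	segments := strings.Split(strings.Trim(resourceID, "/"), "/")
	if len(segments) >= 2 && strings.EqualFold(segments[0], "subscriptions") && segments[1] != "" {
		return segments[1], nil
	}
	return "", fmt.Errorf("no subscription ID in resource ID %q", resourceID)
}

// SubscriptionID never consults OperationID, so it behaves the same for
// implicit and explicit operations.
func (doc *OperationDocument) SubscriptionID() (string, error) {
	return subscriptionIDFromResourceID(doc.ExternalID)
}
```

Because the lookup depends only on ExternalID, an implicit operation with an empty OperationID still resolves to its subscription.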
LGTM, just a few small questions and a thought.
@@ -94,13 +95,15 @@ type OperationDocument struct {
 	Error *arm.CloudErrorBody `json:"error,omitempty"`
 }

-func NewOperationDocument(request OperationRequest) *OperationDocument {
+func NewOperationDocument(request OperationRequest, externalID *arm.ResourceID, internalID ocm.InternalID) *OperationDocument {
Would we consider splitting this functionality into NewImplicitOperationDocument and NewExplicitOperationDocument? This would help cement the concept and make it more visible. We could even extend the OperationDocument Type with Implicit/Explicit but that might be overkill and I don't have a good feeling if that's worth it.
What do people think?
In terms of the operation document, what distinguishes explicit from implicit operations is certain fields being populated: namely TenantID, ClientID and most importantly OperationID. I'm all for making the code clearer, though I'm not sure if a separate document type is the way I'd go. The backend pod, for example, doesn't care about this explicit vs implicit distinction and treats all operations the same. So I wouldn't want the backend to have to handle two document types.
Maybe an OperationDocument.IsExplicit method, that just returns whether the OperationID field is non-empty, would be sufficient? (Or OperationDocument.IsExposed to align with Frontend.ExposeOperation... I don't know, I'm playing fast and loose with terminology here.)
In any case, I don't know that we have a use case for such a method at the moment but I'll keep this in the back of my mind.
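For reference, the method floated above could look roughly like the following sketch. It assumes OperationID is a plain string (the real field may be a pointer to a resource ID) and includes only the three fields named in this thread; nothing here is code from the PR.

```go
package main

// OperationDocument is a hypothetical subset of the real document type,
// limited to the fields named in this discussion.
type OperationDocument struct {
	TenantID    string
	ClientID    string
	OperationID string // assumed to be a string for this sketch
}

// IsExplicit reports whether the operation has been exposed, meaning it has
// an OperationID and therefore a status endpoint that ARM can poll.
// A nil document is treated as not explicit.
func (doc *OperationDocument) IsExplicit() bool {
	return doc != nil && doc.OperationID != ""
}
```

Keeping this as a method on the single document type means the backend can continue to treat all operations the same way.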
That's fair. I don't have my head around it yet but hopefully the "right" answer appears at some stage :D
@@ -507,9 +507,18 @@ func (f *Frontend) ArmResourceCreateOrUpdate(writer http.ResponseWriter, request
 	}
 }

-operationDoc, err := f.StartOperation(writer, request, doc, operationRequest)
+operationDoc := database.NewOperationDocument(operationRequest, doc.Key, doc.InternalID)
Side question: Any objection or issue if I make a PR to change Key -> ResourceId in the ResourceDocument Type just for readability?
Yeah, this has bugged me for a while. I've held off for the sake of backward-compatibility, but since I'm working on a much larger breaking change with our database code please feel free. (Or if you don't, I probably will.)
What this PR does
When a subscription state changes to "Deleted", the RP now triggers a deletion of all HCP clusters under the subscription as per the Resource Provider Contract.
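The cascading behavior described above can be sketched with an in-memory store. This is an illustration under stated assumptions, not the PR's implementation: `Store`, the status constants, and every function name below are hypothetical, and resource IDs are plain strings. It cancels any active operation on each resource, starts one implicit delete operation per resource, and is idempotent across repeated subscription PUT requests.

```go
package main

import "strings"

// OperationStatus values are assumptions loosely modeled on ARM provisioning states.
type OperationStatus string

const (
	StatusUpdating OperationStatus = "Updating"
	StatusDeleting OperationStatus = "Deleting"
	StatusCanceled OperationStatus = "Canceled"
)

func (s OperationStatus) isTerminal() bool { return s == StatusCanceled }

// OperationDocument is a hypothetical, reduced operation item.
type OperationDocument struct {
	ExternalID string // resource the operation acts on
	Status     OperationStatus
}

// Store is a hypothetical in-memory stand-in for the database.
type Store struct {
	ResourceIDs []string
	Operations  []*OperationDocument
}

func (s *Store) hasActiveDelete(resourceID string) bool {
	for _, op := range s.Operations {
		if strings.EqualFold(op.ExternalID, resourceID) && op.Status == StatusDeleting {
			return true
		}
	}
	return false
}

// cancelActiveOperations marks every non-terminal operation on the resource as canceled.
func (s *Store) cancelActiveOperations(resourceID string) {
	for _, op := range s.Operations {
		if strings.EqualFold(op.ExternalID, resourceID) && !op.Status.isTerminal() {
			op.Status = StatusCanceled
		}
	}
}

// DeleteSubscriptionResources starts one implicit delete operation for each
// resource under the subscription. Resources that already have an active
// delete are skipped, so calling it again starts nothing new.
func (s *Store) DeleteSubscriptionResources(subscriptionID string) []*OperationDocument {
	prefix := strings.ToLower("/subscriptions/" + subscriptionID + "/")
	var started []*OperationDocument
	for _, id := range s.ResourceIDs {
		if !strings.HasPrefix(strings.ToLower(id), prefix) || s.hasActiveDelete(id) {
			continue
		}
		s.cancelActiveOperations(id)
		op := &OperationDocument{ExternalID: id, Status: StatusDeleting}
		s.Operations = append(s.Operations, op)
		started = append(started, op)
	}
	return started
}
```

The returned operations are implicit: none of them has a status endpoint unless a caller chooses to expose one.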
Behind the scenes, this introduces the concept of "implicit" and "explicit" async operations:
The Frontend.ExposeOperation method enriches the "Operation" item with information necessary to make the status endpoint accessible to ARM, and adds appropriate async headers to an http.ResponseWriter.
Importantly, the backend pod does not distinguish between implicit and explicit async operations. The sole purpose of an "implicit" async operation at the moment, which is only used for deletions, is for the backend to delete the "Resource" item in Cosmos DB after the actual resource is deleted.
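As a rough sketch of how exposing an operation might work, the example below does its processing inside an update callback against an in-memory store. `Store`, `UpdateOperation`, the parameter list, and the sample status path are all hypothetical stand-ins for the real database client and method signature; the only detail taken from ARM conventions is the `Azure-AsyncOperation` response header.

```go
package main

import (
	"errors"
	"net/http"
	"net/http/httptest"
)

// OperationDocument is a hypothetical, reduced operation item.
type OperationDocument struct {
	TenantID    string
	ClientID    string
	OperationID string // assumed to be a string for this sketch
}

var ErrNotFound = errors.New("operation not found")

// Store is a hypothetical in-memory stand-in for the database.
type Store struct {
	operations map[string]*OperationDocument
}

// UpdateOperation hands a copy of the stored item to the callback and
// persists the copy only if the callback returns true.
func (s *Store) UpdateOperation(pk string, callback func(*OperationDocument) bool) error {
	doc, ok := s.operations[pk]
	if !ok {
		return ErrNotFound
	}
	updated := *doc
	if callback(&updated) {
		s.operations[pk] = &updated
	}
	return nil
}

// ExposeOperation turns an existing implicit operation into an explicit one.
// The item must already exist; all work happens in the update callback.
func ExposeOperation(w http.ResponseWriter, s *Store, pk, tenantID, clientID, operationID string) error {
	return s.UpdateOperation(pk, func(doc *OperationDocument) bool {
		doc.TenantID = tenantID
		doc.ClientID = clientID
		doc.OperationID = operationID
		// Tell ARM where to poll for status.
		w.Header().Set("Azure-AsyncOperation", operationID)
		return true
	})
}

// exampleExpose shows typical usage: create the item first, then expose it.
func exampleExpose() (http.Header, *OperationDocument, error) {
	store := &Store{operations: map[string]*OperationDocument{"op-1": {}}}
	recorder := httptest.NewRecorder()
	err := ExposeOperation(recorder, store, "op-1", "tenant-1", "client-1", "/hypothetical/operationstatuses/op-1")
	return recorder.Header(), store.operations["op-1"], err
}
```

An operation that is never passed to ExposeOperation keeps its empty OperationID and stays implicit, which matches the create-then-update tradeoff noted in the commit message.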
Jira: ARO-13321 - Implement Cascading Subscription Deletion
Special notes for your reviewer