Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

database: Merge Cosmos DB containers #1215

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

mbarnes
Copy link
Collaborator

@mbarnes mbarnes commented Feb 2, 2025

What this PR does

This pull request contains all the breaking changes related to ARO-14170 - Merge Cosmos DB containers.

🚨 Give the ARO-HCP team a heads up before merging this! (auto-merge is disabled) 🚨

In addition to merging the "resources", "operations", and "subscriptions" Cosmos DB containers, this pull request also changes the structure of all Cosmos DB documents in this merged container.

A new typedDocument struct standardizes the structure of documents as follows:

  {
    "id":           <item-id>
    "partitionKey": <partition-key>
    "resourceType": <azure-resource-type>
    "properties": {
      ... content depends on "resourceType" ...
    }
    "_rid":  <system-generated>
    "_self": <system-generated>
    "_etag": <system-generated>
    "_ts":   <system-generated>
  }

A key insight is none of the system-generated fields are used outside of the database package. Therefore the data that the frontend/backend components care about can be encapsulated in a "properties" value, and the other document types no longer need to embed baseDocument.

DocumentProperties is an interface for types that can be plugged into the "properties" field, like ResourceDocument. The interface requires a GetValidTypes() method that returns a set of valid values for the "resourceType" field, used for validation when [un]marshalling.

One case worth noting is SubscriptionDocument was mostly already following this format, with its top-level "subscription" field defined as a arm.Subscription struct:

type SubscriptionDocument struct {                                                              
       baseDocument                                                                             
       Subscription *arm.Subscription `json:"subscription,omitempty"`                           
}                                

With this PR, arm.Subscription now implements the DocumentProperties interface and SubscriptionDocument is gone.

Also worth mentioning is DBClientIteratorItem has been redefined as a iter.Seq2 instead of a iter.Seq. Effectively this means the DBClientIterator.Items method now returns the Cosmos DB item ID separately from the item (document) itself.

To illustrate:

old style:  for doc := range iterator.Items(ctx)
new style:  for id, doc := range iterator.Items(ctx)

Jira: ARO-14170 - Merge Cosmos DB containers
Link to demo recording:

Special notes for your reviewer

I'm working on documenting our Cosmos DB data design based on these changes. The document is still a work-in-progress, but it's far enough along that it should give plenty of context for the changes here.

Here's a preview: Asynchronous Operations in the ARO-HCP Resource Provider

Copy link

github-actions bot commented Feb 3, 2025

Please rebase pull request.

Copy link

github-actions bot commented Feb 3, 2025

Please rebase pull request.

@mbarnes mbarnes force-pushed the merge-cosmos-db-containers branch from 9aa92e5 to ea349b4 Compare February 18, 2025 18:28
@mbarnes mbarnes force-pushed the merge-cosmos-db-containers branch from ea349b4 to 66bcada Compare February 19, 2025 14:15
@mbarnes
Copy link
Collaborator Author

mbarnes commented Feb 19, 2025

I tacked on yet another commit to undo #1189 - database: Add PartitionKeys container because Microsoft has released partial cross-partition query support in azcosmos v1.3.0. I tested and verified that it works well enough for our purpose. Adding here since it fits with the theme of dropping Cosmos containers.

Copy link

Please rebase pull request.

@mbarnes mbarnes force-pushed the merge-cosmos-db-containers branch from 66bcada to 6eacd13 Compare February 25, 2025 21:33
Copy link

Please rebase pull request.

@mbarnes mbarnes force-pushed the merge-cosmos-db-containers branch from 79607ad to 3e03624 Compare February 26, 2025 16:28
@mbarnes mbarnes force-pushed the merge-cosmos-db-containers branch 2 times, most recently from a22c4b0 to 8d4ecb5 Compare February 26, 2025 18:55
Copy link

Please rebase pull request.

Matthew Barnes added 6 commits February 27, 2025 15:56
Small hack to expose the "_ts" field from a Subscription document
in Cosmos DB for metrics reporting. This has a JSON tag of "-" so
it will never appear in HTTP response bodies.
This is a major breaking change to our Cosmos DB documents.

typedDocument standardizes the structure of documents as follows:

  {
    "id":           <item-id>
    "partitionKey": <partition-key>
    "resourceType": <azure-resource-type>
    "properties": {
      ...
    }
    "_rid":  <system-generated>
    "_self": <system-generated>
    "_etag": <system-generated>
    "_ts":   <system-generated>
  }

A key insight is none of the system-generated fields are used outside
of the database package. Therefore the data that the frontend/backend
components care about can be encapsulated in a "properties" value,
and the other document types no longer need to embed baseDocument.

DocumentProperties is an interface for types that can be plugged
into the "properties" field, like ResourceDocument. It requires a
GetValidTypes() method that returns a set of valid values for the
"resourceType" field, used for validation when [un]marshalling.

One case worth noting is SubscriptionDocument was mostly already
following this format, with its "subscription" field defined as a
arm.Subscription struct. Instead, arm.Subscription now implements
the DocumentProperties interface and SubscriptionDocument is gone.

Also worth mentioning is DBClientIterator has been redefined as a
iter.Seq2 instead of a iter.Seq. Effectively this means the Items
method now returns the Cosmos DB item ID separately from the item
(document) itself.

To illustrate:

    old style:  for doc := range iterator.Items()
    new style:  for id, doc := range iterator.Items()

Apologies for the size of this commit, but the changes here are
all intertwined.
This parameter is not used yet, this is just to get some API
changes out of the way.

The parameter type is azcosmos.PartitionKey instead of string for
type safety. If the partition key and item ID parameters were both
strings, there would be a risk of callers mixing up the arguments.
Discard the 'Operations' and 'Subscriptions' containers. This data
now lives in the 'Resources' container.

Doing this enables the use of transactional batch operations when
multiple Cosmos DB items of different types need to be created or
updated together.
In some cases documentation was moved above the method declaration
in the interface rather than above the cosmosDBClient method, which
is private to the database package.
azcosmos v1.3.0 partially supports cross-partition queries [1],
well enough for our purpose anyway.

The PartitionKeys container hack is no longer needed.

[1] https://github.com/Azure/azure-sdk-for-go/releases/tag/sdk/data/azcosmos/v1.3.0
@mbarnes mbarnes force-pushed the merge-cosmos-db-containers branch from 8d4ecb5 to 8e41d10 Compare February 27, 2025 21:08
Copy link
Collaborator

@DennisJWheeler DennisJWheeler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Does what it suggests, combines the database operators together. Standardizes the Returned Documents.
Switches to using GoMock.

@mbarnes mbarnes added the hold Do not merge yet label Feb 28, 2025
@mbarnes
Copy link
Collaborator Author

mbarnes commented Feb 28, 2025

Holding for a week to allow time for additional reviews as well as giving the ARO-HCP team a heads up.

Copy link
Collaborator

@SudoBrendan SudoBrendan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - just curious where the 7-day timeout on Operations came from (as opposed to some other value - like 6 hours, etc)?

//
// [1] https://github.com/Azure/azure-sdk-for-go/issues/18578
operationsPartitionKey = "workaround"
operationTimeToLive = 604800 // 7 days
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: is this arbitrary or an ARM requirement?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copied from ARO Classic. I haven't found any firm requirements about this from Microsoft.

Copy link
Collaborator Author

@mbarnes mbarnes Feb 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I had another look just now and found some new information by way of Stack Overflow.

Header Based Async Operation Timeout

This is a newer approach. The older approach in the era of ARO Classic was to specify an async timeout in the resource manifest. The maximum allowed timeout is 6 days, so my guess is 7 days was chosen for Cosmos to add some buffer.

If we want to re-evaluate, this might be a @bennerv question for when he gets back.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hold Do not merge yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants