Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CSHARP-5321: Optimize client-side projections to only fetch the parts of the document that are needed for the projection. #1589

Merged
merged 1 commit into from
Feb 6, 2025

Conversation

rstam
Copy link
Contributor

@rstam rstam commented Jan 9, 2025

No description provided.

@rstam rstam requested a review from a team as a code owner January 9, 2025 00:07
typeof(ClientSideProjectionSnippetsDeserializer<,,,,>)
];

public const int MaxNumberOfSnippets = 4; // could be expanded up to 16
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I stopped at 4 because I wanted to get farther in the review process before adding support for more snippets.

And we need to decide how many to add support for. The max would be 16 (based on the Func types available).

var deserializerGenericTypeDefinition = __deserializerGenericTypeDefinitions[snippetTypes.Length];
var deserializerGenericTypeArguments = snippetTypes.Append(projectionType).ToArray();
var deserializerType = deserializerGenericTypeDefinition.MakeGenericType(deserializerGenericTypeArguments);
return (IBsonSerializer)Activator.CreateInstance(deserializerType, [snippetDeserializers, projector]);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I decided it is better to use reflection ONCE to create the deserializer rather than to use reflection to call the projector delegate. That is because the projecter delegate has to be called for EVERY document returned from the server.

using MongoDB.Bson.Serialization;
using MongoDB.Bson.Serialization.Serializers;

namespace MongoDB.Driver
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A "snippet" is the name I have given to a piece of the projector expression that can be evaluated server-side. The rest of the projecction is done client-side.

When finding the snippets we try to keep as much of the work as possible server-side which more often than not will reduce the amount of data sent to the client.

@@ -35,6 +35,9 @@ public static bool IsInt32Constant(this AstExpression expression, out int value)
public static bool IsMaxInt32(this AstExpression expression)
=> expression.IsInt32Constant(out var value) && value == int.MaxValue;

public static bool IsRootVar(this AstExpression expression)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test was performed so many times that I created a method to centralize the implementation and simplify the places where the test is needed.

{
var rewrittenArg = (AstExpression)AstNodeReplacer.Replace(node.In, (node.As, _element));
var accumulatorExpression = AstExpression.UnaryAccumulator(AstUnaryAccumulatorOperator.Push, rewrittenArg);
var accumulatorFieldName = _accumulators.AddAccumulatorExpression(accumulatorExpression);
var root = AstExpression.Var("ROOT", isCurrent: true);
return AstExpression.GetField(root, accumulatorFieldName);
return AstExpression.GetField(AstExpression.RootVar, accumulatorFieldName);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AstExpression.RootVar is the way we should always refer to the variable for $$ROOT.

By using AstExpression.RootVar we ensure that the variable is created the same way every time (I found a few places where it was not). A secondary benefit is that there is only one instance of the RootVar that can be re-used over and over.

}

return pipeline;
return projectStage == null ?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

projectStage will be null when the client-side projection needs the whole document as input.

AssertStages(stages, Array.Empty<string>());
serializer.Should().BeOfType<ClientSideProjectionDeserializer<DocumentWithArray, int>>();
var stages = Translate(collection, queryable, out var outputSerializer);
AssertStages(stages, "{ $project : { _0 : '$A', _id : 0 } }");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of tests involving client-side projections had to be modified to pass now that we have optimized client-side projections.

.Select(x => Add(x.X, 1));

var stages = Translate(collection, queryable);
AssertStages(stages, "{ $project : { _0 : '$X', _id : 0 } }");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the constant 1 is not a snippet. It remained client-side.

.Select(x => Add(x.A.Sum(), 4));

var stages = Translate(collection, queryable);
AssertStages(stages, "{ $project : { _0 : { $sum : '$A' }, _id : 0 } }");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the sum is done server-side, avoiding fetching a potentially large array to the client.

@@ -38,7 +36,7 @@ public static string TranslateToDottedFieldName(this LambdaExpression fieldSelec
{
throw new ArgumentException($"ValueType '{parameterSerializer.ValueType.FullName}' of parameterSerializer does not match parameter type '{parameterExpression.Type.FullName}'.", nameof(parameterSerializer));
}
var parameterSymbol = context.CreateSymbolWithVarName(parameterExpression, varName: "ROOT", parameterSerializer, isCurrent: true);
var parameterSymbol = context.CreateRootSymbol(parameterExpression, parameterSerializer);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Creating a root symbol is done often enough that I have added a new CreateRootSymbol helper method to help ensure that the root symbol is always created correctly.

@rstam
Copy link
Contributor Author

rstam commented Jan 10, 2025

Some tests are failing on pre 4.4 servers. I'm investigating.

@rstam
Copy link
Contributor Author

rstam commented Jan 10, 2025

I have made minor changes so that this change works on older servers also.

@rstam
Copy link
Contributor Author

rstam commented Jan 10, 2025

Looks like now there are some other tests failing pre 4.4 servers... I'll probably just ending up skipping them on older servers since very soon 4.4 will be the minimum supported server version.

@@ -180,6 +182,7 @@ public void Select_with_enum_comparison_should_work(int i, string projectionAsSt
public void Comparison_of_enum_and_enum_with_mismatched_serializers_should_throw(
[Values(false, true)] bool enableClientSideProjections)
{
RequireServer.Check().Supports(Feature.FindProjectionExpressions);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm skipping most tests involving client-side projections on older servers.

The only tests that have special assertions for pre 4.4 servers are CSHARP-4763 and CSHARP-5321 (this one).

var stages = Translate(collection, queryable);
if (translationOptions.CompatibilityLevel == ServerVersion.Server42)
{
stages.Should().BeEmpty();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pre 4.4 servers don't support expressions in Find projections, so on those older servers we just fetch the whole document and pass the whole document to the client-side projection function.

Copy link
Member

@sanych-sun sanych-sun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@adelinowona
Copy link
Contributor

LGTM


namespace MongoDB.Driver.Linq.Linq3Implementation.Translators.ExpressionToAggregationExpressionTranslators
{
internal static class ClientSideProjectionTranslator
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that I renamed ClientSideProjectionRewriter to ClientSideProjectionTranslator to avoid a naming conflict.

LambdaExpression projectionLambda,
IBsonSerializer sourceSerializer)
{
var (snippets, rewrittenProjectionLamdba) = ClientSideProjectionRewriter.RewriteProjection(context, projectionLambda, sourceSerializer);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We now get the snippets and the rewritten lambda in one pass.


namespace MongoDB.Driver.Linq.Linq3Implementation.Translators.ExpressionToAggregationExpressionTranslators
{
internal class ClientSideProjectionRewriter: ExpressionVisitor
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that I renamed ClientSideProjectionSnippetsTranslator to ClientSideProjectionRewriter because it now rewrites the lambda at the same time that it is finding and translating the snippets.

return base.VisitMethodCall(node);
}

private Expression VisitThenBy(MethodCallExpression node)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before we could just skip over OrderBy and ThenBy but now that we are rewriting the lambda we have to visit them in a special way that doesn't split them across the client/server boundary.

ExpressionTranslationOptions translationOptions,
AstSimplifier simplifier)
bool forFind)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one parameter replaces projectionCreator and simplifier because the value of forFind implies the values for the other two so we only need this one parameter instead of three.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like a better name might be clearer but can't think of one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

withinFindProjection perhaps?

IBsonSerializer projectionSerializer;

var wireVersion = context.TranslationOptions.CompatibilityLevel.ToWireVersion();
if (forFind && !Feature.FindProjectionExpressions.IsSupported(wireVersion))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be the best place to check whether the projection is forFind and whether the CompatibilityLevel indicates that we can use aggregation expressions in find projections.

@rstam rstam requested a review from sanych-sun January 23, 2025 18:21
Copy link
Member

@damieng damieng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing major here to address but worth considering a few more tests.

Love the solution and approach!

var result = queryable.Single();
result.Should().Be(10);
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we should probably have more tests here?

  1. Multiple snippets
  2. Where the snippets have resolved out to a single get everything
  3. Where it simply can't project at all

@rstam rstam requested review from papafe and removed request for papafe January 28, 2025 16:42
Copy link
Member

@sanych-sun sanych-sun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@adelinowona adelinowona left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@rstam rstam force-pushed the csharp5321 branch 2 times, most recently from f71b616 to bddfa67 Compare February 6, 2025 01:41
@rstam rstam merged commit 836aa60 into mongodb:main Feb 6, 2025
1 check was pending
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants