Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SOLR-17351: Decompose filestore "get file" API #3047

Merged

Conversation

gerlowskija
Copy link
Contributor

https://issues.apache.org/jira/browse/SOLR-17351

Description

Solr's filestore "get file" API actually covers 4 or 5 distinct operations. Depending on provided query parameters, it can:

  • return filestore entry metadata (when meta=true is specified)
  • instruct the receiving Solr node to pull a file from another node's filestore and cache it locally (when getFrom=someOtherNode is specified)
  • instruct the receiving Solr node to push its cached copy of a file out to all other Solr nodes (when sync=true is specified)
  • return raw file contents
  • return JSON-ified file contents

This makes the code for this endpoint somewhat complex. It also makes it hard to model and parse the response on the client side.

Solution

This PR splits up the "get file" endpoint into a number of different APIs. Specifically:

  • metadata-fetching has been moved out to the endpoint, GET/api/cluster/filestore/metadata/some/path.txt
  • Filestore commands such as pushing/pulling files are now available at: POST /api/cluster/filestore/commands
  • Support for "JSON-ified" file data has been dropped in this PR (but will be retained but deprecated in the eventual 9.x backport)

These divisions allow us to generate SolrRequest/SolrResponse classes representing these APIs, meaning that SolrJ users no longer need to use GenericSolrRequest/GenericSolrResponse.

Tests

Some rewriting in TestDistribFileStore. Existing package and filestore tests continue to pass.

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Solr maintainers access to contribute to my PR branch. (optional but recommended, not available for branches on forks living under an organisation)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.
  • I have added documentation for the Reference Guide

Sorry, something went wrong.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Tests pass
Identical to the pre-existing`/cluster/files/a/b.txt?meta=true`, but in
its own endpoint.

Does NOT remove the existing endpoint or switch tests over to use it,
but we will want to do this eventually on `main`.
Identical to the pre-existing`/node/files/a/b.txt`, but in its own
endpoint.  Does NOT remove the existing endpoint or switch tests over
to use it.  Those can be handled in a separate commit to enable this one
to be easily backported later to 9x if desired.

This also doesn't enable codegen for the new API, though I think the
blockers are sufficiently cleared up now to enable that if desired.
Still no modifications to use the API internally, or removal of the old
endpoint, in order to ease backporting.
This should probably be pulled into a separate PR/branch, and modified
to address all relevant endpoints and other loose ends of SOLR-17562.
TestDistribFileStore fails with the changes in this commit.  The changes
themselves are fine, but it fails because of a pre-existing *ahem*
feature in SolrClient where SolrParams are encoded as form-params
instead of query-params in some circumstances.

I could probably hack around this for the moment (e.g. by using a
builder setter to ensure 'getFrom' is sent as a true query param).

But I've decided to halt progress on this branch and break some other
pieces out into their own PRs, while I send some questions out to the
community on this front.
Ensures v2 POST query-params aren't put in a 'form' body.
@gerlowskija
Copy link
Contributor Author

Still TODO: ref-guide updates, additional testing.

@gerlowskija gerlowskija changed the title Solr 17351 break up getfile api main SOLR-17351: Decompose filestore "get file" API Jan 20, 2025
@epugh
Copy link
Contributor

epugh commented Jan 20, 2025

Does this impact #3031 ? Since I am using cc.getFileStore I don't think so, but wanted to check.

@gerlowskija
Copy link
Contributor Author

That PR uses the Java interface, and not the HTTP APIs directly, so it should be fine afaict.

@POST
@Operation(
summary =
"Pushes a file to other nodes, or pulls a file from other nodes in the Solr cluster.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the fact that one thing does a push or pull suggest a issue to fix? https://en.wikipedia.org/wiki/List_of_Doctor_Dolittle_characters#Pushmi-Pullyu

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Syncs a file via either pushing or pulling across the nodes in the Solr cluster." ???

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This ends up being moot since I take your "split up the endpoint" suggestion on L116)

String getFrom,
@Parameter(
description =
"If true, triggers syncing for this file across all nodes in the filestore")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe it's own end point? Or, do we somehow eliminate the need for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, I guess splitting these into separate endpoints is a little more in keeping with the currently documented convention. Done 👍

do we somehow eliminate the need for this?

In terms of eliminating the need entirely: I'm open to that if you have ideas? As I understand the current filestore impl, these two commands are essentially internal APIs that Solr relies on to ensure filestore entries are (eventually) present on all nodes. So we'd need some other way of doing that?

Relatedly, it'd be really nice if we had some way to flag the "internal-ness" of these APIs. We want the code-generation to cover them, so that solr-core itself has nice classes to use. But ideally something could signal to end-users that they should never be using these APIs themselves. A different package name? Some sort of Javadoc tag? Maybe that's worth its own JIRA ticket to spur some brainstorming...

@@ -194,6 +301,48 @@ public SolrJerseyResponse deleteFile(String filePath, Boolean localDelete) {
return response;
}

@Override
@PermissionName(PermissionNameProvider.Name.FILESTORE_WRITE_PERM)
public SolrJerseyResponse executeFileStoreCommand(String path, String getFrom, Boolean sync) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aren't we trying to get rid of the "command" pattern in our new APIs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're definitely trying to avoid it where-ever possible. But it's not always possible unfortunately. We have a documented convention for handling these (hopefully rare) cases, which is pretty close to what this PR does.

Open to coming up for a different pattern for these cases if you have a suggestion for what they might look like?

} catch (IOException e) {
throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "Error getting file ", e);
}
ClusterFileStore.syncToAllNodes(fileStore, path);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been a bit confused between NodeFileStore, ClusterFileStore, DistribFileStore ;-).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the name conflicts are unfortunate.

On the bright side - I think we can probably make "NodeFileStore" and "NodeFileStoreApi" go away as part of this PR. (Because with these APIs changes, all of our filestore endpoints will be under the "/cluster/filestore" path).

That would just leave:

  • ClusterFileStoreApi - annotated interface defining our filestore APIs
  • ClusterFileStore - implementations of those APIs
  • DistribFileStore - internal implementation of various filestore operations (syncing, deleting, creating, etc.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, the most recent iteration removes NodeFileStore/NodeFileStoreApi. There's still some room for confusion between ClusterFileStoreApi, ClusterFileStore, and DistribFileStore, but overall this PR leaves things better than it found it in that regard 👍

.collect(Collectors.toList());
directoryListingResponse.files = Collections.singletonMap(path, directoryContents);
return directoryListingResponse;
if (type == FileStore.FileType.NOFILE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this feels overly finicky logic.. why is it so hard to decide when to get metadata? What are we missing that doens't just make this simple. A thing has a meta thing. That should be the only way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I think the reason this is finicky is that we cram a bunch of different cases into the "metadata" POJO. It's trying to represent too much. It gets used not just for file-metadata. but also directory listings, representing 404s, etc.

A good refactor IMO would to:

  1. create a class DirectoryMetadata extends Metadata, to use in the directory case.
  2. Have the 404 case throw new SolrException(ErrorCode.NOT_FOUND, "...");

That said - this code all pre-exists this PR, and I'm leery to wade into refactors for scope reasons.

throw new SolrException(
SolrException.ErrorCode.SERVER_ERROR, "Error reading file " + pathCopy);
// User wants to get the "raw" file
// TODO Should we be trying to json-ify otherwise "raw" files in this way? It seems like a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the use case FOR this? Do we have one right now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that I know of. I'm going to nuke it based on the discussion here, and that might flush out something I don't understand.

(We'll need to make sure we're only deprecating it in the 9.x backport, but that's not a big deal...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no discussion of it either in the PR or the JIRA that introduced it (unless I'm missing something?). My guess is that it was an attempt at convenience?

In any case; I think this entire file can now go. See my comment above for more details.

@@ -173,7 +173,7 @@ public void uninstall(String packageName, String version)
String.format(Locale.ROOT, "/package/%s/%s/%s", packageName, version, "manifest.json"));
for (String filePath : filesToDelete) {
DistribFileStore.deleteZKFileEntry(zkClient, filePath);
String path = "/api/cluster/files" + filePath;
String path = "/api/cluster/filestore/files" + filePath;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i like the more verbose path I think... Can a filestore have anything under than files? If not, what about just "/api/cluster/filestore"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can a filestore have anything under than files?

It can: "metadata" and "commands" are both under the "filestore" path as siblings to "files".

@@ -115,7 +115,7 @@ openssl dgst -sha1 -sign my_key.pem runtimelibs.jar | openssl enc -base64 | sed
+
[source, bash]
----
curl --data-binary @runtimelibs.jar -X PUT http://localhost:8983/api/cluster/files/mypkg/1.0/myplugins.jar?sig=<signature-of-jar>
curl --data-binary @runtimelibs.jar -X PUT http://localhost:8983/api/cluster/filestore/files/mypkg/1.0/myplugins.jar?sig=<signature-of-jar>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe drops the files?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(See other comments - "files" is our way to distinguish between "metadata" and "commands", which are both under the "/filestore" path.)

ByteBuffer filedata = null;
try {
final var fileRequest = new FileStoreApi.GetFile(path);
final var client = coreContainer.getSolrClientCache().getHttpSolrClient(baseUrl);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't use the SolrClientCache for anything other than streaming expressions. I'm stamping out such usages here: https://issues.apache.org/jira/browse/SOLR-17630
I'm particularly surprised to see you removed a use of the new requestWithBaseUrl.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted to requestWithBaseUrl.

I'm a little confused about SolrClientCache going away, or only being suitable for streaming expressions though. (Perhaps I'm forgetting some context I previously had on this?) Anyway, I followed up with some questions on SOLR-17630 and we can continue that aspect of discussion there...

Copy link
Contributor Author

@gerlowskija gerlowskija left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replies to some review comments. Still working my way through review feedback.

@POST
@Operation(
summary =
"Pushes a file to other nodes, or pulls a file from other nodes in the Solr cluster.",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This ends up being moot since I take your "split up the endpoint" suggestion on L116)

String getFrom,
@Parameter(
description =
"If true, triggers syncing for this file across all nodes in the filestore")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, I guess splitting these into separate endpoints is a little more in keeping with the currently documented convention. Done 👍

do we somehow eliminate the need for this?

In terms of eliminating the need entirely: I'm open to that if you have ideas? As I understand the current filestore impl, these two commands are essentially internal APIs that Solr relies on to ensure filestore entries are (eventually) present on all nodes. So we'd need some other way of doing that?

Relatedly, it'd be really nice if we had some way to flag the "internal-ness" of these APIs. We want the code-generation to cover them, so that solr-core itself has nice classes to use. But ideally something could signal to end-users that they should never be using these APIs themselves. A different package name? Some sort of Javadoc tag? Maybe that's worth its own JIRA ticket to spur some brainstorming...

@@ -194,6 +301,48 @@ public SolrJerseyResponse deleteFile(String filePath, Boolean localDelete) {
return response;
}

@Override
@PermissionName(PermissionNameProvider.Name.FILESTORE_WRITE_PERM)
public SolrJerseyResponse executeFileStoreCommand(String path, String getFrom, Boolean sync) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're definitely trying to avoid it where-ever possible. But it's not always possible unfortunately. We have a documented convention for handling these (hopefully rare) cases, which is pretty close to what this PR does.

Open to coming up for a different pattern for these cases if you have a suggestion for what they might look like?

ByteBuffer filedata = null;
try {
final var fileRequest = new FileStoreApi.GetFile(path);
final var client = coreContainer.getSolrClientCache().getHttpSolrClient(baseUrl);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted to requestWithBaseUrl.

I'm a little confused about SolrClientCache going away, or only being suitable for streaming expressions though. (Perhaps I'm forgetting some context I previously had on this?) Anyway, I followed up with some questions on SOLR-17630 and we can continue that aspect of discussion there...

@@ -382,16 +380,14 @@ private void distribute(FileInfo info) {
// trying to avoid the thundering herd problem when there are a very large no:of nodes
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -507,7 +503,8 @@ public void delete(String path) {

final var solrParams = new ModifiableSolrParams();
solrParams.add("localDelete", "true");
final var solrRequest = new GenericSolrRequest(DELETE, "/cluster/files" + path, solrParams);
final var solrRequest =
new GenericSolrRequest(DELETE, "/cluster/filestore/files" + path, solrParams);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh - I wonder why I didn't change this at the time? Anyway, fixed.

} catch (IOException e) {
throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "Error getting file ", e);
}
ClusterFileStore.syncToAllNodes(fileStore, path);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the name conflicts are unfortunate.

On the bright side - I think we can probably make "NodeFileStore" and "NodeFileStoreApi" go away as part of this PR. (Because with these APIs changes, all of our filestore endpoints will be under the "/cluster/filestore" path).

That would just leave:

  • ClusterFileStoreApi - annotated interface defining our filestore APIs
  • ClusterFileStore - implementations of those APIs
  • DistribFileStore - internal implementation of various filestore operations (syncing, deleting, creating, etc.)

.collect(Collectors.toList());
directoryListingResponse.files = Collections.singletonMap(path, directoryContents);
return directoryListingResponse;
if (type == FileStore.FileType.NOFILE
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I think the reason this is finicky is that we cram a bunch of different cases into the "metadata" POJO. It's trying to represent too much. It gets used not just for file-metadata. but also directory listings, representing 404s, etc.

A good refactor IMO would to:

  1. create a class DirectoryMetadata extends Metadata, to use in the directory case.
  2. Have the 404 case throw new SolrException(ErrorCode.NOT_FOUND, "...");

That said - this code all pre-exists this PR, and I'm leery to wade into refactors for scope reasons.

@epugh
Copy link
Contributor

epugh commented Jan 26, 2025

I enjoyed catching up on the comments on this PR!

@gerlowskija
Copy link
Contributor Author

Alright - I think I've addressed the feedback so far? If I've missed anything, let me know. I've brought it up to date with 'main' and will aim to merge in the next few days pending any objections?

@gerlowskija gerlowskija merged commit 2ab1243 into apache:main Feb 14, 2025
4 checks passed
@gerlowskija gerlowskija deleted the SOLR-17351-break-up-getfile-api-main branch February 14, 2025 16:55
gerlowskija added a commit that referenced this pull request Feb 14, 2025
This reverts commit 2ab1243.

Hoss reported a test failure due to some ObjectReleaseTracker
violations.  The InputStream handling did evolve a bit in the course of
review, so its possible that caused some issue.  Undoing this change
while I investigate...
gerlowskija added a commit that referenced this pull request Feb 16, 2025
This PR splits up the "get file" endpoint into a number of different APIs.
Specifically:

  - metadata-fetching has been moved out to the endpoint,
    GET/api/cluster/filestore/metadata/some/path.txt
  - Filestore commands such as pushing/pulling files are now available at:
    POST /api/cluster/filestore/commands
  - Support for "JSON-ified" file data has been dropped in this PR (but will be
    retained but deprecated in the eventual 9.x backport)

These divisions allow us to generate SolrRequest/SolrResponse classes
representing these APIs, meaning that SolrJ users no longer need to use
GenericSolrRequest/GenericSolrResponse.

(This commit apes an earlier commit which offered similar functionality
but caused a few test failures.  These have now been fixed.)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants