Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DISCO-3211] Suggest: move to new remote settings API #6598

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

tiftran
Copy link
Contributor

@tiftran tiftran commented Feb 19, 2025

Pull Request checklist

  • Breaking changes: This PR follows our breaking change policy
    • This PR follows the breaking change policy:
      • This PR has no breaking API changes, or
      • There are corresponding PRs for our consumer applications that resolve the breaking changes and have been approved
  • Quality: This PR builds and tests run cleanly
    • Note:
      • For changes that need extra cross-platform testing, consider adding [ci full] to the PR title.
      • If this pull request includes a breaking change, consider cutting a new release after merging.
  • Tests: This PR includes thorough tests or an explanation of why it does not
  • Changelog: This PR includes a changelog entry in CHANGELOG.md or an explanation of why it does not need one
    • Any breaking changes to Swift or Kotlin binding APIs are noted explicitly
  • Dependencies: This PR follows our dependency management guidelines
    • Any new dependencies are accompanied by a summary of the due diligence applied in selecting them.

Branch builds: add [firefox-android: branch-name] to the PR title.

@tiftran tiftran requested a review from a team February 19, 2025 07:46
@@ -181,11 +157,23 @@ impl Record {
}
}

impl From<Record> for RemoteSettingsRecord {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cargo clippy said impl From is preferred over Into

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, Rust will automatically generate the Into if you define a From.

@@ -214,6 +202,15 @@ pub(crate) enum SuggestRecord {
Geonames,
}

impl SuggestRecord {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels a little goofy just so that we can use Record to get_attachment, but setting fields to Map::new( ) also felt weird

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, slightly goofy but this makes sense to me.

Copy link
Contributor

@bendk bendk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great. I had some minor suggestions, but the only real issue is how we land this. I'm marking this Request Changes just because I think we should land that preparatory PR I mentioned to make dealing with the breaking changes easier.

@@ -98,19 +98,14 @@ impl SuggestStoreBuilder {
}

#[handle_error(Error)]
pub fn build(&self) -> SuggestApiResult<Arc<SuggestStore>> {
pub fn build(&self, rs_service: &RemoteSettingsService) -> SuggestApiResult<Arc<SuggestStore>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing that we haven't talked about is that this is going to be a breaking change for all applications, and it might not be so simple to fix. I'm not sure they all have started using the new remote settings API and have a RemoteSettingsService ready to give us.

To make the transition easy, could you open a second PR that changes this signature to something like this:

#[uniffi::method(default(rs_service=None))]
pub fn build(&self, rs_service: Option<Arc<RemoteSettingsService>>)

That PR doesn't need to do actually do anything with rs_service. The point is to give us a decent way to merge all the changes:

  • Merge the above, non-breaking, change.
  • Get all consumers to start passing in the rs_service
  • Merge your actual PR, which technically will be a breaking change since rs_service is no longer optional, but it shouldn't be much work to fix the consumer apps. Maybe they won't need any changes at all.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remote_settings_server and remote_settings_bucket_name can be removed from SuggestStoreBuilder too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually why not make rs_service a method on SuggestStoreBuilder instead of an arg to build?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bdk added a remote_settings_service() method in #6608 yesterday so we should use that to set the service instead of adding a new param to build().

std::fs::copy(starter_db_path, &db_path).expect("Error copying starter DB file");
SuggestStore::new(&db_path.to_string_lossy(), None).expect("Error building store")
SuggestStore::new(&db_path.to_string_lossy(), remote_settings_service)
.expect("Error building store")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Thanks for keeping this code up to date.

})
}

fn client_for_collection(&self, collection: Collection) -> &remote_settings::RemoteSettings {
fn client_for_collection(&self, collection: Collection) -> &Arc<RemoteSettingsClient> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: this could just be a &RemoteSettingsClient. The Arc makes me think it might be cloned to create a new reference, but I don't think you ever do that.

};
if last_modified != response.last_modified {
dao.write_cached_rs_data(collection.name(), &response);
let _ = client.sync();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be client.sync()?, that way errors get propagated up.

@@ -181,11 +157,23 @@ impl Record {
}
}

impl From<Record> for RemoteSettingsRecord {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, Rust will automatically generate the Into if you define a From.

@@ -214,6 +202,15 @@ pub(crate) enum SuggestRecord {
Geonames,
}

impl SuggestRecord {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, slightly goofy but this makes sense to me.

fn to_json_map(&self) -> Map<String, Value> {
match serde_json::to_value(self) {
Ok(Value::Object(map)) => map,
_ => Map::new(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this branch could be unreachable!(). I think we can safely say that serde will always be able to to deserialize to Value::Object.

@tiftran
Copy link
Contributor Author

tiftran commented Feb 19, 2025

I see that there are a couple of PRs that will slightly change this one, I will wait and see when those merge and change accordingly

@tiftran
Copy link
Contributor Author

tiftran commented Feb 26, 2025

I see that the other PRs are merged, so i rebased and made changes 🙂

Copy link
Contributor

@0c0w3 0c0w3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC we no longer need our RS cache in suggest, right? And that's why this removes the RS-cache-related methods from SuggestDao. It looks like rs_cache is still referenced in schema.rs, so we'll also need to:

  • remove those references
  • add a schema migration that removes the rs_cache table

I think we need to do this before vendoring, so either in this PR or in a fast follow-up.

Comment on lines 105 to 106
// For now, handle the cache manually. Once 6328 is merged, we should be able to delegate
// this to remote_settings.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment can be removed.

match &record.attachment {
Some(a) => Ok(self
Some(_a) => Ok(self
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Some(_)

@@ -98,19 +98,14 @@ impl SuggestStoreBuilder {
}

#[handle_error(Error)]
pub fn build(&self) -> SuggestApiResult<Arc<SuggestStore>> {
pub fn build(&self, rs_service: &RemoteSettingsService) -> SuggestApiResult<Arc<SuggestStore>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bdk added a remote_settings_service() method in #6608 yesterday so we should use that to set the service instead of adding a new param to build().

}

fn download_attachment(&self, record: &Record) -> Result<Vec<u8>> {
fn download_attachment(&self, record: Record) -> Result<Vec<u8>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can remain &Record AFAICT, no need to pass by value

Copy link
Contributor Author

@tiftran tiftran Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh i ran into trouble yesterday and this is my rust newbie solution. get_attachment expects a &RemoteSettingRecord and .into() wasn't helping. Since we're suggesting removing the struct below in a follow up, this would resolve itself?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you add back the & in both places in rs.rs and then fix the errors in the other files, it should work (it does for me). I do think it's worth fixing so we're not unnecessarily creating clones.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let me try that again

@@ -181,11 +157,23 @@ impl Record {
}
}

impl From<Record> for RemoteSettingsRecord {
fn from(record: Record) -> Self {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we get a bug/ticket on file for removing our Record struct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense it's more or less the same as RemoteSettingsRecord

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The key difference is the payload field. RemoteSettingsRecord stores a JSON value there, since it doesn't know the specifics of how it's going to be used. suggest::rs::Record stores the deserialized SuggestRecord there. So, if a function inputs suggest::rs::Record it can access the fields more conveniently and it doesn't have to deal with deserialization errors.

It's still a good idea and I'm pretty sure there's a way to remove some code duplication. Maybe RemoteSettingsRecord could be generic over its payload field. It's just not going to be a trivial change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

created a jira in our maintenance epic for this

}))
} else {
Err(Error::RemoteSettings(RemoteSettingsError::Other {
reason: "Unable to create client".to_string(),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Im unsure what's the best way to handle this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this makes sense to me and it's similar to how we return an error if data_path is none, right above. You could use that as an example to rewrite this without an if statement:

        let rs_service = inner.remote_settings_service.clone().ok_or_else(|| {
            Error::RemoteSettings(RemoteSettingsError::Other {
                reason: "Unable to create client".to_string(),
            })
        })?;
        Ok(Arc::new(SuggestStore {
            inner: SuggestStoreInner::new(
                data_path,
                extensions_to_load,
                SuggestRemoteSettingsClient::new(&rs_service)?,
            ),
        }))

I'll need to defer to @bendk if all this is kosher or not w/r/t Arc. SuggestRemoteSettingsClient should probably take an Arc<RemoteSettingsService> instead of &RemoteSettingsService, right? Like how SuggestStore does.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message should be something like "remote_settings_service not specified" (similar to the data_path error message)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that that error message would be better. I think everything else is fine.

It's okay that SuggestRemoteStetingsClient::new. inputs &RemoteSettingsService, since it only uses that that to call make_client(), then stores the returned Arc<RemoteSettingsClient> instances. In general, I think Rust is going to keep us honest here and throw up compile errors if we do something wrong.

@tiftran
Copy link
Contributor Author

tiftran commented Feb 26, 2025

okie doke, i think i've addressed the latest feedback

@tiftran tiftran requested a review from 0c0w3 February 26, 2025 20:51
@@ -616,6 +611,12 @@ impl ConnectionInitializer for SuggestConnectionInitializer<'_> {
)?;
Ok(())
}
32 => {
// Drop rs_cache since it's no longer needed.
clear_database(tx)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to clear the DB I don't think?

@@ -819,11 +815,6 @@ PRAGMA user_version=16;
0,
"ingested_records should be empty"
);
assert_eq!(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few code comments in this function (including the function's doc comment) talk about rs_cache so could you update those too please?

Comment on lines 88 to 89
// When #6607 lands, this will set the remote settings service.
// For now, it just exists so we can move consumers over to the new API ahead of time.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment can be removed but I'll defer to @bendk on if/how this patch breaks mobile consumers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this comment can be removed. Assuming we get mobile consumers using the new API from #6607 ahead of time, then there's no breakage.

}))
} else {
Err(Error::RemoteSettings(RemoteSettingsError::Other {
reason: "Unable to create client".to_string(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this makes sense to me and it's similar to how we return an error if data_path is none, right above. You could use that as an example to rewrite this without an if statement:

        let rs_service = inner.remote_settings_service.clone().ok_or_else(|| {
            Error::RemoteSettings(RemoteSettingsError::Other {
                reason: "Unable to create client".to_string(),
            })
        })?;
        Ok(Arc::new(SuggestStore {
            inner: SuggestStoreInner::new(
                data_path,
                extensions_to_load,
                SuggestRemoteSettingsClient::new(&rs_service)?,
            ),
        }))

I'll need to defer to @bendk if all this is kosher or not w/r/t Arc. SuggestRemoteSettingsClient should probably take an Arc<RemoteSettingsService> instead of &RemoteSettingsService, right? Like how SuggestStore does.

@tiftran tiftran requested a review from 0c0w3 February 27, 2025 18:47
Copy link
Contributor

@0c0w3 0c0w3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Tif!

@@ -578,8 +569,7 @@ where
// For each collection, fetch all records
for (collection, record_types) in record_types_by_collection {
breadcrumb!("Ingesting collection {}", collection.name());
let records =
write_scope.write(|dao| self.settings_client.get_records(collection, dao))?;
let records = write_scope.write(|_| self.settings_client.get_records(collection))?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

            let records = self.settings_client.get_records(collection)?;

No need for the write scope anymore since we don't need to write to the DB when getting records. (Sorry for not catching that before)

Comment on lines 844 to 849
let records = writer
.write(|dao| {
.write(|_| {
self.settings_client
.get_records(ingest_record_type.collection(), dao)
.get_records(ingest_record_type.collection())
})
.unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here too:

        let records = self
            .settings_client
            .get_records(ingest_record_type.collection())
            .unwrap();

Copy link
Contributor

@bendk bendk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes look good to me. Let's hold this back and not merge until phabricator.services.mozilla.com/D239603 is merged. I'll be following that one and try to keep you updated on it.

0c0w3 added a commit that referenced this pull request Mar 5, 2025
This is based on #6598

It uses the new remote settings collections and makes some other related
changes. Sorry for how big this is but I think it makes sense to make these
changes together. Details below.

We're setting up two new RS collections for geo expansion: `quicksuggest-amp`
and `quicksuggest-other`. The AMP collection will contain AMP suggestions and
icons for all user segments. The other collection will contain everything else
currently in the `quicksuggest` collection for all user segments. For both
collections, the client will use RS filter expressions to download the right
attachments for its user segment.

In the AMP collection, the record type will be `amp`, not `data` as it currently
is. The AMP collection will be updated often compared to the other one.

After discussion with Nan and Ben, we think it makes sense to move the client
over to the new collections. That's simpler than the alternative of hardcoding
branching collection logic or adding a way for consumers to pass in collection
overrides or something like that. We think it's unlikely we won't go forward
with geo expansion in some capacity, and even if we don't, the new collections
still have some benefits over the current single collection.

With the new collections, a record's collection isn't a function of its type
anymore. AMP icons will be in the AMP collection, but other icons will be in the
other collection. Currently Fakespot icons are in `quicksuggest` but this change
would let us move them to the Fakespot collection if we wanted.

I replaced `SuggestRecordType::collection` with some methods on
`SuggestionProvider`. There's a provider's primary collection and record type
and then some possible secondary collections/types, like icons and geonames.

I added some test helpers to create mock records given a provider so that tests
don't need to make assumptions about collections and record types. Most of the
test changes in `store.rs` are due to that.

Related to the previous point, I simplified `MockRemoteSettingsClient` and
introduced `MockRecord` and `MockAttachment` similar to the existing `MockIcon`.
The test helpers I mentioned in the previous point create mock records, and the
mock client interface is based on those.

I went ahead and replaced the `AmpWikipedia` record and type with separate `Amp`
and `Wikipedia` values. This is a good opportunity to do that, and it simplifies
`rs.rs`.

Mobile and tablet suggestions are just AMP suggestions that will live in
`quicksuggest-amp`, and consumers can ingest them by passing in an appropriate
RS context/filter, so I removed the AMP mobile provider and record type.
0c0w3 added a commit that referenced this pull request Mar 5, 2025
This is based on #6598

It uses the new remote settings collections and makes some other related
changes. Sorry for how big this is but I think it makes sense to make these
changes together. Details below.

We're setting up two new RS collections for geo expansion: `quicksuggest-amp`
and `quicksuggest-other`. The AMP collection will contain AMP suggestions and
icons for all user segments. The other collection will contain everything else
currently in the `quicksuggest` collection for all user segments. For both
collections, the client will use RS filter expressions to download the right
attachments for its user segment.

In the AMP collection, the record type will be `amp`, not `data` as it currently
is. The AMP collection will be updated often compared to the other one.

After discussion with Nan and Ben, we think it makes sense to move the client
over to the new collections. That's simpler than the alternative of hardcoding
branching collection logic or adding a way for consumers to pass in collection
overrides or something like that. We think it's unlikely we won't go forward
with geo expansion in some capacity, and even if we don't, the new collections
still have some benefits over the current single collection.

With the new collections, a record's collection isn't a function of its type
anymore. AMP icons will be in the AMP collection, but other icons will be in the
other collection. Currently Fakespot icons are in `quicksuggest` but this change
would let us move them to the Fakespot collection if we wanted.

I replaced `SuggestRecordType::collection` with some methods on
`SuggestionProvider`. There's a provider's primary collection and record type
and then some possible secondary collections/types, like icons and geonames.

I added some test helpers to create mock records given a provider so that tests
don't need to make assumptions about collections and record types. Most of the
test changes in `store.rs` are due to that.

Related to the previous point, I simplified `MockRemoteSettingsClient` and
introduced `MockRecord` and `MockAttachment` similar to the existing `MockIcon`.
The test helpers I mentioned in the previous point create mock records, and the
mock client interface is based on those.

I went ahead and replaced the `AmpWikipedia` record and type with separate `Amp`
and `Wikipedia` values. This is a good opportunity to do that, and it simplifies
`rs.rs`.

Mobile and tablet suggestions are just AMP suggestions that will live in
`quicksuggest-amp`, and consumers can ingest them by passing in an appropriate
RS context/filter, so I removed the AMP mobile provider and record type.
0c0w3 added a commit that referenced this pull request Mar 7, 2025
This is based on #6598

It uses the new remote settings collections and makes some other related
changes. Sorry for how big this is but I think it makes sense to make these
changes together. Details below.

We're setting up two new RS collections for geo expansion: `quicksuggest-amp`
and `quicksuggest-other`. The AMP collection will contain AMP suggestions and
icons for all user segments. The other collection will contain everything else
currently in the `quicksuggest` collection for all user segments. For both
collections, the client will use RS filter expressions to download the right
attachments for its user segment.

In the AMP collection, the record type will be `amp`, not `data` as it currently
is. The AMP collection will be updated often compared to the other one.

After discussion with Nan and Ben, we think it makes sense to move the client
over to the new collections. That's simpler than the alternative of hardcoding
branching collection logic or adding a way for consumers to pass in collection
overrides or something like that. We think it's unlikely we won't go forward
with geo expansion in some capacity, and even if we don't, the new collections
still have some benefits over the current single collection.

With the new collections, a record's collection isn't a function of its type
anymore. AMP icons will be in the AMP collection, but other icons will be in the
other collection. Currently Fakespot icons are in `quicksuggest` but this change
would let us move them to the Fakespot collection if we wanted.

I replaced `SuggestRecordType::collection` with some methods on
`SuggestionProvider`. There's a provider's primary collection and record type
and then some possible secondary collections/types, like icons and geonames.

I added some test helpers to create mock records given a provider so that tests
don't need to make assumptions about collections and record types. Most of the
test changes in `store.rs` are due to that.

Related to the previous point, I simplified `MockRemoteSettingsClient` and
introduced `MockRecord` and `MockAttachment` similar to the existing `MockIcon`.
The test helpers I mentioned in the previous point create mock records, and the
mock client interface is based on those.

I went ahead and replaced the `AmpWikipedia` record and type with separate `Amp`
and `Wikipedia` values. This is a good opportunity to do that, and it simplifies
`rs.rs`.

Mobile and tablet suggestions are just AMP suggestions that will live in
`quicksuggest-amp`, and consumers can ingest them by passing in an appropriate
RS context/filter, so I removed the AMP mobile provider and record type.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants