Index Welsh "HMRC Contact" specialist documents #3226
Merged
We're migrating the HMRC Contacts Admin over to a general "HMRC Finder" Specialist Finder. HMRC need to be able to publish both English and Welsh specialist documents (of type `hmrc_contact`) to serve user needs.
Whilst the specialist document itself renders well with a locale of `cy` (Welsh page furniture and correct `lang` attribute set on the `<main>` HTML element), Welsh specialist documents aren't currently surfaced in the HMRC Finder because Search API rejects all non-English documents at the point of indexing. See related PR here: #1810
This means that the "Cymraeg / Welsh" facet we've added to the HMRC Finder (for feature parity with the Contacts Admin) doesn't work: it always returns 0 documents.
alphagov/specialist-publisher#3011
Non-English documents were omitted from Search API because:
That is no longer much of a concern in 2025, because site search uses search-api-v2, which is unaffected by this change. That being said, site search falls back to search-api (v1) if no query param is provided, so anyone looking at https://www.gov.uk/search/all?order=updated-newest could see lots of non-English content surfacing if we're not careful about how we make this change.
We're therefore scoping this change to just Welsh, and just the HMRC Contact specialist document type. There will only be a few of these documents, so the chances of them being unexpectedly surfaced outside of the HMRC Finder itself are very slim.
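To illustrate the scoping rule described above, here is a minimal sketch in Python. It is not the Search API implementation: the function name `should_index`, the `ALLOWED_NON_ENGLISH` set, and the dictionary shape of a document are all hypothetical, and treating a missing locale as English is an assumption made for the example.

```python
# Illustrative sketch only; names and document shape are hypothetical,
# not taken from Search API.

# (locale, document_type) pairs that may be indexed despite being non-English.
ALLOWED_NON_ENGLISH = {("cy", "hmrc_contact")}


def should_index(document: dict) -> bool:
    """Return True if the document should be accepted for indexing."""
    # Assumption: a document with no locale is treated as English.
    locale = document.get("locale", "en")
    if locale == "en":
        return True
    # Non-English documents are rejected unless explicitly allowed.
    return (locale, document.get("document_type")) in ALLOWED_NON_ENGLISH
```

Under this sketch, a Welsh `hmrc_contact` document is accepted, while a Welsh document of any other type, or an `hmrc_contact` document in any other non-English locale, is still rejected.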
A better solution would be to start indexing the locale of all documents, and then apply a locale filter in every frontend app that makes calls to Search API. But that's a pretty sizeable change, especially considering that we generally want to be moving to Search API v2. So this was considered a suitable stop-gap in the meantime.
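For comparison, the locale-filter approach could look roughly like the sketch below, which works on an in-memory list of documents rather than real Search API queries. The function `filter_by_locale` and the `locale` field are hypothetical; Search API does not currently index locale, which is why this remains a future option.

```python
# Illustrative sketch only; assumes every indexed document carries a
# hypothetical "locale" field, which Search API does not store today.

def filter_by_locale(documents: list[dict], locale: str = "en") -> list[dict]:
    """Keep only documents whose locale matches the requested one."""
    # Assumption: documents without a locale are treated as English.
    return [doc for doc in documents if doc.get("locale", "en") == locale]
```

With a filter like this applied by each caller, all locales could be indexed and each frontend would request only the locale it wants to show.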
We've logged the wider issue as a publishing tech debt card to revisit later: https://trello.com/c/ZzszTweH/
Trello for this work: https://trello.com/c/fXxLTdGk/