docs: include Algolia DocSearch ADR and crawler config (#2991)
1 parent 6927788
commit 35bfa7d
Showing 3 changed files with 175 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
# Paragon | Algolia DocSearch | ||
|
||
This module contains the Algolia DocSearch crawler configuration (``crawler-config.js``), which controls how the Algolia site crawler indexes content from the Paragon documentation website into the Algolia index. | ||
|
||
Any revisions to the configuration in this module must also be persisted upstream in the Algolia DocSearch crawler configuration. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,136 @@ | ||
/* eslint-disable */ | ||
|
||
// README: Any update to the Algolia DocSearch crawler configuration here must also be applied | ||
// in the Algolia DocSearch crawler editor (https://crawler.algolia.com/). Otherwise, changes to this persisted | ||
// configuration will not take effect for the Paragon documentation website. | ||
|
||
// Note: the Algolia `appId` and `apiKey` values below are REDACTED. The real values should not be committed to the | ||
// repository, but they must be set in the crawler configuration in the Algolia DocSearch crawler editor. | ||
|
||
new Crawler({ | ||
rateLimit: 8, | ||
startUrls: ["https://paragon-openedx.netlify.app/"], | ||
renderJavaScript: false, | ||
sitemaps: [], | ||
ignoreCanonicalTo: false, | ||
discoveryPatterns: ["https://paragon-openedx.netlify.app/**"], | ||
schedule: "every 1 day", | ||
actions: [ | ||
{ | ||
indexName: "paragon-openedx", | ||
pathsToMatch: [ | ||
"https://paragon-openedx.netlify.app/**", | ||
"!https://paragon-openedx.netlify.app/insights/", | ||
"!https://paragon-openedx.netlify.app/status/", | ||
"!https://paragon-openedx.netlify.app/changelog/", | ||
], | ||
recordExtractor: ({ helpers, url, $ }) => { | ||
const category = url.pathname.split("/")[1] || "Documentation"; | ||
return helpers.docsearch({ | ||
recordProps: { | ||
// lvl1: ["header h1", "article h1", "main h1", "h1", "head > title"], | ||
lvl1: ["main h1"], | ||
lvl0: { | ||
selectors: "", | ||
defaultValue: | ||
category.charAt(0).toUpperCase() + category.slice(1), | ||
}, | ||
lvl2: ["main h2"], | ||
lvl3: ["article h3", "main h3", "h3"], | ||
lvl4: ["article h4", "main h4", "h4"], | ||
lvl5: ["article h5", "main h5", "h5"], | ||
lvl6: ["article h6", "main h6", "h6"], | ||
content: ["article p, article li", "main p, main li", "p, li"], | ||
}, | ||
aggregateContent: true, | ||
}); | ||
}, | ||
}, | ||
], | ||
initialIndexSettings: { | ||
"paragon-openedx": { | ||
attributesForFaceting: ["type", "lang"], | ||
attributesToRetrieve: [ | ||
"hierarchy", | ||
"content", | ||
"anchor", | ||
"url", | ||
"url_without_anchor", | ||
"type", | ||
], | ||
attributesToHighlight: ["hierarchy", "hierarchy_camel", "content"], | ||
attributesToSnippet: ["content:10"], | ||
camelCaseAttributes: ["hierarchy", "hierarchy_radio", "content"], | ||
searchableAttributes: [ | ||
"unordered(hierarchy_radio_camel.lvl0)", | ||
"unordered(hierarchy_radio.lvl0)", | ||
"unordered(hierarchy_radio_camel.lvl1)", | ||
"unordered(hierarchy_radio.lvl1)", | ||
"unordered(hierarchy_radio_camel.lvl2)", | ||
"unordered(hierarchy_radio.lvl2)", | ||
"unordered(hierarchy_radio_camel.lvl3)", | ||
"unordered(hierarchy_radio.lvl3)", | ||
"unordered(hierarchy_radio_camel.lvl4)", | ||
"unordered(hierarchy_radio.lvl4)", | ||
"unordered(hierarchy_radio_camel.lvl5)", | ||
"unordered(hierarchy_radio.lvl5)", | ||
"unordered(hierarchy_radio_camel.lvl6)", | ||
"unordered(hierarchy_radio.lvl6)", | ||
"unordered(hierarchy_camel.lvl0)", | ||
"unordered(hierarchy.lvl0)", | ||
"unordered(hierarchy_camel.lvl1)", | ||
"unordered(hierarchy.lvl1)", | ||
"unordered(hierarchy_camel.lvl2)", | ||
"unordered(hierarchy.lvl2)", | ||
"unordered(hierarchy_camel.lvl3)", | ||
"unordered(hierarchy.lvl3)", | ||
"unordered(hierarchy_camel.lvl4)", | ||
"unordered(hierarchy.lvl4)", | ||
"unordered(hierarchy_camel.lvl5)", | ||
"unordered(hierarchy.lvl5)", | ||
"unordered(hierarchy_camel.lvl6)", | ||
"unordered(hierarchy.lvl6)", | ||
"content", | ||
], | ||
distinct: true, | ||
attributeForDistinct: "url", | ||
customRanking: [ | ||
"desc(weight.pageRank)", | ||
"desc(weight.level)", | ||
"asc(weight.position)", | ||
], | ||
ranking: [ | ||
"words", | ||
"filters", | ||
"typo", | ||
"attribute", | ||
"proximity", | ||
"exact", | ||
"custom", | ||
], | ||
highlightPreTag: '<span class="algolia-docsearch-suggestion--highlight">', | ||
highlightPostTag: "</span>", | ||
minWordSizefor1Typo: 3, | ||
minWordSizefor2Typos: 7, | ||
allowTyposOnNumericTokens: false, | ||
minProximity: 1, | ||
ignorePlurals: true, | ||
advancedSyntax: true, | ||
attributeCriteriaComputedByMinProximity: true, | ||
removeWordsIfNoResults: "allOptional", | ||
}, | ||
}, | ||
appId: "", // REDACTED | ||
apiKey: "", // REDACTED | ||
extraUrls: [ | ||
"https://paragon-openedx.netlify.app/foundations/colors", | ||
"https://paragon-openedx.netlify.app/foundations/elevation", | ||
"https://paragon-openedx.netlify.app/foundations/typography", | ||
"https://paragon-openedx.netlify.app/foundations/css-utilities", | ||
"https://paragon-openedx.netlify.app/foundations/responsive", | ||
"https://paragon-openedx.netlify.app/foundations/brand-icons", | ||
"https://paragon-openedx.netlify.app/guides/installation-and-usage", | ||
"https://paragon-openedx.netlify.app/tools/component-generator", | ||
"https://paragon-openedx.netlify.app/playground", | ||
], | ||
}); |
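The `recordExtractor` above derives the `lvl0` default from the first path segment of each crawled URL. The following is a minimal standalone sketch of that derivation, runnable in Node.js; `deriveCategory` is a hypothetical helper name used only for illustration and does not appear in the crawler configuration.

```javascript
// Hypothetical helper mirroring how the recordExtractor builds the lvl0 default.
// It is not part of the Algolia crawler API or the Paragon repository.
function deriveCategory(pageUrl) {
  const url = new URL(pageUrl);
  // "/foundations/colors" splits into ["", "foundations", "colors"], so index 1
  // is the first path segment. The site root yields an empty string, which
  // falls back to "Documentation".
  const category = url.pathname.split("/")[1] || "Documentation";
  // Capitalize the first letter, as the configuration does for lvl0.
  return category.charAt(0).toUpperCase() + category.slice(1);
}

console.log(deriveCategory("https://paragon-openedx.netlify.app/foundations/colors"));
console.log(deriveCategory("https://paragon-openedx.netlify.app/"));
```

Under this logic, every page below `/foundations/` shares the `lvl0` value `Foundations`, while the site root falls back to `Documentation`.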
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
20. Adopting and maintaining Algolia DocSearch | ||
---------------------------------------------- | ||
|
||
Status | ||
------ | ||
|
||
Accepted | ||
|
||
Context | ||
------- | ||
|
||
The Paragon documentation website (https://paragon-openedx.netlify.app/), hosted on Netlify, is used by designers and engineers to understand and use the capabilities provided by the Paragon design system and React component library. Helping consumers of Paragon efficiently find the content they need is critical to making the Paragon design system easy to use and adopt. | ||
|
||
Because the website does not formally support search, Paragon consumers generally need to rely on native browser search capabilities (e.g., ``Cmd + F``). To make content easier to discover, we would like to support search functionality on the Paragon documentation website in a lightweight and low-maintenance way. | ||
|
||
Decision | ||
-------- | ||
|
||
We will adopt Algolia DocSearch (https://docsearch.algolia.com/), a free tool provided to open-source projects that have documentation needs. Algolia DocSearch provides a configurable crawler that parses the Paragon documentation website on a regular schedule and indexes the content for search in an Algolia index. Algolia DocSearch also provides a search UI widget that can be embedded in the Paragon documentation website to provide search functionality. | ||
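As a rough illustration of embedding the search UI widget, the sketch below builds the options object that the `docsearch` function from the `@docsearch/js` package accepts. The `buildDocSearchOptions` helper and the `#docsearch` container selector are assumptions made for this example, and `apiKey` is assumed to be a search-only key; only the index name `paragon-openedx` comes from the crawler configuration in this commit.

```javascript
// Hypothetical helper: assembles options for the DocSearch widget.
// The container selector is an assumption; the index name matches the
// `indexName` used in the crawler configuration.
function buildDocSearchOptions({ appId, apiKey }) {
  if (!appId || !apiKey) {
    throw new Error("appId and apiKey are required");
  }
  return {
    container: "#docsearch",
    indexName: "paragon-openedx",
    appId,
    apiKey, // assumed to be a search-only key, never an admin key
  };
}

// In a browser bundle, the options would be passed to the widget, e.g.:
//   import docsearch from "@docsearch/js";
//   docsearch(buildDocSearchOptions({ appId: "...", apiKey: "..." }));
```

Keeping the option assembly in a plain function makes it straightforward to supply the credentials from environment-specific configuration rather than committing them.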
|
||
While the Algolia DocSearch crawler is configurable, its configuration is not publicly visible because it must be applied within the Algolia DocSearch crawler editor itself. To mitigate this, we will persist the crawler configuration code in the Paragon repository so that it can be easily referenced and updated as needed. When the crawler configuration is updated in the Paragon repository, it should also be updated in the Algolia DocSearch crawler editor. | ||
|
||
Consequences | ||
------------ | ||
|
||
* Because the Algolia DocSearch crawler configuration is persisted in the Paragon repository, it may be updated there without the same change being applied in the Algolia DocSearch crawler editor. The configuration in the crawler editor would then be out of sync with the Paragon repository. To mitigate this, we will document the process for updating the crawler configuration in both the Paragon repository and the Algolia DocSearch crawler editor. | ||
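One possible way to detect the drift described above is to compare the repository copy of the configuration with a copy pasted from the crawler editor. The sketch below is an assumption about how such a check could look, not a process defined by this ADR; `normalizeConfig` and `configsMatch` are hypothetical names, and the comparison is purely textual.

```javascript
// Hypothetical drift check; neither function exists in the Paragon repository.
function normalizeConfig(source) {
  return source
    .split("\n")
    .map((line) => line.trim())
    // Drop blank lines and lines that start a comment ("//" or "/*").
    .filter((line) => line !== "" && !line.startsWith("//") && !line.startsWith("/*"))
    // Blank out credential values so the redacted repository copy can match
    // the editor copy, which holds the real appId and apiKey.
    .map((line) => line.replace(/^(appId|apiKey):\s*"[^"]*",?.*$/, '$1: "<redacted>",'))
    .join("\n");
}

// Returns true when the two copies are identical after normalization.
function configsMatch(repoCopy, editorCopy) {
  return normalizeConfig(repoCopy) === normalizeConfig(editorCopy);
}
```

Because the check compares text line by line, formatting differences such as reflowed arrays would be reported as drift; it is meant as a reminder to resync, not as a semantic comparison.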
|
||
Resources | ||
--------- | ||
|
||
* https://www.algolia.com/ | ||
* https://docsearch.algolia.com/ | ||
* https://crawler.algolia.com/ | ||
* https://dashboard.algolia.com/ |