Commit 33b8b49

slevithan and antfu authored
feat: Use Oniguruma-To-ES in the JS engine (#828) (#832)
Co-authored-by: Anthony Fu <github@antfu.me>
1 parent 94cc6d8 commit 33b8b49

19 files changed, +252 -462 lines changed

.github/workflows/ci.yml

+2
@@ -65,6 +65,8 @@ jobs:
         node: [lts/*]
         os: [ubuntu-latest, windows-latest, macos-latest]
         include:
+          - node: 20.x
+            os: ubuntu-latest
           - node: 18.x
             os: ubuntu-latest
       fail-fast: false

docs/guide/regex-engines.md

+21-7
@@ -4,7 +4,7 @@ outline: deep
 
 # RegExp Engines
 
-TextMate grammars is based on regular expressions to match tokens. Usually, we use [Oniguruma](https://github.com/kkos/oniguruma) (a regular expression engine written in C) to parse the grammar. To make it work in JavaScript, we compile Oniguruma to WebAssembly to run in the browser or Node.js.
+TextMate grammars are based on regular expressions to match tokens. Usually, we use [Oniguruma](https://github.com/kkos/oniguruma) (a regular expression engine written in C) to parse the grammar. To make it work in JavaScript, we compile Oniguruma to WebAssembly to run in the browser or Node.js.
 
 Since v1.15, we expose the ability to for users to switch the RegExp engine and provide custom implementations.
 
@@ -20,7 +20,7 @@ const shiki = await createShiki({
 })
 ```
 
-Shiki come with two built-in engines:
+Shiki comes with two built-in engines:
 
 ## Oniguruma Engine
 
@@ -43,7 +43,7 @@ const shiki = await createShiki({
 This feature is experimental and may change without following semver.
 :::
 
-This experimental engine uses JavaScript's native RegExp. As TextMate grammars' regular expressions are in Oniguruma flavor that might contains syntaxes that are not supported by JavaScript's RegExp, we use [`oniguruma-to-js`](https://github.com/antfu/oniguruma-to-js) to lowering the syntaxes and try to make them compatible with JavaScript's RegExp.
+This engine uses JavaScript's native RegExp. As regular expressions used by TextMate grammars are written for Oniguruma, they might contain syntax that is not supported by JavaScript's RegExp, or expect different behavior for the same syntax. So we use [Oniguruma-To-ES](https://github.com/slevithan/oniguruma-to-es) to transpile Oniguruma patterns to native JavaScript RegExp.
 
 ```ts {2,4,9}
 import { createHighlighter } from 'shiki'
@@ -60,17 +60,31 @@ const shiki = await createHighlighter({
 const html = shiki.codeToHtml('const a = 1', { lang: 'javascript', theme: 'nord' })
 ```
 
-Please check the [compatibility table](/references/engine-js-compat) to check the support status of the languages you are using.
+Please check the [compatibility table](/references/engine-js-compat) for the support status of the languages you are using.
 
-If mismatches are acceptable and you want it to get results whatever it can, you can enable the `forgiving` option to suppress any errors happened during the conversion:
+Unlike the Oniguruma engine, the JavaScript engine is strict by default. It will throw an error if it encounters a pattern that it cannot convert. If mismatches are acceptable and you want best-effort results whenever possible, you can enable the `forgiving` option to suppress any errors that happened during the conversion:
 
 ```ts
 const jsEngine = createJavaScriptRegexEngine({ forgiving: true })
 // ...use the engine
 ```
 
 ::: info
-If you runs Shiki on Node.js (or at build time), we still recommend using the Oniguruma engine for the best result, as most of the time bundle size or WebAssembly support is not a concern.
+If you run Shiki on Node.js (or at build time) and bundle size or WebAssembly support is not a concern, we still recommend using the Oniguruma engine for the best result.
 
-The JavaScript engine is more suitable for running in the browser in some cases that you want to control the bundle size.
+The JavaScript engine is best when running in the browser and in cases when you want to control the bundle size.
 :::
+
+### JavaScript Runtime Target
+
+For the most accurate result, [Oniguruma-To-ES](https://github.com/slevithan/oniguruma-to-es) requires the [RegExp `v` flag support](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicodeSets), which is available in Node.js v20+ and ES2024 ([Browser compatibility](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicodeSets#browser_compatibility)).
+
+For older environments, it can simulate the behavior but `u` flag but might yield less accurate results.
+
+By default, it automatically detects the runtime target and uses the appropriate behavior. You can override this behavior by setting the `target` option:
+
+```ts
+const jsEngine = createJavaScriptRegexEngine({
+  target: 'ES2018', // or 'ES2024', default is 'auto'
+})
+```
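
The added "JavaScript Runtime Target" section says the engine detects `v` flag support on its own. For readers who want to see what such a check could look like before pinning `target` by hand, here is a minimal sketch. It is not the detection code used by Shiki or Oniguruma-To-ES: `supportsFlag` and `pickTarget` are hypothetical helpers, and the only assumption taken from the docs is that `'ES2018'` and `'ES2024'` are valid `target` values.

```ts
// Hypothetical helpers for illustration only; not part of Shiki or Oniguruma-To-ES.
type RegexTarget = 'ES2018' | 'ES2024'

// Constructing a RegExp with an unsupported flag throws a SyntaxError,
// so a try/catch is enough to probe for a flag such as `v`.
function supportsFlag(flag: string): boolean {
  try {
    new RegExp('', flag)
    return true
  }
  catch {
    return false
  }
}

// Map the probe result onto the two explicit targets named in the docs.
// The parameter is injectable so both branches can be exercised in tests.
function pickTarget(hasVFlag: boolean = supportsFlag('v')): RegexTarget {
  return hasVFlag ? 'ES2024' : 'ES2018'
}
```

The value returned by `pickTarget()` could then be passed as the `target` option shown in the diff above. In most cases leaving the default `'auto'` is simpler, since the docs state that detection already happens automatically.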