Commit a14745a

base38: tweak note re 4/2 and 4/4/2 notation
1 parent aeb35f0 commit a14745a

1 file changed

+17
-11
lines changed

doc/note/base38-and-fourcc.md

@@ -70,8 +70,9 @@ the same library). The conventional `uint32_t` packing is:
 - Bits `0 ..= 9` (10 bits) are the local enumeration value.
 
 For example:
-- [Quirk values](/doc/note/quirks.md) use this `((namespace << 10) | local)`
-scheme.
+
+- [Quirk keys](/doc/note/quirks.md), as a `uint32_t`, use this
+`((namespace << 10) | local)` scheme.
 - [Tokens](/doc/note/tokens.md) assign 21 out of 64 bits for a base38 value.
 
 
@@ -82,17 +83,22 @@ unused). 63 bits can therefore hold a 12-character or three 4-character strings
 (taken from base38's limited alphabet).
 
 For example, in a custom RPC protocol, the namespace/class/method name could be
-base38-encoded as a 4/4/4 string like `"net./conn/ping"`. As a number, this
-would be `((0x147150 << 42) | (0x0B7324 << 21) | 0x1633BD)` which is
-`0x51C5416E_649633BD`. At the wire format level, this would occupy a
-fixed-length (8 bytes) and that 64th bit could, for example, indicate request
-or response.
+base38-encoded as a 4/4/4 string like `"net./conn/ping"`. As a number, the
+4/4/4 format (instead of a monolithic 12) means that each 4-character fragment
+is base-38 encoded independently and the three 21-bit numbers are then combined
+(with bitshifting). `"net./conn/ping"` would be `((0x147150 << 42) | (0x0B7324
+<< 21) | 0x1633BD)` which is `0x51C5416E_649633BD`. At the wire format level,
+this would occupy a fixed-length (8 bytes) and that 64th bit could, for
+example, indicate request or response.
 
 A 2-character string can fit in 11 bits, as `38 ** 2 = 0x5A4 = 1444` is smaller
-than `2 ** 11 = 0x800 = 2048`. 53 bits can therefore hold a 10-character or
-4/4/2 alpha-numeric-ish string. 53 bits also fits snugly under JavaScript's
-`Number.MAX_SAFE_INTEGER` - these integers can be losslessly stored in a
-`double` or `float64_t`.
+than `2 ** 11 = 0x800 = 2048`. Therefore:
+
+- 32 bits (21 + 11) can hold a 6-character (as 4/2) alpha-numeric-ish string.
+32 bits obviously fits snugly in a `uint32_t`.
+- 53 bits (21 + 21 + 11) can hold a 10-character (as 4/4/2) alpha-numeric-ish
+string. 53 bits fits snugly under JavaScript's `Number.MAX_SAFE_INTEGER` so
+these integers can be losslessly stored in a `double` or a JSON value.
 
 [Enumerated Media Types](./enumerated-media-types.txt) uses this base38 4/4/2
 encoding, mapping `"image/jpeg"` to the base38 `"imag/jpeg/.."` which is
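
Review note: the arithmetic quoted in the second hunk can be checked with a short Python sketch. It is only an illustration. `pack_444` is a hypothetical helper name, not something defined in the repository, and the three per-fragment numbers are taken as given in the note rather than derived from characters, because the base38 alphabet is not shown in this diff.

```python
# Illustrative check of the numbers quoted in the note. pack_444 is a
# hypothetical helper for this review, not an API from the repository.

BASE38_4CHAR_BITS = 21  # 38 ** 4 fits in 21 bits
BASE38_2CHAR_BITS = 11  # 38 ** 2 fits in 11 bits


def pack_444(hi: int, mid: int, lo: int) -> int:
    """Combine three independently encoded 21-bit values into 63 bits."""
    for value in (hi, mid, lo):
        if not 0 <= value < 38 ** 4:
            raise ValueError("each fragment must be a 4-character base38 value")
    return (hi << 42) | (mid << 21) | lo


# The three per-fragment values given in the note for "net./conn/ping".
packed = pack_444(0x147150, 0x0B7324, 0x1633BD)
print(f"0x{packed:016X}")  # 0x51C5416E649633BD

# Bit budgets behind the 4/2 and 4/4/2 formats.
assert 38 ** 4 < 2 ** BASE38_4CHAR_BITS
assert 38 ** 2 < 2 ** BASE38_2CHAR_BITS
bits_4_2 = BASE38_4CHAR_BITS + BASE38_2CHAR_BITS        # 32
bits_4_4_2 = 2 * BASE38_4CHAR_BITS + BASE38_2CHAR_BITS  # 53

# Largest possible 4/4/2 value, compared against 2 ** 53 - 1, which is the
# value of JavaScript's Number.MAX_SAFE_INTEGER.
max_442 = ((38 ** 4 - 1) << 32) | ((38 ** 4 - 1) << 11) | (38 ** 2 - 1)
max_safe_integer = 2 ** 53 - 1
assert max_442 <= max_safe_integer
```

The packed value has bit 63 clear, which is consistent with the note's remark that the 64th bit remains free for something like a request/response flag.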
