bytewords: replace match with precomputed perfect hash function
The two giant match statements with 256 arms incur linear
lookup cost. To avoid this, we define a simple family of hash
functions that are linear in the first and last character of
the encoded string (this lines up nicely with the fact that
those two characters are identical for the default and minimal
encodings). An exhaustive search over this family then yielded
a perfect hash function (one with no collisions) that only
needs a domain of roughly double the theoretical minimum
(which would be 256 elements).
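
To illustrate the search described above, here is a minimal sketch.
It assumes a family of the form `(a * first + b * last) % m`; the
commit message only says the functions are linear in the first and
last character, so the modulo reduction, the parameter names `a`,
`b`, `m`, and the helpers `hash`, `is_perfect` and `search` are all
hypothetical and not taken from the actual change.

```rust
/// Hypothetical hash family: linear in the first and last byte of
/// `word`, reduced modulo the table size `m`. Panics on an empty word.
fn hash(a: usize, b: usize, m: usize, word: &str) -> usize {
    let bytes = word.as_bytes();
    let first = bytes[0] as usize;
    let last = bytes[bytes.len() - 1] as usize;
    (a * first + b * last) % m
}

/// Returns true if no two words land in the same slot.
fn is_perfect(a: usize, b: usize, m: usize, words: &[&str]) -> bool {
    let mut seen = vec![false; m];
    for w in words {
        let h = hash(a, b, m, w);
        if seen[h] {
            return false; // collision
        }
        seen[h] = true;
    }
    true
}

/// Exhaustive search: try table sizes from the theoretical minimum
/// (`words.len()`) up to `max_m`, and every coefficient pair for each
/// size. Returns the first collision-free `(a, b, m)` found.
fn search(words: &[&str], max_m: usize) -> Option<(usize, usize, usize)> {
    for m in words.len()..=max_m {
        for a in 0..m {
            for b in 0..m {
                if is_perfect(a, b, m, words) {
                    return Some((a, b, m));
                }
            }
        }
    }
    None
}
```

Calling `search` with a few sample words such as `"able"`, `"acid"`,
`"also"` and `"apex"` returns the smallest table size in this family
that separates them. The real search would run over all 256 words.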

Other families of hash functions might theoretically yield
smaller domains, but we don't judge this to be a relevant
optimization for now.

This design achieves the following goals:

 - Constant lookup time at runtime.

 - `no_std` compliance, i.e. no `OnceLock`s and no
   `HashMap`s.

 - No `unsafe` code.
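
A rough sketch of how these three goals can be met together: the
table is built by a `const fn` at compile time, so it needs neither
lazy initialization nor heap allocation, and uses only items from
`core`. The eight-word list, the parameters `A`, `B`, `M`, and the
names `slot`, `build_table` and `decode_word` are assumptions made
for illustration; the actual change covers all 256 words and may be
structured differently.

```rust
// Sample words only; index i stands in for byte value i here.
const WORDS: [&str; 8] = [
    "able", "acid", "also", "apex", "aqua", "arch", "atom", "aunt",
];

// Hypothetical parameters, chosen so the sample words do not collide.
const A: usize = 1;
const B: usize = 1;
const M: usize = 24;

/// Hash that is linear in the first and last byte, reduced modulo M.
const fn slot(first: u8, last: u8) -> usize {
    (A * first as usize + B * last as usize) % M
}

/// Builds the lookup table at compile time. A collision aborts
/// compilation, so a non-perfect hash cannot ship unnoticed.
const fn build_table() -> [Option<u8>; M] {
    let mut table = [None; M];
    let mut i = 0;
    while i < WORDS.len() {
        let w = WORDS[i].as_bytes();
        let s = slot(w[0], w[w.len() - 1]);
        assert!(table[s].is_none(), "hash collision");
        table[s] = Some(i as u8);
        i += 1;
    }
    table
}

static TABLE: [Option<u8>; M] = build_table();

/// Constant-time lookup: one hash, one table read, one comparison.
pub fn decode_word(word: &str) -> Option<u8> {
    let w = word.as_bytes();
    let (first, last) = (*w.first()?, *w.last()?);
    let idx = TABLE[slot(first, last)]?;
    // A hit in the table does not prove membership, so the full
    // word is compared before the index is returned.
    if WORDS[idx as usize] == word {
        Some(idx)
    } else {
        None
    }
}
```

The final comparison matters because unrelated input can hash to an
occupied slot; without it, `decode_word("axxe")` would be accepted
as `"able"`.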

The recently introduced decoding test [1] suggests a performance
improvement of 66%.

[1] bbf09e4
dspicher committed Feb 6, 2025
1 parent bbf09e4 commit c7fd1ae
Showing 2 changed files with 94 additions and 532 deletions.