v0.0.4 Add bytcase package
This release adds the bytcase package, consolidates tests, fixes the
handling of invalid UTF-8 encodings, and fixes CPU feature detection
on AMD64.
With this release the strcase and bytcase packages are functionally complete
and sufficiently tested to be considered accurate, if not authoritative.
The only planned changes from here will be to improve the performance of the
strcase and bytcase packages and to add functions for ASCII-only matching.
commit 40d8345
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Thu Sep 12 00:14:37 2024 -0400
all: add vim modeline to Makefiles
commit e8b7353
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Wed Sep 11 23:26:15 2024 -0400
gh: fixup make targets
commit 38c6b34
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Wed Sep 11 22:41:26 2024 -0400
gh: run exhaustive tests only on the latest version
This commit updates the GH Workflows to only run the exhaustive test
suite on the latest version of Go. Older versions now only run the
faster test suite.
commit b5ec23d
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Wed Sep 11 22:22:45 2024 -0400
scripts/test-benchmarks.bash: format
commit 448bdbf
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Wed Sep 11 22:04:05 2024 -0400
README: add codecov badge
commit 1a92cfb
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Wed Sep 11 21:53:28 2024 -0400
gh: add codecov workflow
commit eb16a13
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Wed Sep 11 21:37:41 2024 -0400
internal/bytealg: fix amd64 cpu feature detection
This changes bytealg to actually use the golang.org/x/sys/cpu package to
detect CPU features, which resolves an issue where POPCNT was not
detected when GOAMD64=v1.
commit 0193c0b
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date: Thu Sep 5 07:28:15 2024 +0000
build(deps): bump golang.org/x/sys from 0.16.0 to 0.25.0
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.16.0 to 0.25.0.
- [Commits](https://github.com/golang/sys/compare/v0.16.0...v0.25.0)
---
updated-dependencies:
- dependency-name: golang.org/x/sys
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
commit 1c9089b
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Tue Sep 10 22:47:16 2024 -0400
all: document bytcase package
commit c3fc3c2
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Tue Sep 10 21:20:35 2024 -0400
all: update golangci-lint to v1.61.0
commit 319787f
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Tue Sep 10 21:19:55 2024 -0400
all: cleanup Makefiles
commit 8de86b8
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Tue Sep 10 20:43:14 2024 -0400
scripts: add go-vet-harder script
commit b823423
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Mon Sep 9 14:34:35 2024 -0400
scripts/test-benchmarks: test benchmarks in parallel
commit e2f3faf
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Sat Sep 7 19:32:01 2024 -0400
all: fix Index for invalid UTF-8-encoded runes
This fixes Index to handle invalid UTF-8 encodings. Previously, not all
invalid UTF-8 sequences were considered equal - this commit fixes that.
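As a rough illustration of why invalid encodings need special care (a sketch using only the standard library, not the package's actual code): Go's unicode/utf8 decodes every invalid byte to utf8.RuneError with a width of 1, so a rune-wise comparison treats distinct invalid bytes as the same rune even though the strings differ byte-for-byte. The helper name equalRunes is hypothetical.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// equalRunes is a hypothetical helper that compares two strings rune by rune.
// Every invalid UTF-8 byte decodes to utf8.RuneError, so invalid bytes
// compare equal to each other here.
func equalRunes(a, b string) bool {
	for len(a) > 0 && len(b) > 0 {
		ra, na := utf8.DecodeRuneInString(a)
		rb, nb := utf8.DecodeRuneInString(b)
		if ra != rb {
			return false
		}
		a, b = a[na:], b[nb:]
	}
	// Equal only if both strings were fully consumed.
	return len(a) == 0 && len(b) == 0
}

func main() {
	x, y := "a\xffb", "a\xfeb" // two different invalid bytes
	fmt.Println(x == y)           // byte-wise comparison: false
	fmt.Println(equalRunes(x, y)) // rune-wise comparison: true
}
```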
commit 333e48a
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Thu Sep 5 15:21:28 2024 -0400
strcase,bytcase: add bytcase package and consolidate tests
commit be291c0
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Wed Aug 28 15:44:24 2024 -0400
internal/gen/gentables: run 'go build/test' on all packages
This is necessary now that we've moved the tables to their own package
and added the bytcase package.
commit 161bd95
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Thu Aug 22 20:05:47 2024 -0400
gh: cleanup workflows
commit eb24309
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Thu Aug 22 19:58:22 2024 -0400
gh: fix test-generate.yml
commit 1cde62c
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Thu Aug 22 12:33:33 2024 -0400
strcase: fix misspelling in fuzz test
commit 3d15daa
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Thu Aug 22 12:12:08 2024 -0400
gh: run tests on 386
commit bc22cb4
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Thu Aug 22 12:01:46 2024 -0400
gh: run tests on macos arm64
Test on arm64 since we have arm64 assembly that needs to be exercised in
tests.
commit fa3952e
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Thu Aug 22 11:46:16 2024 -0400
gh: fix workflows and run golangci-lint on the gen package
commit d18647c
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Thu Aug 22 11:23:01 2024 -0400
internal/gen/gentables: check output of filepath.WalkDir
commit 711ba34
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Thu Aug 22 11:14:16 2024 -0400
gh: run more tests and run them individually
commit 02d0e91
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Thu Aug 22 11:04:55 2024 -0400
strcase/bytcase: don't fold LATIN SMALL LETTER DOTLESS I
Change code to not fold 'ı' (LATIN SMALL LETTER DOTLESS I). The issue
here is that 'ı' and 'İ' have an upper/lower case, but no folds and we
were incorrectly using the upper case form of 'ı' in comparisons.
TODO: change the generation logic to exclude 'ı' and 'İ' from the
upper/lower case table.
commit c794736
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Wed Aug 21 17:23:39 2024 -0400
gh: test on go1.23
commit 166ae75
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Wed Aug 21 15:12:13 2024 -0400
strcase: test that benchmarks pass in CI
commit c20535d
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Wed Aug 21 14:39:15 2024 -0400
strcase: add more Count benchmarks
commit 7e66f9a
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Wed Aug 21 14:38:45 2024 -0400
strcase: fix BenchmarkLastIndexAnyUTF8
commit c5b0bb4
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Wed Aug 21 14:37:56 2024 -0400
strcase: fix BenchmarkIndexRuneTorture_Bytes
commit cafbcd8
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Wed Aug 21 00:43:05 2024 -0400
all: cleanup common.mk
commit 9064b75
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Wed Aug 21 00:42:40 2024 -0400
bytcase: cleanup Rabin-Karp and test UnicodeVersion
commit 2b82978
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Wed Aug 21 00:41:56 2024 -0400
internal/test: don't export CountTests
commit feef8b8
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Tue Aug 20 23:57:31 2024 -0400
strcase/bytcase: consolidate tests
commit 07bfdcd
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Tue Aug 20 23:40:16 2024 -0400
strcase/bytcase: initial implementation of bytcase package
TODO: unify tests and add benchmarks
commit cd166f6
Merge: 315f6c3 2340040
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Thu Aug 22 09:36:53 2024 -0400
Merge pull request #9 from charlievieth/dependabot/github_actions/actions/checkout-4
build(deps): bump actions/checkout from 3 to 4
commit 315f6c3
Merge: 16d54c8 4c3bcc3
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Thu Aug 22 09:36:41 2024 -0400
Merge pull request #10 from charlievieth/dependabot/github_actions/github/codeql-action-3
build(deps): bump github/codeql-action from 2 to 3
commit 16d54c8
Merge: fe6357f 90fd147
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Thu Aug 22 09:36:31 2024 -0400
Merge pull request #11 from charlievieth/dependabot/github_actions/actions/setup-go-5
build(deps): bump actions/setup-go from 4 to 5
commit fe6357f
Merge: 0970b2a 40a93fd
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Thu Aug 22 09:36:01 2024 -0400
Merge pull request #12 from charlievieth/dependabot/github_actions/golangci/golangci-lint-action-6
build(deps): bump golangci/golangci-lint-action from 3 to 6
commit 40a93fd
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date: Thu Aug 22 07:12:59 2024 +0000
build(deps): bump golangci/golangci-lint-action from 3 to 6
Bumps [golangci/golangci-lint-action](https://github.com/golangci/golangci-lint-action) from 3 to 6.
- [Release notes](https://github.com/golangci/golangci-lint-action/releases)
- [Commits](https://github.com/golangci/golangci-lint-action/compare/v3...v6)
---
updated-dependencies:
- dependency-name: golangci/golangci-lint-action
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
commit 90fd147
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date: Thu Aug 22 07:12:58 2024 +0000
build(deps): bump actions/setup-go from 4 to 5
Bumps [actions/setup-go](https://github.com/actions/setup-go) from 4 to 5.
- [Release notes](https://github.com/actions/setup-go/releases)
- [Commits](https://github.com/actions/setup-go/compare/v4...v5)
---
updated-dependencies:
- dependency-name: actions/setup-go
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
commit 4c3bcc3
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date: Thu Aug 22 07:12:55 2024 +0000
build(deps): bump github/codeql-action from 2 to 3
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 2 to 3.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](https://github.com/github/codeql-action/compare/v2...v3)
---
updated-dependencies:
- dependency-name: github/codeql-action
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
commit 2340040
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date: Thu Aug 22 07:12:52 2024 +0000
build(deps): bump actions/checkout from 3 to 4
Bumps [actions/checkout](https://github.com/actions/checkout) from 3 to 4.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v3...v4)
---
updated-dependencies:
- dependency-name: actions/checkout
dependency-type: direct:production
update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
commit 0970b2a
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Tue Aug 20 13:50:17 2024 -0400
strcase: re-org Makefile so that 'make' runs the all target
commit b93da81
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Tue Aug 20 12:56:04 2024 -0400
strcase: add TestIndexNumeric
Add TestIndexNumeric to make sure that our use of bytealg.IndexString is
correct.
commit b5c9ed7
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Tue Aug 20 12:55:26 2024 -0400
internal/bytealg: remove unused MaxLen to fix go1.23 amd64 build
commit f304d44
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Fri Aug 16 18:17:41 2024 -0400
internal/gen: ignore DATA directory
commit 61de6d9
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Fri Aug 16 18:15:54 2024 -0400
all: update golangci-lint version and config
This also changes the Makefiles to use the config when running
golangci-lint instead of passing all arguments on the command line.
commit d784660
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Fri Aug 16 18:06:43 2024 -0400
internal/gen/util: fix GenTablesRoot test
commit 8403f63
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Fri Aug 16 15:06:12 2024 -0400
gen: print usage when "-help" flag is provided
commit 0e37f64
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Fri Aug 16 14:52:15 2024 -0400
internal/gen/util: fix root tables path
commit 9433139
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Thu Aug 15 23:13:26 2024 -0400
internal/gen/gentables: update comments
commit 09f6367
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Thu Aug 15 23:08:22 2024 -0400
internal/gen/gentables: hash all relative go source files
commit 9761d95
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Thu Aug 15 22:38:27 2024 -0400
internal/tables: add copyright
commit 982ae24
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Thu Aug 15 22:36:48 2024 -0400
tests: remove dead code from fuzz test
commit 4a6b49b
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Thu Aug 15 22:35:56 2024 -0400
tests: remove bruteForceIndexUnicode tests
commit 9f83037
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Thu Aug 15 22:34:35 2024 -0400
tests: use regexp to verify generated tests
Use a case-insensitive regex to verify the generated tests when we don't
have a match. This is slow, but accurate and helps us identify when the
test generation logic is broken.
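A slow reference check of this kind might look like the sketch below. It assumes valid UTF-8 input (regexp.Compile rejects invalid UTF-8 in patterns), and regexpIndex is a hypothetical name, not a function from the test suite.

```go
package main

import (
	"fmt"
	"regexp"
)

// regexpIndex is a hypothetical reference implementation of a
// case-insensitive Index. It quotes substr so that it is matched literally
// and uses the (?i) flag for case-insensitive matching.
func regexpIndex(s, substr string) (int, error) {
	re, err := regexp.Compile("(?i)" + regexp.QuoteMeta(substr))
	if err != nil {
		return -1, err
	}
	loc := re.FindStringIndex(s)
	if loc == nil {
		return -1, nil
	}
	return loc[0], nil
}

func main() {
	i, err := regexpIndex("Hello, WORLD", "world")
	fmt.Println(i, err)
}
```

Compiling a pattern per lookup is expensive, which is why a check like this only makes sense as a fallback when the fast path reports no match.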
commit 5b5b84f
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Thu Aug 15 22:33:25 2024 -0400
strcase: directly call bytealg.IndexString
commit 7a636d1
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Thu Aug 15 22:27:52 2024 -0400
all: move Unicode tables to internal/tables
This is necessary to share the generated tables between the strcase
package and the yet-to-be-created bytcase package.
commit 9c5f28b
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Sun Aug 11 14:11:04 2024 -0500
internal/benchtest: exclude IndexNonASCII from benchstat result
By default, exclude the IndexNonASCII benchmarks in the benchstat result
produced by the release target since they distort the overall delta.
commit 9f96d95
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Sun Aug 11 13:37:56 2024 -0500
internal/benchtest/Makefile: fix variable passing in release target
Previously, for the "release" target we shelled out to make so that we
could tee the output to a file, which makes passing any variables a pain
(since we need to pass all of them when invoking the make child
process). This commit changes the Makefile so that the existing targets
can conditionally tee to a file - thus we don't have to shell out.
commit 47e4394
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Sun Aug 11 13:35:49 2024 -0500
internal/benchtest: set the number of bytes processed in some benchmarks
Reporting the number of bytes processed makes it easier to understand
the relative performance of different functions / makes it possible to
understand the processing speed of our search functions.
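For reference, this is done with testing.B.SetBytes. The sketch below is illustrative only: benchIndex and haystack are hypothetical names, and it uses strings.Index rather than any function from this repository.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// haystack is a hypothetical input: 4096 filler bytes followed by the needle.
var haystack = strings.Repeat("a", 4096) + "b"

// benchIndex is a hypothetical benchmark. SetBytes records how many bytes a
// single iteration processes, which lets the testing package report
// throughput in addition to time per operation.
func benchIndex(b *testing.B) {
	b.SetBytes(int64(len(haystack)))
	for i := 0; i < b.N; i++ {
		if strings.Index(haystack, "b") < 0 {
			b.Fatal("needle not found")
		}
	}
}

func main() {
	// testing.Benchmark runs a benchmark function outside of 'go test'.
	res := testing.Benchmark(benchIndex)
	fmt.Println(res) // includes a MB/s figure because SetBytes was called
}
```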
commit 994f244
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Fri Aug 9 15:27:31 2024 -0500
internal/benchtest: add new benchmark
```
goos: darwin
goarch: arm64
pkg: github.com/charlievieth/strcase/internal/benchtest
│ old.txt │ new.txt │
│ sec/op │ sec/op vs base │
IndexRune-10 12.09n ± 0% 11.08n ± 1% -8.35% (p=0.002 n=6)
IndexRuneLongString-10 13.38n ± 1% 12.59n ± 1% -5.87% (p=0.002 n=6)
IndexRuneFastPath-10 3.038n ± 5% 5.067n ± 5% +66.78% (p=0.002 n=6)
Index-10 3.148n ± 1% 5.072n ± 2% +61.09% (p=0.002 n=6)
LastIndex-10 3.509n ± 0% 6.630n ± 1% +88.92% (p=0.002 n=6)
IndexByte-10 2.432n ± 0% 3.650n ± 1% +50.08% (p=0.002 n=6)
EqualFold/ASCII-10 9.268n ± 1% 9.340n ± 1% ~ (p=0.132 n=6)
EqualFold/UnicodePrefix-10 80.05n ± 0% 31.62n ± 1% -60.51% (p=0.002 n=6)
EqualFold/UnicodeSuffix-10 73.67n ± 0% 25.62n ± 4% -65.22% (p=0.002 n=6)
IndexHard1-10 327.6µ ± 1% 333.1µ ± 2% +1.68% (p=0.015 n=6)
IndexHard2-10 330.3µ ± 0% 1323.5µ ± 5% +300.73% (p=0.002 n=6)
IndexHard3-10 359.7µ ± 1% 1330.6µ ± 1% +269.88% (p=0.002 n=6)
IndexHard4-10 1.323m ± 1% 1.326m ± 1% ~ (p=0.937 n=6)
LastIndexHard1-10 1.318m ± 1% 1.333m ± 3% +1.11% (p=0.002 n=6)
LastIndexHard2-10 1.310m ± 0% 1.326m ± 1% +1.17% (p=0.002 n=6)
LastIndexHard3-10 1.311m ± 0% 1.324m ± 1% +0.97% (p=0.041 n=6)
CountHard1-10 328.9µ ± 0% 329.8µ ± 1% ~ (p=0.485 n=6)
CountHard2-10 329.7µ ± 0% 1311.9µ ± 2% +297.91% (p=0.002 n=6)
CountHard3-10 356.3µ ± 0% 1310.8µ ± 4% +267.92% (p=0.002 n=6)
IndexTorture-10 9.863µ ± 0% 17.500µ ± 1% +77.44% (p=0.002 n=6)
CountTorture-10 9.993µ ± 3% 19.997µ ± 1% +100.11% (p=0.002 n=6)
CountTortureOverlapping-10 67.40µ ± 2% 4026.87µ ± 1% +5874.81% (p=0.002 n=6)
CountByte/10-10 6.633n ± 1% 7.546n ± 1% +13.77% (p=0.002 n=6)
CountByte/32-10 3.200n ± 0% 4.080n ± 2% +27.50% (p=0.002 n=6)
CountByte/4K-10 82.03n ± 0% 99.56n ± 2% +21.38% (p=0.002 n=6)
CountByte/4M-10 83.48µ ± 0% 92.69µ ± 1% +11.03% (p=0.002 n=6)
CountByte/64M-10 1.385m ± 1% 1.522m ± 2% +9.92% (p=0.002 n=6)
IndexAnyASCII/1:1-10 4.159n ± 0% 5.423n ± 1% +30.39% (p=0.002 n=6)
IndexAnyASCII/1:2-10 5.461n ± 0% 7.298n ± 1% +33.65% (p=0.002 n=6)
IndexAnyASCII/1:4-10 5.462n ± 0% 7.277n ± 1% +33.23% (p=0.002 n=6)
IndexAnyASCII/1:8-10 5.468n ± 1% 7.271n ± 1% +32.95% (p=0.002 n=6)
IndexAnyASCII/1:16-10 5.492n ± 0% 7.617n ± 1% +38.68% (p=0.002 n=6)
IndexAnyASCII/1:32-10 5.474n ± 0% 7.625n ± 1% +39.28% (p=0.002 n=6)
IndexAnyASCII/1:64-10 6.133n ± 1% 7.962n ± 1% +29.82% (p=0.002 n=6)
IndexAnyASCII/16:1-10 4.165n ± 1% 5.397n ± 1% +29.61% (p=0.002 n=6)
IndexAnyASCII/16:2-10 11.44n ± 1% 17.21n ± 2% +50.48% (p=0.002 n=6)
IndexAnyASCII/16:4-10 12.71n ± 1% 19.97n ± 2% +57.10% (p=0.002 n=6)
IndexAnyASCII/16:8-10 17.65n ± 1% 22.81n ± 1% +29.24% (p=0.002 n=6)
IndexAnyASCII/16:16-10 35.01n ± 1% 34.04n ± 1% -2.76% (p=0.002 n=6)
IndexAnyASCII/16:32-10 69.00n ± 0% 62.38n ± 1% -9.60% (p=0.002 n=6)
IndexAnyASCII/16:64-10 136.5n ± 0% 124.7n ± 1% -8.64% (p=0.002 n=6)
IndexAnyASCII/256:1-10 7.414n ± 1% 8.560n ± 1% +15.46% (p=0.002 n=6)
IndexAnyASCII/256:2-10 155.3n ± 1% 175.2n ± 1% +12.81% (p=0.002 n=6)
IndexAnyASCII/256:4-10 158.0n ± 0% 176.0n ± 2% +11.39% (p=0.002 n=6)
IndexAnyASCII/256:8-10 162.3n ± 1% 181.1n ± 1% +11.58% (p=0.002 n=6)
IndexAnyASCII/256:16-10 173.4n ± 0% 193.9n ± 1% +11.82% (p=0.002 n=6)
IndexAnyASCII/256:32-10 207.6n ± 1% 225.2n ± 1% +8.45% (p=0.002 n=6)
IndexAnyASCII/256:64-10 275.4n ± 1% 291.2n ± 1% +5.77% (p=0.002 n=6)
IndexAnyUTF8/1:1-10 3.204n ± 0% 3.150n ± 1% -1.65% (p=0.002 n=6)
IndexAnyUTF8/1:2-10 5.421n ± 1% 7.258n ± 1% +33.87% (p=0.002 n=6)
IndexAnyUTF8/1:4-10 5.445n ± 1% 7.332n ± 2% +34.66% (p=0.002 n=6)
IndexAnyUTF8/1:8-10 5.444n ± 0% 7.299n ± 1% +34.07% (p=0.002 n=6)
IndexAnyUTF8/1:16-10 5.458n ± 0% 7.328n ± 1% +34.25% (p=0.002 n=6)
IndexAnyUTF8/1:32-10 5.438n ± 0% 7.649n ± 1% +40.65% (p=0.002 n=6)
IndexAnyUTF8/1:64-10 6.078n ± 0% 7.981n ± 1% +31.32% (p=0.002 n=6)
IndexAnyUTF8/16:1-10 13.26n ± 0% 13.25n ± 1% ~ (p=0.823 n=6)
IndexAnyUTF8/16:2-10 65.34n ± 1% 33.37n ± 1% -48.93% (p=0.002 n=6)
IndexAnyUTF8/16:4-10 65.84n ± 1% 32.83n ± 1% -50.14% (p=0.002 n=6)
IndexAnyUTF8/16:8-10 65.52n ± 1% 87.46n ± 1% +33.48% (p=0.002 n=6)
IndexAnyUTF8/16:16-10 65.30n ± 1% 87.67n ± 1% +34.26% (p=0.002 n=6)
IndexAnyUTF8/16:32-10 68.18n ± 0% 92.09n ± 1% +35.06% (p=0.002 n=6)
IndexAnyUTF8/16:64-10 76.15n ± 3% 98.08n ± 1% +28.80% (p=0.002 n=6)
IndexAnyUTF8/256:1-10 171.9n ± 2% 169.9n ± 3% ~ (p=0.056 n=6)
IndexAnyUTF8/256:2-10 912.2n ± 2% 356.4n ± 1% -60.93% (p=0.002 n=6)
IndexAnyUTF8/256:4-10 908.4n ± 0% 192.6n ± 2% -78.80% (p=0.002 n=6)
IndexAnyUTF8/256:8-10 908.5n ± 0% 391.0n ± 1% -56.96% (p=0.002 n=6)
IndexAnyUTF8/256:16-10 909.2n ± 0% 118.0n ± 1% -87.03% (p=0.002 n=6)
IndexAnyUTF8/256:32-10 952.8n ± 1% 612.4n ± 1% -35.72% (p=0.002 n=6)
IndexAnyUTF8/256:64-10 1077.0n ± 0% 707.5n ± 1% -34.31% (p=0.002 n=6)
LastIndexAnyASCII/1:1-10 4.475n ± 0% 5.706n ± 1% +27.51% (p=0.002 n=6)
LastIndexAnyASCII/1:2-10 4.486n ± 0% 5.713n ± 1% +27.35% (p=0.002 n=6)
LastIndexAnyASCII/1:4-10 4.484n ± 1% 5.713n ± 1% +27.41% (p=0.002 n=6)
LastIndexAnyASCII/1:8-10 4.483n ± 0% 5.719n ± 1% +27.57% (p=0.002 n=6)
LastIndexAnyASCII/1:16-10 4.488n ± 0% 6.037n ± 1% +34.52% (p=0.002 n=6)
LastIndexAnyASCII/1:32-10 4.480n ± 0% 6.033n ± 1% +34.67% (p=0.002 n=6)
LastIndexAnyASCII/1:64-10 5.101n ± 0% 6.364n ± 1% +24.76% (p=0.002 n=6)
LastIndexAnyASCII/16:1-10 10.10n ± 1% 11.48n ± 1% +13.66% (p=0.002 n=6)
LastIndexAnyASCII/16:2-10 10.82n ± 1% 12.49n ± 1% +15.53% (p=0.002 n=6)
LastIndexAnyASCII/16:4-10 12.50n ± 0% 14.53n ± 1% +16.24% (p=0.002 n=6)
LastIndexAnyASCII/16:8-10 17.56n ± 1% 19.05n ± 4% +8.48% (p=0.002 n=6)
LastIndexAnyASCII/16:16-10 33.95n ± 0% 34.02n ± 1% ~ (p=0.697 n=6)
LastIndexAnyASCII/16:32-10 66.70n ± 0% 62.04n ± 1% -6.99% (p=0.002 n=6)
LastIndexAnyASCII/16:64-10 133.5n ± 2% 125.3n ± 1% -6.14% (p=0.002 n=6)
LastIndexAnyASCII/256:1-10 149.9n ± 1% 155.6n ± 1% +3.80% (p=0.002 n=6)
LastIndexAnyASCII/256:2-10 150.6n ± 0% 154.2n ± 1% +2.39% (p=0.002 n=6)
LastIndexAnyASCII/256:4-10 152.3n ± 0% 155.0n ± 1% +1.77% (p=0.002 n=6)
LastIndexAnyASCII/256:8-10 156.8n ± 3% 161.0n ± 2% ~ (p=0.065 n=6)
LastIndexAnyASCII/256:16-10 168.4n ± 6% 171.7n ± 3% ~ (p=0.061 n=6)
LastIndexAnyASCII/256:32-10 199.1n ± 0% 209.4n ± 0% +5.17% (p=0.002 n=6)
LastIndexAnyASCII/256:64-10 265.6n ± 0% 269.0n ± 1% +1.30% (p=0.002 n=6)
LastIndexAnyUTF8/1:1-10 4.460n ± 1% 5.741n ± 0% +28.72% (p=0.002 n=6)
LastIndexAnyUTF8/1:2-10 4.462n ± 8% 5.716n ± 1% +28.08% (p=0.002 n=6)
LastIndexAnyUTF8/1:4-10 4.461n ± 1% 5.737n ± 1% +28.60% (p=0.002 n=6)
LastIndexAnyUTF8/1:8-10 4.460n ± 1% 5.723n ± 0% +28.32% (p=0.002 n=6)
LastIndexAnyUTF8/1:16-10 4.460n ± 0% 5.722n ± 0% +28.30% (p=0.002 n=6)
LastIndexAnyUTF8/1:32-10 4.462n ± 0% 6.038n ± 0% +35.35% (p=0.002 n=6)
LastIndexAnyUTF8/1:64-10 5.090n ± 0% 6.358n ± 0% +24.89% (p=0.002 n=6)
LastIndexAnyUTF8/16:1-10 26.38n ± 0% 27.57n ± 2% +4.51% (p=0.002 n=6)
LastIndexAnyUTF8/16:2-10 76.37n ± 0% 97.27n ± 1% +27.37% (p=0.002 n=6)
LastIndexAnyUTF8/16:4-10 76.05n ± 0% 97.49n ± 0% +28.20% (p=0.002 n=6)
LastIndexAnyUTF8/16:8-10 76.26n ± 1% 97.94n ± 1% +28.44% (p=0.002 n=6)
LastIndexAnyUTF8/16:16-10 76.17n ± 0% 97.59n ± 1% +28.12% (p=0.002 n=6)
LastIndexAnyUTF8/16:32-10 80.61n ± 1% 102.95n ± 7% +27.71% (p=0.002 n=6)
LastIndexAnyUTF8/16:64-10 87.45n ± 1% 109.70n ± 0% +25.44% (p=0.002 n=6)
LastIndexAnyUTF8/256:1-10 567.6n ± 1% 555.4n ± 1% -2.16% (p=0.002 n=6)
LastIndexAnyUTF8/256:2-10 1.074µ ± 0% 1.389µ ± 4% +29.39% (p=0.002 n=6)
LastIndexAnyUTF8/256:4-10 1.076µ ± 0% 1.399µ ± 3% +30.02% (p=0.002 n=6)
LastIndexAnyUTF8/256:8-10 1.076µ ± 1% 1.398µ ± 1% +29.93% (p=0.002 n=6)
LastIndexAnyUTF8/256:16-10 1.068µ ± 1% 1.394µ ± 1% +30.48% (p=0.002 n=6)
LastIndexAnyUTF8/256:32-10 1.147µ ± 0% 1.471µ ± 1% +28.30% (p=0.002 n=6)
LastIndexAnyUTF8/256:64-10 1.244µ ± 0% 1.591µ ± 0% +27.85% (p=0.002 n=6)
IndexPeriodic/IndexPeriodic2-10 20.60µ ± 0% 82.77µ ± 1% +301.86% (p=0.002 n=6)
IndexPeriodic/IndexPeriodic4-10 20.61µ ± 0% 83.02µ ± 1% +302.81% (p=0.002 n=6)
IndexPeriodic/IndexPeriodic8-10 20.62µ ± 0% 82.80µ ± 1% +301.55% (p=0.002 n=6)
IndexPeriodic/IndexPeriodic16-10 56.46µ ± 2% 61.40µ ± 2% +8.74% (p=0.002 n=6)
IndexPeriodic/IndexPeriodic32-10 29.77µ ± 0% 32.36µ ± 4% +8.68% (p=0.002 n=6)
IndexPeriodic/IndexPeriodic64-10 15.06µ ± 0% 16.48µ ± 3% +9.45% (p=0.002 n=6)
IndexByte_Bytes/10-10 3.587n ± 1% 4.486n ± 0% +25.08% (p=0.002 n=6)
IndexByte_Bytes/32-10 3.585n ± 0% 4.492n ± 1% +25.27% (p=0.002 n=6)
IndexByte_Bytes/4K-10 71.12n ± 0% 80.35n ± 3% +12.97% (p=0.002 n=6)
IndexByte_Bytes/4M-10 63.48µ ± 1% 74.53µ ± 1% +17.40% (p=0.002 n=6)
IndexByte_Bytes/64M-10 1.123m ± 5% 1.275m ± 7% +13.47% (p=0.002 n=6)
IndexRune_Bytes/10-10 10.40n ± 1% 12.45n ± 0% +19.66% (p=0.002 n=6)
IndexRune_Bytes/32-10 12.23n ± 0% 12.50n ± 4% +2.21% (p=0.002 n=6)
IndexRune_Bytes/4K-10 82.98n ± 1% 83.82n ± 0% ~ (p=0.065 n=6)
IndexRune_Bytes/4M-10 64.26µ ± 1% 64.71µ ± 1% +0.70% (p=0.002 n=6)
IndexRune_Bytes/64M-10 1.110m ± 0% 1.142m ± 5% +2.88% (p=0.002 n=6)
IndexRuneASCII_Bytes/10-10 3.688n ± 0% 6.413n ± 9% +73.90% (p=0.002 n=6)
IndexRuneASCII_Bytes/32-10 3.688n ± 0% 6.397n ± 0% +73.43% (p=0.002 n=6)
IndexRuneASCII_Bytes/4K-10 71.59n ± 0% 83.45n ± 1% +16.57% (p=0.002 n=6)
IndexRuneASCII_Bytes/4M-10 63.93µ ± 3% 74.08µ ± 1% +15.87% (p=0.002 n=6)
IndexRuneASCII_Bytes/64M-10 1.114m ± 0% 1.224m ± 2% +9.82% (p=0.002 n=6)
geomean 179.0n 216.1n +20.73%
│ old.txt │ new.txt │
│ B/s │ B/s vs base │
Index-10 5.324Gi ± 1% 3.305Gi ± 2% -37.92% (p=0.002 n=6)
IndexHard1-10 2.981Gi ± 1% 2.932Gi ± 2% -1.65% (p=0.015 n=6)
IndexHard2-10 3027.9Mi ± 0% 755.6Mi ± 5% -75.05% (p=0.002 n=6)
IndexHard3-10 2779.7Mi ± 1% 751.6Mi ± 1% -72.96% (p=0.002 n=6)
IndexHard4-10 756.1Mi ± 1% 754.1Mi ± 1% ~ (p=0.937 n=6)
IndexTorture-10 594.4Mi ± 0% 335.0Mi ± 1% -43.64% (p=0.002 n=6)
CountByte/10-10 1.404Gi ± 1% 1.234Gi ± 1% -12.11% (p=0.002 n=6)
CountByte/32-10 9.314Gi ± 0% 7.305Gi ± 2% -21.56% (p=0.002 n=6)
CountByte/4K-10 46.50Gi ± 0% 38.31Gi ± 2% -17.61% (p=0.002 n=6)
CountByte/4M-10 46.79Gi ± 0% 42.15Gi ± 1% -9.93% (p=0.002 n=6)
CountByte/64M-10 45.14Gi ± 1% 41.07Gi ± 2% -9.02% (p=0.002 n=6)
IndexPeriodic/IndexPeriodic2-10 3034.4Mi ± 0% 755.1Mi ± 1% -75.12% (p=0.002 n=6)
IndexPeriodic/IndexPeriodic4-10 3032.7Mi ± 0% 752.9Mi ± 1% -75.17% (p=0.002 n=6)
IndexPeriodic/IndexPeriodic8-10 3030.8Mi ± 0% 754.8Mi ± 1% -75.10% (p=0.002 n=6)
IndexPeriodic/IndexPeriodic16-10 1107.0Mi ± 2% 1018.0Mi ± 2% -8.04% (p=0.002 n=6)
IndexPeriodic/IndexPeriodic32-10 2.050Gi ± 0% 1.886Gi ± 4% -7.99% (p=0.002 n=6)
IndexPeriodic/IndexPeriodic64-10 4.053Gi ± 0% 3.703Gi ± 2% -8.63% (p=0.002 n=6)
IndexByte_Bytes/10-10 2.597Gi ± 1% 2.076Gi ± 0% -20.05% (p=0.002 n=6)
IndexByte_Bytes/32-10 8.312Gi ± 0% 6.635Gi ± 1% -20.17% (p=0.002 n=6)
IndexByte_Bytes/4K-10 53.63Gi ± 0% 47.48Gi ± 3% -11.48% (p=0.002 n=6)
IndexByte_Bytes/4M-10 61.53Gi ± 1% 52.41Gi ± 1% -14.82% (p=0.002 n=6)
IndexByte_Bytes/64M-10 55.64Gi ± 5% 49.04Gi ± 7% -11.87% (p=0.002 n=6)
IndexRune_Bytes/10-10 916.9Mi ± 1% 766.3Mi ± 0% -16.43% (p=0.002 n=6)
IndexRune_Bytes/32-10 2.437Gi ± 0% 2.385Gi ± 4% -2.15% (p=0.002 n=6)
IndexRune_Bytes/4K-10 45.97Gi ± 1% 45.51Gi ± 0% ~ (p=0.065 n=6)
IndexRune_Bytes/4M-10 60.79Gi ± 1% 60.37Gi ± 1% -0.70% (p=0.002 n=6)
IndexRune_Bytes/64M-10 56.30Gi ± 0% 54.73Gi ± 5% -2.79% (p=0.002 n=6)
IndexRuneASCII_Bytes/10-10 2.525Gi ± 0% 1.452Gi ± 9% -42.50% (p=0.002 n=6)
IndexRuneASCII_Bytes/32-10 8.080Gi ± 0% 4.659Gi ± 0% -42.34% (p=0.002 n=6)
IndexRuneASCII_Bytes/4K-10 53.29Gi ± 0% 45.71Gi ± 1% -14.22% (p=0.002 n=6)
IndexRuneASCII_Bytes/4M-10 61.10Gi ± 3% 52.73Gi ± 1% -13.69% (p=0.002 n=6)
IndexRuneASCII_Bytes/64M-10 56.10Gi ± 0% 51.08Gi ± 2% -8.95% (p=0.002 n=6)
geomean 8.012Gi 5.582Gi -30.33%
```
commit c2db11c
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Fri Aug 9 11:30:13 2024 -0500
strcase: inline indexUnicode into Index
commit 5d0f2d0
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Fri Aug 9 11:28:16 2024 -0500
strcase: cleanup and test nonLetterASCII
commit 441d39c
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Thu Aug 8 15:49:09 2024 -0600
strcase: remove ASCII check from indexByte
Remove the ASCII check because it was slower and was generally a bad
idea in the first place, since indexRuneCase is faster and the bytes
being sought cannot occur in an ASCII-only string.
TLDR: We were potentially scanning the haystack twice for no benefit.
commit b0e1f9a
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Thu Aug 8 15:22:22 2024 -0600
strcase: improve indexRuneCase performance by ~45%
This changes indexRuneCase to search for runes using the last byte of
their UTF-8 encoded form and to use bytealg.IndexString if IndexByte
produces too many false positives.
```
goos: darwin
goarch: arm64
pkg: github.com/charlievieth/strcase
│ base.10.txt │ new.10.txt │
│ sec/op │ sec/op vs base │
IndexRuneCaseUnicode/Latin/10-10 5.510n ± 0% 5.443n ± 1% -1.22% (p=0.011 n=10)
IndexRuneCaseUnicode/Latin/32-10 6.022n ± 0% 5.930n ± 0% -1.54% (p=0.001 n=10)
IndexRuneCaseUnicode/Latin/4K-10 329.8n ± 0% 333.7n ± 2% +1.20% (p=0.000 n=10)
IndexRuneCaseUnicode/Latin/4M-10 367.2µ ± 0% 369.5µ ± 2% +0.63% (p=0.019 n=10)
IndexRuneCaseUnicode/Latin/64M-10 5.916m ± 0% 5.938m ± 0% +0.38% (p=0.035 n=10)
IndexRuneCaseUnicode/Cyrillic/10-10 6.391n ± 2% 6.098n ± 1% -4.58% (p=0.000 n=10)
IndexRuneCaseUnicode/Cyrillic/32-10 6.983n ± 1% 6.527n ± 0% -6.53% (p=0.000 n=10)
IndexRuneCaseUnicode/Cyrillic/4K-10 3.205µ ± 0% 1.038µ ± 1% -67.62% (p=0.000 n=10)
IndexRuneCaseUnicode/Cyrillic/4M-10 3.327m ± 1% 1.135m ± 0% -65.87% (p=0.000 n=10)
IndexRuneCaseUnicode/Cyrillic/64M-10 53.41m ± 1% 18.25m ± 1% -65.83% (p=0.000 n=10)
IndexRuneCaseUnicode/Han/10-10 7.027n ± 1% 7.845n ± 1% +11.63% (p=0.000 n=10)
IndexRuneCaseUnicode/Han/32-10 7.551n ± 1% 8.266n ± 1% +9.47% (p=0.000 n=10)
IndexRuneCaseUnicode/Han/4K-10 982.4n ± 1% 484.3n ± 1% -50.70% (p=0.000 n=10)
IndexRuneCaseUnicode/Han/4M-10 2017.7µ ± 0% 841.4µ ± 1% -58.30% (p=0.000 n=10)
IndexRuneCaseUnicode/Han/64M-10 32.61m ± 1% 13.53m ± 2% -58.52% (p=0.000 n=10)
geomean 4.179µ 2.866µ -31.42%
│ base.10.txt │ new.10.txt │
│ B/s │ B/s vs base │
IndexRuneCaseUnicode/Latin/10-10 1.690Gi ± 0% 1.711Gi ± 1% +1.23% (p=0.011 n=10)
IndexRuneCaseUnicode/Latin/32-10 4.948Gi ± 0% 5.026Gi ± 0% +1.57% (p=0.002 n=10)
IndexRuneCaseUnicode/Latin/4K-10 11.57Gi ± 0% 11.43Gi ± 2% -1.18% (p=0.000 n=10)
IndexRuneCaseUnicode/Latin/4M-10 10.64Gi ± 0% 10.57Gi ± 2% -0.62% (p=0.019 n=10)
IndexRuneCaseUnicode/Latin/64M-10 10.57Gi ± 0% 10.53Gi ± 0% -0.38% (p=0.035 n=10)
IndexRuneCaseUnicode/Cyrillic/10-10 1.457Gi ± 2% 1.527Gi ± 1% +4.80% (p=0.000 n=10)
IndexRuneCaseUnicode/Cyrillic/32-10 4.268Gi ± 1% 4.566Gi ± 0% +6.98% (p=0.000 n=10)
IndexRuneCaseUnicode/Cyrillic/4K-10 1.190Gi ± 0% 3.677Gi ± 1% +208.89% (p=0.000 n=10)
IndexRuneCaseUnicode/Cyrillic/4M-10 1.174Gi ± 1% 3.440Gi ± 0% +193.04% (p=0.000 n=10)
IndexRuneCaseUnicode/Cyrillic/64M-10 1.170Gi ± 1% 3.425Gi ± 1% +192.65% (p=0.000 n=10)
IndexRuneCaseUnicode/Han/10-10 1.325Gi ± 1% 1.187Gi ± 1% -10.42% (p=0.000 n=10)
IndexRuneCaseUnicode/Han/32-10 3.947Gi ± 1% 3.605Gi ± 1% -8.65% (p=0.000 n=10)
IndexRuneCaseUnicode/Han/4K-10 3.883Gi ± 1% 7.877Gi ± 1% +102.85% (p=0.000 n=10)
IndexRuneCaseUnicode/Han/4M-10 1.936Gi ± 0% 4.643Gi ± 1% +139.81% (p=0.000 n=10)
IndexRuneCaseUnicode/Han/64M-10 1.916Gi ± 1% 4.620Gi ± 2% +141.08% (p=0.000 n=10)
geomean 2.893Gi 4.219Gi +45.82%
```
commit e7a38b5
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Wed Aug 7 15:34:57 2024 -0600
strcase_test: fix BenchmarkLastIndexAnyASCII
commit 9ac8402
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Wed Aug 7 15:34:10 2024 -0600
strcase_test: fix shadow variable declarations
commit 781528f
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Wed Aug 7 15:15:42 2024 -0600
internal/cstr: remove internal cstr package
We no longer need it for tests. This internal package was only used in
early development as a sanity check.
commit 225b624
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Wed Aug 7 15:05:49 2024 -0600
strcase: minor: use global benchmark buffer
commit 1bfd9ee
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Wed Aug 7 15:05:23 2024 -0600
strcase: remove unused bmIndexRabinKarpUnicode function
commit e31d224
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Wed Aug 7 11:36:45 2024 -0600
strcase: remove unused uint16Len4 function
commit 3be3db4
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Wed Aug 7 11:36:01 2024 -0600
strcase: simplify lastIndexRune
commit 3a37346
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Tue Aug 6 23:16:00 2024 -0600
calibrate: comment out test
TODO: we can probably remove this file and the associated tests
commit ce3af67
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Mon Aug 5 14:49:39 2024 -0400
strcase: fixup BenchmarkIndexByteLongSpecial and remove BenchmarkIndexByteLong
commit 6dff5ce
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Sun Aug 4 23:19:26 2024 -0400
strcase: general improvements
commit ef9acd3
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Sun Aug 4 14:53:58 2024 -0400
Makefile: update golangci-lint version
commit 431b440
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Sun Aug 4 14:53:08 2024 -0400
internal/benchtest: add more ToLower benchmarks
commit ce605ba
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Sun Aug 4 14:52:32 2024 -0400
internal/bytealg: fix type in comment
commit 6d1e0b9
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Wed Jul 31 21:29:01 2024 -0400
gh: test on go1.22
commit 400b4d1
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Wed Jul 31 20:17:38 2024 -0400
LICENSE: update year
commit 4b3d669
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Wed Jul 31 20:09:42 2024 -0400
mod: set minimum go version to 1.19
Supporting Go 1.18 requires adding another assembly count implementation
for AMD64 since asm_amd64.h was added in go1.19 (it handles AMD64
feature detection).
commit 6694aeb
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Wed Jul 31 20:09:24 2024 -0400
Revert "gh: test minimum supported Go version"
This reverts commit 8356823c30c2c3dfdcf5b63eb59bf030b0e678d3.
commit 8356823
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Wed Jul 31 19:49:58 2024 -0400
gh: test minimum supported Go version
commit 2058c33
Author: Charlie Vieth charlie.vieth@gmail.com
Date: Fri Jul 26 00:15:52 2024 -0400
strcase: don't mask when using lower case lookup table