Commit

15 automatic chunking (#24)
* perf: benchmark timeout

running experiments

* perf: benchmark caption

* fix: caption wording

* fix: caption typo

* fix: caption mention of average trials

* fix: strict in caption

* fix: center alignment

* fix: distribution
DiTo97 authored Dec 9, 2024
1 parent 9f93013 commit a8db27e
Showing 16 changed files with 129 additions and 1,029 deletions.
2 changes: 0 additions & 2 deletions .pre-commit-config.yaml
@@ -18,10 +18,8 @@ repos:
hooks:
- id: ruff
args: [--fix, --exit-non-zero-on-fix]
files: ^(src|tests)
- id: ruff-format
exclude: ^(docs)
files: ^(src|tests)

- repo: https://github.com/astral-sh/uv-pre-commit
rev: 0.5.0
14 changes: 10 additions & 4 deletions README.md
@@ -42,11 +42,11 @@ string = """\
if __name__ == "__main__":
encoding_base26 = base26_encode(string)
print(encoding_base26)
- # >>> YBPNLKVNQWZQCMDHMLNDTVQCCRKQLNCFGMQPNGQCIXHUUPHFUNKUFEPDLKIGARFOKTDEZKQHXGCPYHDZKKVIUDNFOAYYAUOQFBJFFGSTKAXNWGDPVUJNBARPNXBASHZBXIBSSEFTAIQRPEADSOVVNXUMQXVDWTAIVCIVWQZAHAGYAVZYKGMETJOOUQNOEXMSOOGSKVMFBYZIBZDAITICYVXMJTTCCHPMSCABLYUMFDUNLVSLNKHSBPKCGASXJSFYDHZFAOEQTUACEBIFKQGYC
+ # >>> ["YBPNLKVNQWZQCMDHMLNDTVQCCRKQLNCFGMQPNGQCIXHUUPHFUNKUFEPDLKIGARFOKTDEZKQHXGCPYHDZKKVIUDNFOAYYAUOQFBJFFGSTKAXNWGDPVUJNBARPNXBASHZBXIBSSEFTAIQRPEADSOVVNXUMQXVDWTAIVCIVWQZAHAGYAVZYKGMETJOOUQNOEXMSOOGSKVMFBYZIBZDAITICYVXMJTTCCHPMSCABLYUMFDUNLVSLNKHSBPKCGASXJSFYDHZFAOEQTUACEBIFKQGYC"]

encoding_base52 = base52_encode(string)
print(encoding_base52)
- # >>> EgcgYRPxckylMQWRLDADNZxPJiJcHaVwYHLnicahBgaotGGANZuvsvcpSSOJFLXvKPjRlNQCJqqdviiIdtnwJyDOnWojsrpkWSTZFHbMIREvREjpsODtSxoLlLjQZOoehsGFzawGQecyuomgpZQNyFnZQLWPiDhzClwxBFCCwdqduGJoshrwFdwHWMtJpSTmjxzaYmNvzOIOwLkJvyQHCaFtrODPhbhBpPBmC
+ # >>> ["EgcgYRPxckylMQWRLDADNZxPJiJcHaVwYHLnicahBgaotGGANZuvsvcpSSOJFLXvKPjRlNQCJqqdviiIdtnwJyDOnWojsrpkWSTZFHbMIREvREjpsODtSxoLlLjQZOoehsGFzawGQecyuomgpZQNyFnZQLWPiDhzClwxBFCCwdqduGJoshrwFdwHWMtJpSTmjxzaYmNvzOIOwLkJvyQHCaFtrODPhbhBpPBmC"]

assert base26_decode(encoding_base26) == string
assert base52_decode(encoding_base52) == string
@@ -56,11 +56,17 @@ if __name__ == "__main__":

The library is inspired by [R. Heaton](https://github.com/robert)'s base26 implementation and his story of manipulating data transmission in restrictive network channels on long-distance flights using alphabetic-only encodings and tokenization.

- have a look at the original [repository](https://github.com/robert/pyskywifi) and [blog post](https://robertheaton.com/pyskywifi) and show him some love!
+ have a look at the original [repository](https://github.com/robert/pyskywifi) and story [blog post](https://robertheaton.com/pyskywifi) and show him some love.

## 📊 benchmarking

- TBC <!-- HTML string of almost 2.5M characters -->
+ our implementation is orders of magnitude more efficient on 100k+ strings:
+
+ <div align="center">
+ <img src="resources/benchmark.png" alt="benchmarking">
+
+ *Figure 1: runtime and memory usage performance against Heaton's original implementation with and without automatic chunking and SIMD on variable-length strings with a strict 60-second timeout; average over 5 trials.*
+ </div>
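
The measurement protocol named in the caption (average over 5 trials, 60-second timeout) could be approximated with a harness like the one below. This is a minimal sketch, not the benchmark code deleted in this commit. `benchmark` and `toy_encode` are hypothetical names. Peak memory is measured with the standard-library `tracemalloc` rather than the `memory-profiler` dependency the project previously listed. The timeout is only checked after each call returns, so it is softer than the "strict" timeout the caption describes.

```python
import statistics
import time
import tracemalloc


def benchmark(func, payload, trials=5, timeout=60.0):
    """Return (mean seconds, mean peak bytes) over `trials` runs, or None on timeout."""
    runtimes, peaks = [], []

    for _ in range(trials):
        tracemalloc.start()
        start = time.perf_counter()
        func(payload)
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()

        # Soft timeout: checked after the call returns, so it cannot interrupt a run.
        if elapsed > timeout:
            return None

        runtimes.append(elapsed)
        peaks.append(peak)

    return statistics.mean(runtimes), statistics.mean(peaks)


def toy_encode(string):
    """Hypothetical stand-in for an encoder under test (not the library's API)."""
    return string.encode("utf-8").hex().upper()


if __name__ == "__main__":
    for size in (1_000, 10_000, 100_000):
        print(size, benchmark(toy_encode, "a" * size))
```

Tracing allocations slows the measured call, so runtime and memory would ideally be collected in separate passes. Enforcing a hard timeout would require running each trial in a separate process that can be terminated.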

## 🤝 contributing

24 changes: 0 additions & 24 deletions benchmark/__main__.py

This file was deleted.

41 changes: 0 additions & 41 deletions benchmark/plotting.py

This file was deleted.

15 changes: 0 additions & 15 deletions benchmark/profiling.py

This file was deleted.

11 changes: 6 additions & 5 deletions pyproject.toml
@@ -5,7 +5,7 @@ build-backend = "hatchling.build"

[project]
name = "alphacodings"
- version = "0.1.0"
+ version = "0.2.0"
description = "base26 ([A-Z]) and base52 ([A-Za-z]) encodings"
readme = "README.md"
authors = [{name = "Federico Minutoli", email = "fede97.minutoli@gmail.com"}]
@@ -16,16 +16,15 @@ dependencies = ["gmpy2>1"]


[dependency-groups]
- contrib = ["matplotlib>3", "memory-profiler>0", "pre-commit>4", "tqdm>4"]
- testing = ["pytest>8"]
+ contrib = ["pre-commit>4", "pytest>8"]


[tool.uv]
- default-groups = ["contrib", "testing"]
+ default-groups = ["contrib"]


[tool.ruff]
- src = ["src"]
+ src = ["src", "tests"]
exclude = [
".git-rewrite",
".git",
@@ -76,8 +75,10 @@ extend-select = [
"YTT",
]
ignore = [
"E741",
"PLR09",
"PLR2004", # magic comparison
"RET504",
]

[tool.ruff.lint.isort]