Add out-of-tree Pyodide builds in CI for numcodecs #529

Open
wants to merge 47 commits into
base: main
Changes from 43 commits
47 commits
db77ad5
Git-ignore pytest cache folders
agriyakhetarpal Apr 24, 2024
22cede6
Refine outdated sections for Contributing guide
agriyakhetarpal Apr 24, 2024
3c2b0be
Add patch to disable threading and multiprocessing
agriyakhetarpal Apr 24, 2024
a8c956f
Add patch to embed missing POSIX syscall headers
agriyakhetarpal Apr 24, 2024
8769743
Add initial workflow to run + debug Pyodide builds
agriyakhetarpal Apr 24, 2024
28674c3
Disable AVX2 and SSE2 instructions
agriyakhetarpal Apr 24, 2024
f6df488
Use separate build and test jobs and share wheel
agriyakhetarpal Apr 24, 2024
d678be8
Add patch that allows loading `.npy` files
agriyakhetarpal Apr 24, 2024
6ce4127
Apply patch for Emscripten to load `.npy` files
agriyakhetarpal Apr 24, 2024
199928d
Add a constant to check if platform is WASM
agriyakhetarpal Apr 24, 2024
1f5e7b7
Use `is_wasm` to disable multiprocessing tests
agriyakhetarpal Apr 24, 2024
da909f8
Apply `.npy` files patch again when running tests
agriyakhetarpal Apr 24, 2024
a54c555
Remove patch that is unneeded at build time
agriyakhetarpal May 21, 2024
494dd2f
Merge branch 'main' into setup-emscripten-ci
agriyakhetarpal Nov 7, 2024
a7446c8
Delete patch that's not needed now
agriyakhetarpal Nov 7, 2024
9271054
Ignore Pyodide cross-build environment files
agriyakhetarpal Nov 7, 2024
764f34e
Revamp workflow and clean up changes
agriyakhetarpal Nov 7, 2024
d41f75f
Fix linter errors and style failures
agriyakhetarpal Nov 7, 2024
ee39d1e
Catch and disable `mutex` assignment
agriyakhetarpal Nov 7, 2024
731c6b2
Fix patch that disables multiprocessing, threading
agriyakhetarpal Nov 7, 2024
0616dd5
Don't hardcode `zlib` paths in patch details
agriyakhetarpal Nov 7, 2024
1aa838b
Install test + runtime extras, disable pytest cache
agriyakhetarpal Nov 7, 2024
0638a91
Fix shell expansion in installation command
agriyakhetarpal Nov 7, 2024
53f2096
Temporary: skip tests that require crc32c
agriyakhetarpal Nov 8, 2024
7133abb
Merge branch 'main' into setup-emscripten-ci
agriyakhetarpal Nov 8, 2024
bc9b6dc
Temporary: skip `crc32c` dependency until available
agriyakhetarpal Nov 8, 2024
5a4e234
Update ci-emscripten.yaml
agriyakhetarpal Nov 8, 2024
4ce9e4d
Fix zarr installation order to collect more tests
agriyakhetarpal Nov 8, 2024
d6c43ef
Force install zarr version 3 for CI
agriyakhetarpal Nov 8, 2024
55a2690
Skip a test that requires threads
agriyakhetarpal Nov 8, 2024
9a0482b
Temporary: skip Zarr's async functionality tests
agriyakhetarpal Nov 8, 2024
bf42491
Guard import of `multiprocessing`
agriyakhetarpal Nov 8, 2024
9924e8e
Revert "Guard import of `multiprocessing`"
agriyakhetarpal Nov 8, 2024
0847241
Bump verbosity when running test suite
agriyakhetarpal Nov 8, 2024
b6713ca
Skip codec entrypoint test that requires processes
agriyakhetarpal Nov 8, 2024
a590785
Let `pytest` inherit config for doctests
agriyakhetarpal Nov 8, 2024
3e1a968
Revert "Let `pytest` inherit config for doctests"
agriyakhetarpal Nov 8, 2024
e4f6670
Ignore a doctest that uses threads within the file
agriyakhetarpal Nov 8, 2024
4c348f2
Merge `main`
agriyakhetarpal Jan 1, 2025
dacbd09
Fix skip reason error message for entrypoint test
agriyakhetarpal Jan 1, 2025
6caf076
`crc32c` >=2.7 is now available with Pyodide 0.27.0
agriyakhetarpal Jan 1, 2025
85252f4
Bump to Pyodide version 0.27.0
agriyakhetarpal Jan 1, 2025
6b8d139
Clean up changes
agriyakhetarpal Jan 1, 2025
8aee545
Merge branch 'main' into setup-emscripten-ci
agriyakhetarpal Feb 18, 2025
c3e346a
Bring back code-style quoting
agriyakhetarpal Feb 18, 2025
2e0e759
Update to Pyodide 0.27.3, add `pcodec`
agriyakhetarpal Feb 26, 2025
4e6f2b6
Fix drop in coverage
agriyakhetarpal Feb 27, 2025
97 changes: 97 additions & 0 deletions .github/workflows/ci-emscripten.yaml
@@ -0,0 +1,97 @@
# Attributed to NumPy https://github.com/numpy/numpy/pull/25894
# https://github.com/numpy/numpy/blob/d2d2c25fa81b47810f5cbd85ea6485eb3a3ffec3/.github/workflows/emscripten.yml
#

name: Pyodide CI

on:
  # TODO: refine after this is ready to merge
  [push, pull_request, workflow_dispatch]

Comment on lines +7 to +10
Author

Suggested change
on:
  # TODO: refine after this is ready to merge
  [push, pull_request, workflow_dispatch]
on: [push, pull_request]
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

env:
  FORCE_COLOR: 3
  # Disable instructions: AVX2 and SSE2 because Emscripten-specific SIMD
  # support has not been implemented yet
  DISABLE_NUMCODECS_AVX2: 1
  DISABLE_NUMCODECS_SSE2: 1
  # Common environment variables for both build and test jobs
  PYODIDE_VERSION: 0.27.0
  # PYTHON_VERSION and EMSCRIPTEN_VERSION are determined by PYODIDE_VERSION.
  # The appropriate versions can be found in the Pyodide repodata.json
  # "info" field, or in Makefile.envs:
  # https://github.com/pyodide/pyodide/blob/main/Makefile.envs#L2
  PYTHON_VERSION: 3.12 # any 3.12.x version works
  EMSCRIPTEN_VERSION: 3.1.58
  NODE_VERSION: 20

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
  cancel-in-progress: true
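
The `group` expression above relies on `||` returning its first truthy operand: `github.head_ref` is populated only for pull request events, so pull request runs share one group per source branch, while other events fall back to the unique `github.run_id`. A minimal Python sketch of that resolution, where `concurrency_group` and its sample arguments are hypothetical names used only for illustration:

```python
def concurrency_group(workflow: str, head_ref: str, run_id: str) -> str:
    """Mimic `${{ github.workflow }}-${{ github.head_ref || github.run_id }}`."""
    # An empty head_ref (non-pull-request events) is falsy, so run_id is used.
    return f"{workflow}-{head_ref or run_id}"


# Pull request runs from the same branch map to the same group ...
pr_group = concurrency_group("Pyodide CI", "setup-emscripten-ci", "1001")
# ... while a push run gets a group that is unique to that run.
push_group = concurrency_group("Pyodide CI", "", "1002")
```

With `cancel-in-progress: true`, a new run only cancels an older one when both resolve to the same group string, which here happens for repeated pull request runs from one branch.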

permissions:
  contents: read # to fetch code (actions/checkout)

jobs:
  build-wasm-emscripten:
    name: Build numcodecs Pyodide distribution
    runs-on: ubuntu-22.04
    # To enable this workflow on a fork, comment out:
    # FIXME: uncomment after this is ready to merge
    # if: github.repository == 'zarr-developers/numcodecs'
    steps:
      - name: Checkout source
        uses: actions/checkout@v4
        with:
          submodules: recursive

      - name: Set up Python ${{ env.PYTHON_VERSION }}
        id: setup-python
        uses: actions/setup-python@v5
        with:
          python-version: ${{ env.PYTHON_VERSION }}

      - name: Set up Emscripten toolchain
        uses: mymindstorm/setup-emsdk@v14
        with:
          version: ${{ env.EMSCRIPTEN_VERSION }}
          actions-cache-folder: emsdk-cache

      - name: Apply necessary patch(es)
        run: |
          patch -p1 < tools/ci/patches/0001-disable-multiprocessing-and-pthreads.patch
          patch -p1 < tools/ci/patches/0002-add-missing-unistd-headers.patch -d c-blosc/internal-complibs/zlib-*/

      - name: Install pyodide-build
        run: python -m pip install pyodide-build

      - name: Build numcodecs for Pyodide/WASM
        run: pyodide build

      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}

      - name: Set up Pyodide virtual environment and test numcodecs for Pyodide
        run: |
          # Pin to a specific version of Pyodide to ensure reliability
          pyodide xbuildenv install ${{ env.PYODIDE_VERSION }}

          # Set up Pyodide virtual environment and activate it
          pyodide venv .venv-pyodide
          source .venv-pyodide/bin/activate

          # For tests in test_zarr3.py
          pip install zarr==3.0.0b1

          # Install the built numcodecs WASM wheel and relevant dependencies
          pip install $(ls dist/*.whl)"[msgpack,crc32c,test,test_extras]"
          # TODO: get zfpy built in Pyodide and install it here
Author

zfpy has been built in Pyodide, but against NumPy v2 (as described above). There is a chance that the tests would also work with NumPy >=2 if I were to install it separately instead of using the [zfpy] extra. I'm open to testing that out if asked.


          # Change into a different directory before running tests to avoid
          # the test runner picking up the local numcodecs package
          cd docs

          # Don't use the cache provider plugin, as it doesn't currently work
          # with Pyodide: https://github.com/pypa/cibuildwheel/issues/1966
          python -m pytest -p no:cacheprovider -svra --pyargs numcodecs
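
The install step in this workflow glues the wheel path printed by `ls dist/*.whl` to a quoted extras list, so pip receives a single `path[extras]` requirement. A rough Python equivalent, assuming exactly one wheel sits in `dist/`; the helper name `wheel_requirement` is hypothetical and not part of the PR:

```python
from pathlib import Path


def wheel_requirement(dist_dir: str, extras: list[str]) -> str:
    """Build a pip requirement like 'dist/pkg.whl[a,b]' for the single built wheel."""
    wheels = sorted(Path(dist_dir).glob("*.whl"))
    if len(wheels) != 1:
        # Unlike the shell one-liner, fail early if the glob is empty or ambiguous.
        raise RuntimeError(f"expected exactly one wheel in {dist_dir}, found {len(wheels)}")
    return f"{wheels[0].as_posix()}[{','.join(extras)}]"
```

The explicit count check is a design choice for the sketch: the shell form works in CI because `pyodide build` leaves one wheel in `dist/`, but it gives no clear error if that assumption stops holding.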
4 changes: 4 additions & 0 deletions .gitignore
@@ -49,6 +49,7 @@ coverage.xml
*,cover
.hypothesis/
cover/
.pytest_cache/

# Cython annotation files
numcodecs/*.html
@@ -104,3 +105,6 @@ numcodecs/version.py

# Cython generated
numcodecs/*.c

# Pyodide builds
/.pyodide-xbuildenv-*
25 changes: 12 additions & 13 deletions docs/contributing.rst
@@ -111,7 +111,7 @@ Creating a branch
Before you do any new work or submit a pull request, please open an issue on GitHub to
report the bug or propose the feature you'd like to add.

It's best to create a new, separate branch for each piece of work you want to do. E.g.::
It's best to create a new, separate branch for each piece of work you want to do. E.g.:

    git fetch upstream
    git checkout -b shiny-new-feature upstream/main
@@ -144,12 +144,11 @@ docstrings. The simplest way to run the unit tests is to invoke::

    $ pytest -v

NumCodecs currently supports Python 6-3.9, so the above command must
NumCodecs currently supports Python 3.8 and later, so the above command must
succeed before code can be accepted into the main code base.

All tests are automatically run via Travis (Linux) and AppVeyor (Windows) continuous
integration services for every pull request. Tests must pass under both services before
code can be accepted.
All tests are automatically run for every pull request via continuous integration
provided by GitHub Actions. Tests must pass before code can be accepted.

Code standards
~~~~~~~~~~~~~~
@@ -163,11 +162,11 @@ Conformance can be checked by running::
Test coverage
~~~~~~~~~~~~~

NumCodecs maintains 100% test coverage under the latest Python stable release (currently
Python 3.9). Both unit tests and docstring doctests are included when computing
coverage. Running ``pytest -v`` will automatically run the test suite with coverage
and produce a coverage report. This should be 100% before code can be accepted into the
main code base.
NumCodecs maintains 100% test coverage under the latest Python stable release.
Both unit tests and docstring doctests are included when computing coverage. Running
``pytest -v`` will automatically run the test suite with coverage and produce a
coverage report. This should be 100% before code can be accepted into the main
code base.

When submitting a pull request, coverage will also be collected across all supported
Python versions via the Codecov service, and will be reported back within the pull
@@ -179,7 +178,7 @@ Documentation
Docstrings for user-facing classes and functions should follow the `numpydoc
<https://numpydoc.readthedocs.io/en/latest/format.html#docstring-standard>`_ standard,
including sections for Parameters and Examples. All examples will be run as doctests
under Python 3.9.
under a stable version of Python.

NumCodecs uses Sphinx for documentation, hosted on readthedocs.org. Documentation is
written in the RestructuredText markup language (.rst files) in the ``docs`` folder.
@@ -207,8 +206,8 @@ Pull requests submitted by an external contributor should be reviewed and approv
one core developers before being merged. Ideally, pull requests submitted by a core developer
should be reviewed and approved by at least one other core developers before being merged.

Pull requests should not be merged until all CI checks have passed (Travis, AppVeyor,
Codecov) against code that has had the latest main merged in.
Pull requests should not be merged until all CI checks have passed (GitHub Actions,
Codecov) against code that has had the latest main merged in.

Compatibility and versioning policies
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2 changes: 2 additions & 0 deletions numcodecs/blosc.pyx
@@ -87,6 +87,8 @@ def get_mutex():
mutex = None
except ImportError:
mutex = None
except ModuleNotFoundError:
mutex = None
_MUTEX = mutex
_MUTEX_IS_INIT = True
return _MUTEX
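
One detail worth noting about the added clause: `ModuleNotFoundError` is a subclass of `ImportError`, so the preceding `except ImportError` branch already catches it. The sketch below checks this with a deliberately nonexistent module name (`no_such_module_for_demo` is made up for the example, as is the helper `import_or_none`):

```python
import importlib


def import_or_none(name: str):
    """Return the imported module, or None if the import fails."""
    try:
        return importlib.import_module(name)
    except ImportError:
        # Also reached for ModuleNotFoundError, which subclasses ImportError.
        return None


is_subclass = issubclass(ModuleNotFoundError, ImportError)
missing = import_or_none("no_such_module_for_demo")
```

Python picks the first matching `except` clause, so a separate `except ModuleNotFoundError` placed after `except ImportError` is never reached.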
4 changes: 4 additions & 0 deletions numcodecs/tests/common.py
@@ -1,6 +1,8 @@
import array
import json as _json
import os
import platform
import sys
from glob import glob

import numpy as np
@@ -26,6 +28,8 @@
'เฮลโลเวิลด์',
]

is_wasm = (sys.platform == 'emscripten') or (platform.machine() in ['wasm32', 'wasm64'])


def compare_arrays(arr, res, precision=None):
# ensure numpy array with matching dtype
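
The `is_wasm` constant added in this hunk combines two checks: `sys.platform` is `'emscripten'` under Pyodide, and `platform.machine()` covers runtimes that report `wasm32` or `wasm64`. A small sketch that factors the same predicate into a function so it can be exercised with explicit values; `detect_wasm` is a hypothetical helper, not part of numcodecs:

```python
import platform
import sys


def detect_wasm(sys_platform: str, machine: str) -> bool:
    """Same predicate as `is_wasm`, with the inputs passed in explicitly."""
    return sys_platform == "emscripten" or machine in ("wasm32", "wasm64")


# Evaluated against the running interpreter, as numcodecs/tests/common.py does.
is_wasm = detect_wasm(sys.platform, platform.machine())
```

Passing the two values as arguments makes the check testable on an ordinary host, where the module-level constant is simply `False`.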
2 changes: 2 additions & 0 deletions numcodecs/tests/test_blosc.py
@@ -19,6 +19,7 @@
check_err_decode_object_buffer,
check_err_encode_object_buffer,
check_max_buffer_size,
is_wasm,
)

codecs = [
@@ -223,6 +224,7 @@ def _decode_worker(enc):
return compressor.decode(enc)


@pytest.mark.skipif(is_wasm, reason="WASM/Pyodide does not support multiprocessing")
@pytest.mark.parametrize('pool', [Pool, ThreadPool])
def test_multiprocessing(use_threads, pool):
data = np.arange(1000000)
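
The `skipif` marker takes a condition that is evaluated when the test module is imported and reports the test as skipped instead of running it. The same pattern can be sketched with the standard library's `unittest.skipIf`, swapped in here for `pytest.mark.skipif` so the example has no third-party dependency; `make_case` and `count_skipped` are hypothetical helpers:

```python
import unittest


def make_case(skip: bool) -> type:
    """Build a TestCase whose only test is skipped when `skip` is true."""

    class PoolCase(unittest.TestCase):
        @unittest.skipIf(skip, "WASM/Pyodide does not support multiprocessing")
        def test_multiprocessing(self):
            self.assertEqual(sum(range(4)), 6)

    return PoolCase


def count_skipped(case: type) -> int:
    """Run the case quietly and return how many tests were skipped."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    result = unittest.TestResult()
    suite.run(result)
    return len(result.skipped)
```

Calling `count_skipped(make_case(True))` exercises the skipped path and `count_skipped(make_case(False))` the normal one, mirroring how the tests behave with and without `is_wasm` set.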
2 changes: 2 additions & 0 deletions numcodecs/tests/test_entrypoints_backport.py
@@ -6,6 +6,7 @@
import pytest

import numcodecs.registry
from numcodecs.tests.common import is_wasm

importlib_spec = importlib.util.find_spec("importlib_metadata")
if importlib_spec is None or importlib_spec.loader is None: # pragma: no cover
@@ -29,6 +30,7 @@ def get_entrypoints_with_importlib_metadata_loaded():
assert cls.codec_id == "test"


@pytest.mark.skipif(is_wasm, reason="Spawning processes is not supported in Pyodide/WASM")
def test_entrypoint_codec_with_importlib_metadata():
p = Process(target=get_entrypoints_with_importlib_metadata_loaded)
p.start()
2 changes: 2 additions & 0 deletions numcodecs/tests/test_shuffle.py
@@ -14,6 +14,7 @@
check_backwards_compatibility,
check_config,
check_encode_decode,
is_wasm,
)

codecs = [
@@ -87,6 +88,7 @@ def _decode_worker(enc):
return compressor.decode(enc)


@pytest.mark.skipif(is_wasm, reason="WASM/Pyodide does not support multiprocessing")
@pytest.mark.parametrize('pool', [Pool, ThreadPool])
def test_multiprocessing(pool):
data = np.arange(1000000)
13 changes: 13 additions & 0 deletions numcodecs/tests/test_zarr3.py
@@ -5,6 +5,8 @@
import numpy as np
import pytest

from numcodecs.tests.common import is_wasm

if TYPE_CHECKING: # pragma: no cover
import zarr
else:
@@ -53,6 +55,7 @@ def test_docstring(codec_class: type[numcodecs.zarr3._NumcodecsCodec]):
assert "See :class:`numcodecs." in codec_class.__doc__ # type: ignore[operator]


@pytest.mark.skipif(is_wasm, reason="Threads are not supported in Pyodide/WASM")
@pytest.mark.parametrize(
"codec_class",
[
@@ -83,6 +86,8 @@ def test_generic_codec_class(store: StorePath, codec_class: type[numcodecs.zarr3
np.testing.assert_array_equal(data, a[:, :])


# TODO: undo skips here when we can test async code in WASM
@pytest.mark.skipif(is_wasm, reason="testing async code not yet supported in Pyodide/WASM")
@pytest.mark.parametrize(
("codec_class", "codec_config"),
[
@@ -123,6 +128,8 @@ def test_generic_filter(
np.testing.assert_array_equal(data, a[:, :])


# TODO: undo skips here when we can test async code in WASM
@pytest.mark.skipif(is_wasm, reason="testing async code not yet supported in Pyodide/WASM")
def test_generic_filter_bitround(store: StorePath):
data = np.linspace(0, 1, 256, dtype="float32").reshape((16, 16))

@@ -141,6 +148,8 @@ def test_generic_filter_bitround(store: StorePath):
assert np.allclose(data, a[:, :], atol=0.1)


# TODO: undo skips here when we can test async code in WASM
@pytest.mark.skipif(is_wasm, reason="testing async code not yet supported in Pyodide/WASM")
def test_generic_filter_quantize(store: StorePath):
data = np.linspace(0, 10, 256, dtype="float32").reshape((16, 16))

@@ -159,6 +168,8 @@ def test_generic_filter_quantize(store: StorePath):
assert np.allclose(data, a[:, :], atol=0.001)


# TODO: undo skips here when we can test async code in WASM
@pytest.mark.skipif(is_wasm, reason="testing async code not yet supported in Pyodide/WASM")
def test_generic_filter_packbits(store: StorePath):
data = np.zeros((16, 16), dtype="bool")
data[0:4, :] = True
@@ -188,6 +199,8 @@ def test_generic_filter_packbits(store: StorePath):
)


# TODO: undo skips here when we can test async code in WASM
@pytest.mark.skipif(is_wasm, reason="testing async code not yet supported in Pyodide/WASM")
@pytest.mark.parametrize(
"codec_class",
[
13 changes: 13 additions & 0 deletions numcodecs/zarr3.py
@@ -24,6 +24,19 @@

from __future__ import annotations

# Short workaround for skipping the doctest above in a WASM environment
# compiled via Emscripten where threads are not available, and accessing
# the pytest config has no effect and ignoring warnings does not work.
try:
    import pytest
Member

What would it take to not import pytest here?

Author

Last time I checked, we would need a way to run this doctest and not raise a warning that gets converted to an error. The other way could be to not add this doctest here, which is not the best way to proceed.


    from numcodecs.tests.common import is_wasm

    if is_wasm:
        pytest.skip("zarr3 doctests not supported in WASM", allow_module_level=True)
except (ImportError, ModuleNotFoundError):  # not running tests
    pass

import asyncio
import math
from dataclasses import dataclass, replace
34 changes: 34 additions & 0 deletions tools/ci/patches/0001-disable-multiprocessing-and-pthreads.patch
Author

I am also happy to upstream this patch (and the one below) to Blosc if it is suggested to do so. :)

@@ -0,0 +1,34 @@
This patch disables multiprocessing and pthread for blosc. This file
is adapted from and attributed to the Pyodide developers and can be
viewed at the upstream Pyodide repository at the following link:

https://github.com/pyodide/pyodide/blob/d32e376013d8977b66c6aa828042b1fee8047aea/packages/numcodecs/patches/fixblosc.patch


diff --git a/c-blosc/blosc/blosc.h b/c-blosc/blosc/blosc.h
index 40857d0..8a1e969 100644
--- a/c-blosc/blosc/blosc.h
+++ b/c-blosc/blosc/blosc.h
@@ -50,7 +50,7 @@ extern "C" {
((INT_MAX - BLOSC_MAX_TYPESIZE * sizeof(int32_t)) / 3)

/* The maximum number of threads (for some static arrays) */
-#define BLOSC_MAX_THREADS 256
+#define BLOSC_MAX_THREADS 1

/* Codes for shuffling (see blosc_compress) */
#define BLOSC_NOSHUFFLE 0 /* no shuffle */

diff --git a/c-blosc/blosc/blosc.c b/c-blosc/blosc/blosc.c
index a5a5bd5..2a7797c 100644
--- a/c-blosc/blosc/blosc.c
+++ b/c-blosc/blosc/blosc.c
@@ -2236,6 +2236,7 @@ void blosc_atfork_child(void) {

void blosc_init(void)
{
+ g_initlib = 1;
/* Return if we are already initialized */
if (g_initlib) return;

