[RELEASE] ucxx v0.42 #362

Merged
merged 27 commits into from
Feb 13, 2025

AyodeAwe

❄️ Code freeze for branch-25.02 and v25.02 release

What does this mean?

Only critical/hotfix level issues should be merged into branch-25.02 until release (merging of this PR).

What is the purpose of this PR?

  • Update documentation
  • Allow testing for the new release
  • Enable a means to merge branch-25.02 into main for the release

raydouglass and others added 27 commits November 15, 2024 09:53
Forward-merge branch-0.41 into branch-0.42
Forward-merge branch-0.41 into branch-0.42
Forward-merge branch-0.41 into branch-0.42
Forward-merge branch-0.41 into branch-0.42
Forward-merge branch-0.41 into branch-0.42
This PR adapts to breaking changes in rmm in rapidsai/rmm#1722.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)

URL: #336
The `[[nodiscard]]` attribute has no effect on a `friend` declaration that is not a definition, and Clang rejects it:
```
ucxx/cpp/include/ucxx/context.h:75:3: error: an attribute list cannot appear here
   75 |   [[nodiscard]] friend std::shared_ptr<Context> createContext(ConfigMap ucxConfig,
      |   ^~~~~~~~~~~~~
```

It also triggers `-Wattributes` in GCC:
```
ucxx/cpp/include/ucxx/context.h:75:49: warning: attribute ignored [-Wattributes]
   75 |   [[nodiscard]] friend std::shared_ptr<Context> createContext(ConfigMap ucxConfig,
      |                                                 ^~~~~~~~~~~~~
ucxx/cpp/include/ucxx/context.h:75:49: note: an attribute that appertains to a friend declaration that is not a definition is ignored
```

Move the `[[nodiscard]]` attribute to the non-friend declaration.
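
The pattern can be sketched in isolation as follows. This is a minimal illustration only: `Widget`, `createWidget` and `widgetValueExample` are hypothetical names, not UCXX code. The attribute sits on the namespace-scope declaration, while the `friend` declaration inside the class carries no attribute.

```cpp
#include <cassert>
#include <memory>

class Widget;

// The attribute belongs on the ordinary (non-friend) declaration.
[[nodiscard]] std::shared_ptr<Widget> createWidget(int value);

class Widget {
 public:
  int value() const { return _value; }

  // The friend declaration only grants access to the private constructor;
  // it carries no attribute, so Clang accepts it and GCC does not warn.
  friend std::shared_ptr<Widget> createWidget(int value);

 private:
  explicit Widget(int value) : _value{value} {}
  int _value;
};

std::shared_ptr<Widget> createWidget(int value)
{
  // std::make_shared cannot reach the private constructor, so `new` is used.
  return std::shared_ptr<Widget>(new Widget(value));
}

// Usage: the result is stored, so the [[nodiscard]] contract is satisfied.
inline int widgetValueExample()
{
  auto widget = createWidget(7);
  assert(widget != nullptr);
  return widget->value();
}
```

Calling `createWidget(7);` as a bare statement would still produce a discarded-result warning, because the attribute on the first declaration applies to the function itself.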

Authors:
  - Yuan Tong (https://github.com/tongyuantongyu)

Approvers:
  - Peter Andreas Entschev (https://github.com/pentschev)

URL: #339
This PR updates our pre-commit to use the latest version of cpplint. This brings in [my fix](cpplint/cpplint#269) that ensures that the pre-commit hook actually works in all environments. Without that fix, depending on the version of Python and setuptools in the base environment the cpplint installation may fail.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Peter Andreas Entschev (https://github.com/pentschev)

Approvers:
  - Peter Andreas Entschev (https://github.com/pentschev)
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)

URL: #338
Remove RMM logger CMake targets that are unused by UCXX, and thus prevent unnecessary linking to spdlog/fmt. Use a forward declaration for `rmm::device_buffer` to avoid symbols being added to the symbol table.
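
The forward-declaration technique can be sketched in a single file as follows. This only illustrates the mechanics: `heavy::Buffer` is a hypothetical stand-in for a type such as `rmm::device_buffer`, and `BufferHolder` is not a UCXX class. The "header" part names the type without including its definition; only the "implementation" part needs the complete type.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// ---- What a public header would contain ----
namespace heavy {
class Buffer;  // forward declaration only, no library header included
}  // namespace heavy

class BufferHolder {
 public:
  explicit BufferHolder(std::size_t size);
  ~BufferHolder();  // defined below, where heavy::Buffer is complete
  std::size_t size() const;

 private:
  // A smart pointer to an incomplete type is allowed in the declaration.
  std::unique_ptr<heavy::Buffer> _buffer;
};

// ---- What the implementation file would contain ----
namespace heavy {
class Buffer {
 public:
  explicit Buffer(std::size_t size) : _size{size} {}
  std::size_t size() const { return _size; }

 private:
  std::size_t _size;
};
}  // namespace heavy

BufferHolder::BufferHolder(std::size_t size)
  : _buffer{std::make_unique<heavy::Buffer>(size)}
{
}

// The destructor must be defined where the pointee type is complete.
BufferHolder::~BufferHolder() = default;

std::size_t BufferHolder::size() const
{
  assert(_buffer != nullptr);
  return _buffer->size();
}
```

Code that includes only the header part never sees the full definition of the forward-declared type, which keeps that dependency out of the including translation units.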

Authors:
  - Peter Andreas Entschev (https://github.com/pentschev)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Lawrence Mitchell (https://github.com/wence-)

URL: #343
Follow-up to #260.

Contributes to rapidsai/build-planning#33

Limits `libucxx` wheel-building to just running once per combination of `(CUDA version, CPU architecture)`... cutting out 8 unnecessary CI jobs per commit.

## Notes for Reviewers

### Why is this safe to do?

Unlike wheels that contain Cython code, `libucxx` wheels don't depend on the Python minor version:

https://github.com/rapidsai/ucxx/blob/ec860d901f944625e506d85adc0e08021fa4ffd4/python/libucxx/pyproject.toml#L48

e.g., they have tags like

```text
libucxx_cu12-0.42.0a18-py3-none-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl
```

Similar filters are being used for most C++ wheel builds across RAPIDS, e.g. https://github.com/rapidsai/cudf/blob/a95fbc88f94df24c3418766fbbea5b6633ff2328/.github/workflows/pr.yaml#L222-L230

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Peter Andreas Entschev (https://github.com/pentschev)
  - Mike Sarahan (https://github.com/msarahan)

URL: #344
Enable various C++ build-time warnings to improve code quality and treat them as errors moving forward. Additionally, fix all existing code that triggered the newly enabled warnings.

Authors:
  - Peter Andreas Entschev (https://github.com/pentschev)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Lawrence Mitchell (https://github.com/wence-)

URL: #340
Bump `pynvml` from `11` to `12`. This version of `pynvml` also now depends on `nvidia-ml-py` for core functionality.

Authors:
  - https://github.com/jakirkham

Approvers:
  - James Lamb (https://github.com/jameslamb)
  - Peter Andreas Entschev (https://github.com/pentschev)

URL: #345
Contributes to rapidsai/build-planning#127

This PR cannot be merged unless nightly CI has passed within the past 7 days, so if it remains unmerged that will itself be an indication that nightly CI needs fixing.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - James Lamb (https://github.com/jameslamb)

URL: #346
Improve `tagProbe` by accepting a tag mask for matching and returning the probed tag information. Also expose the sender endpoint handle to the AM receive callback so that the callback can identify the origin of the message.

Additionally, fix C++ request tests that were being unintentionally skipped.
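
As an illustration of what matching with a tag mask means, here is a minimal sketch that does not use the UCXX or UCX APIs. It assumes the usual UCX convention that only the bits set in the mask take part in the comparison. The tag layout (sender in the upper 32 bits, sequence number in the lower 32 bits) and all names are hypothetical.

```cpp
#include <cstdint>

using Tag = std::uint64_t;

// Only the bits set in `mask` participate in the comparison.
constexpr bool tagMatches(Tag messageTag, Tag wantedTag, Tag mask)
{
  return ((messageTag ^ wantedTag) & mask) == 0;
}

// Hypothetical layout: upper 32 bits = sender ID, lower 32 bits = sequence.
constexpr Tag makeTag(std::uint32_t sender, std::uint32_t sequence)
{
  return (static_cast<Tag>(sender) << 32) | sequence;
}

constexpr Tag kSenderMask = 0xFFFFFFFF00000000ULL;  // compare the sender only
constexpr Tag kFullMask   = 0xFFFFFFFFFFFFFFFFULL;  // compare every bit

// Any sequence number from sender 3 matches when only the sender is masked in.
static_assert(tagMatches(makeTag(3, 17), makeTag(3, 0), kSenderMask),
              "same sender should match under the sender mask");
static_assert(!tagMatches(makeTag(4, 17), makeTag(3, 0), kSenderMask),
              "different sender should not match");
// With the full mask the sequence number must match as well.
static_assert(!tagMatches(makeTag(3, 17), makeTag(3, 0), kFullMask),
              "different sequence should not match under the full mask");
```

Because a masked probe can match several distinct tags, the full tag of the matched message is useful to the caller, which is why returning the probed tag information complements the mask.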

Authors:
  - Peter Andreas Entschev (https://github.com/pentschev)

Approvers:
  - Mads R. B. Kristensen (https://github.com/madsbk)

URL: #348
Retain a copy of headers for AM send requests as a workaround for a possible UCX bug, openucx/ucx#10424.

Unfortunately, reproducing this is not straightforward and it wasn't observed in a stack that can be made into UCXX tests currently, so testing this is not possible at the moment.
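
The general idea of retaining a copy can be sketched as below. This is a simplified illustration under assumptions, not the UCXX implementation: `SendRequest` and `makeRequest` are hypothetical, and no UCX calls are made. The request copies the header bytes into storage it owns, so the data stays valid for as long as the request is alive, even after the caller's buffer is gone.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical request type that owns a copy of the header bytes.
class SendRequest {
 public:
  SendRequest(const void* header, std::size_t length)
    : _header(static_cast<const char*>(header), length)  // deep copy
  {
    assert(header != nullptr);
  }

  // Pointer into storage owned by this request. It remains valid while the
  // request object is alive and is not moved or modified.
  const void* headerData() const { return _header.data(); }
  std::size_t headerSize() const { return _header.size(); }

 private:
  std::string _header;
};

// The caller's buffer is a local array that is destroyed on return; the
// request still holds its own copy of the bytes.
inline SendRequest makeRequest()
{
  const char temporary[] = {'a', 'm', 'h', 'd', 'r'};
  return SendRequest(temporary, sizeof(temporary));
}
```

The cost is one extra copy of the header per request, which is typically small relative to the payload.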

Authors:
  - Peter Andreas Entschev (https://github.com/pentschev)

Approvers:
  - Mads R. B. Kristensen (https://github.com/madsbk)

URL: #349
conda-forge is using GCC 13 for CUDA 12 builds. This PR updates CUDA 12 conda builds to use GCC 13, for alignment.

These PRs should be merged in a specific order, see rapidsai/build-planning#129 for details.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - James Lamb (https://github.com/jameslamb)

URL: #347
Increase the upper UCX pin to 1.19 to allow support for UCX 1.18, which is needed for RAPIDS 25.02.

Authors:
  - Peter Andreas Entschev (https://github.com/pentschev)

Approvers:
  - James Lamb (https://github.com/jameslamb)

URL: #352
Numba 0.61.0 was just released with a couple of breaking changes; this PR is required to unblock CI.

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Peter Andreas Entschev (https://github.com/pentschev)
  - Ray Douglass (https://github.com/raydouglass)

URL: #351
There is no `0.59.1.1` version of numba; the intended change in #351 was `>=0.59.1,<0.61.0a0`.

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #354
Remove the `std::mutex()` constructor call from list-initialization in `ucxx::Endpoint`. This is totally unnecessary and may cause issues with some compilers (or compiler options):

```
...
error: function "std::mutex::mutex(const std::mutex &)" (declared at line 94 of /opt/conda/envs/base/lib/gcc/x86_64-conda-linux-gnu/11.4.0/include/c++/bits/std_mutex.h) cannot be referenced -- it is a deleted function
```
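
A minimal sketch of the pattern, using a hypothetical `Counter` class rather than `ucxx::Endpoint`: the mutex member is simply left out of the constructor's initializer list, because `std::mutex` is default-constructible and cannot be copied.

```cpp
#include <cassert>
#include <mutex>

class Counter {
 public:
  // _mutex is intentionally absent from the initializer list, so it is
  // default-constructed. Writing `_mutex{std::mutex()}` instead initializes
  // the member from a temporary, which some compilers treat as a call to
  // the deleted copy constructor.
  explicit Counter(int start) : _value{start} {}

  int increment()
  {
    std::lock_guard<std::mutex> lock(_mutex);
    return ++_value;
  }

 private:
  std::mutex _mutex;
  int _value;
};

// Usage: a single increment under the lock.
inline int counterExample()
{
  Counter counter(1);
  const int value = counter.increment();
  assert(value == 2);
  return value;
}
```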

Authors:
  - Peter Andreas Entschev (https://github.com/pentschev)

Approvers:
  - Sebastian Berg (https://github.com/seberg)
  - Mads R. B. Kristensen (https://github.com/madsbk)

URL: #355
This PR uses CUDA 12.8.0 to build and test.

xref: rapidsai/build-planning#139

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - James Lamb (https://github.com/jameslamb)

URL: #357
This PR points the shared workflow branches back to the default 25.02 branches.

xref: rapidsai/build-planning#139

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #360
@AyodeAwe AyodeAwe requested review from a team as code owners January 31, 2025 21:40
@AyodeAwe AyodeAwe requested review from gforsyth and removed request for a team January 31, 2025 21:40

copy-pr-bot bot commented Jan 31, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@AyodeAwe AyodeAwe merged commit cab76be into main Feb 13, 2025
1089 of 1096 checks passed
10 participants