Releases: nv-legate/legate

v25.03.00

17 Mar 23:04
40d2963

Linux x86 and ARM conda packages with multi-node support (based on UCX or GASNet) are available for this release at https://anaconda.org/legate/legate (GASNet-based packages are under the gex label).

Documentation for this release can be found at https://docs.nvidia.com/legate/25.03/.

New features

Licensing

UX improvements

  • Stop passing default options to Nsight Systems when using the --nsys flag of the legate driver. Any non-default arguments are fully under the user's control, through --nsys-extra.
  • Add the legate.core.ProfileRange Python context manager (and associated C++ API) to annotate sub-spans within a larger task span in the profiler visualization.
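
The context-manager pattern behind such a range annotation can be sketched with the standard library alone. This is an illustration of the concept, not the Legate API: `SpanRecorder` and its `range()` method are hypothetical stand-ins, and the actual `ProfileRange` signature should be taken from the Legate documentation.

```python
import time
from contextlib import contextmanager


class SpanRecorder:
    """Hypothetical stand-in for a profiler: collects (name, start, end) spans."""

    def __init__(self):
        self.spans = []

    @contextmanager
    def range(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the span even if the annotated body raises.
            self.spans.append((name, start, time.perf_counter()))


recorder = SpanRecorder()

# One outer "task" span containing two annotated sub-spans.
with recorder.range("task"):
    with recorder.range("load"):
        data = list(range(1000))
    with recorder.range("compute"):
        total = sum(data)

# Inner spans close first, so they are recorded before the enclosing span.
span_names = [name for name, _, _ in recorder.spans]
```

Each sub-span's start and end fall inside the enclosing span, which is what lets a profiler timeline draw them nested under the task.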

Documentation improvements

Deprecations

  • Variants no longer need to specify the size of their return value. Legate will compute this information automatically.

Miscellaneous

  • The TaskContext is now exposed to Python tasks.
  • Legate is now compatible with NumPy 2.x.
  • Provide a per-processor/per-GPU caching mechanism, useful e.g. for reusing CUDA library handles across tasks.
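
The idea of a per-processor cache can be sketched in plain Python. Everything below is hypothetical and only models the concept: `ProcessorCache`, `get_or_create`, and the integer processor IDs are illustrative names rather than Legate API, and the "handle" is a plain dictionary standing in for something like a CUDA library handle.

```python
class ProcessorCache:
    """Hypothetical cache holding one lazily created object per processor."""

    def __init__(self, factory):
        self._factory = factory  # called at most once per processor ID
        self._entries = {}

    def get_or_create(self, processor_id):
        if processor_id not in self._entries:
            self._entries[processor_id] = self._factory(processor_id)
        return self._entries[processor_id]


created = []


def make_handle(processor_id):
    # Stand-in for an expensive setup step, such as creating a library handle.
    created.append(processor_id)
    return {"processor": processor_id}


cache = ProcessorCache(make_handle)

# Three "tasks": two run on processor 0, one on processor 1.
h0_first = cache.get_or_create(0)
h0_second = cache.get_or_create(0)
h1 = cache.get_or_create(1)
```

The second task on processor 0 receives the object created by the first, so the setup cost is paid once per processor rather than once per task.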

Full changelog: https://docs.nvidia.com/legate/25.03/changes/2503.html

Known issues

  • We are aware of possible performance regressions when using UCX 1.18. We are temporarily restricting our packages to UCX <= 1.17 while we investigate this.

v25.01.00

08 Feb 06:20
9fc6801

This is a closed-source release, governed by the following EULA: https://docs.nvidia.com/legate/25.01/eula.pdf.

Linux x86 and ARM conda packages with multi-node support (based on UCX or GASNet) are available at https://anaconda.org/legate/legate (GASNet-based packages are under the gex label).

Documentation for this release can be found at https://docs.nvidia.com/legate/25.01/.

New features

Memory management

  • There is no longer a separation between the memory pools used for ahead-of-task-execution ("deferred") allocations and task-execution-time ("eager") allocations. The --eager-alloc-percentage flag is thus obsolete. Instead, a task that creates temporary or output buffers during execution must be registered with has_allocations=true, and a new allocation_pool_size() mapper callback must provide an upper bound on the total size of the task's allocations. See https://docs.nvidia.com/legate/25.01/api/cpp/mapping.html for more detailed instructions.
  • Add the offload_to() API, which allows a user to offload a store or array to a particular memory kind, such that any copies in other memories are discarded. This can be useful e.g. to evict an array from GPU memory onto system memory, freeing up space for subsequent GPU tasks.
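
The kind of upper bound that allocation_pool_size() has to report (first item above) comes down to simple arithmetic over the buffers a task may create. The sketch below works through one made-up case in plain Python; `buffer_bytes`, `pool_size_upper_bound`, and the assumed buffers are hypothetical and do not reflect the actual mapper interface, which is described in the linked documentation.

```python
from math import prod


def buffer_bytes(shape, itemsize):
    """Bytes needed for a dense buffer of the given shape and element size."""
    return prod(shape) * itemsize


def pool_size_upper_bound(input_shape):
    """Hypothetical bound for a task that allocates, at most:
    - one float64 scratch buffer with the same shape as its input, and
    - one int32 output buffer with at most one element per input element.
    """
    scratch = buffer_bytes(input_shape, 8)  # float64: 8 bytes per element
    output = buffer_bytes(input_shape, 4)   # int32: 4 bytes per element
    return scratch + output


# A 1024 x 256 input: 2 MiB of scratch plus 1 MiB of output.
bound = pool_size_upper_bound((1024, 256))
```

This sketch ignores any alignment or padding the runtime may add, so a real bound might need some headroom on top of the raw sum.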

I/O

  • Move the HDF5 interface out of the experimental namespace.
  • Use cuFile to accelerate HDF5 reads on the GPU.
  • Add support for reading "binary" HDF5 datasets.

Deprecations

  • Remove the task_target() callback from the Legate mapper. Users who need to restrict where tasks run should use the resource scoping mechanism instead.
  • Drop support for the Maxwell GPU architecture. Legate now requires at least Pascal (sm_60).

Miscellaneous

  • Increase the maximum array dimension from 4 to 6.
  • Record stacktraces on Legate exceptions and error messages.
  • Consider NUMA node topology when allocating CPU cores and memory during automatic machine configuration.
  • Add the LEGATE_LIMIT_STDOUT environment variable, which prints the output from only one of the copies of the top-level program in a multi-process execution.
  • Add legate::LogicalStore::reinterpret_as() to reinterpret the underlying storage of a LogicalStore as another data type.
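
Reinterpreting storage as another data type means viewing the same bytes through a different element type, without copying or converting values. The sketch below shows that general idea with Python's built-in `memoryview.cast`; it is an analogy only and does not use or model the Legate C++ API.

```python
import struct

# Four 32-bit floats packed into a raw byte buffer (native byte order).
storage = bytearray(struct.pack("4f", 1.0, 2.0, 3.0, 4.0))

as_float32 = memoryview(storage).cast("f")
as_uint32 = memoryview(storage).cast("I")  # same bytes, viewed as unsigned ints

# The bit pattern of 1.0f, read through the integer view.
bits_of_one = as_uint32[0]

# No copy is made: a write through one view is visible through the other.
as_uint32[0] = 0
first_float = as_float32[0]
```

Both views have the same element count here because the two types have the same size; the values differ only in how the bytes are interpreted.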

Full changelog: https://docs.nvidia.com/legate/25.01/changes/2501.html

v24.11.01

07 Dec 04:10
29368dc

This is a closed-source release, governed by the following EULA: https://docs.nvidia.com/legate/24.11/eula.pdf.

Linux x86 and ARM conda packages with multi-node support (based on UCX or GASNet) are available at https://anaconda.org/legate/legate (GASNet-based packages are under the gex label).

Documentation for this release can be found at https://docs.nvidia.com/legate/24.11/.

New features

  • Bug fixes for release 24.11.00

v24.11.00

17 Nov 00:49
583cbc0

This is a closed-source release, governed by the following EULA: https://docs.nvidia.com/legate/24.11/eula.pdf.

Linux x86 and ARM conda packages with multi-node support (based on UCX or GASNet) are available at https://anaconda.org/legate/legate (GASNet-based packages are under the gex label).

Documentation for this release can be found at https://docs.nvidia.com/legate/24.11/.

New features

  • Provide an MPI wrapper that the user can compile against their local MPI installation and integrate with an existing build of Legate. This is useful when a user needs to use an MPI installation different from the one Legate was compiled against.
  • Add support for using GASNet as the networking backend, useful on platforms not currently supported by UCX, e.g. Slingshot11. Provide scripts for the user to compile GASNet on their local machine, and integrate with an existing build of Legate.
  • Automatic machine configuration: Legate will now detect the available hardware resources at startup, and no longer needs to be provided information such as the amount of memory to allocate.
  • Print more information on what data is taking up memory when Legate encounters an out-of-memory error.
  • Support scalar parameters, default arguments and reduction privileges in Python tasks.
  • Add support for a concurrent_task_barrier, useful in preventing NCCL deadlocks.
  • Allow tasks to specify that CUDA context synchronization at task exit can be skipped, reducing latency.
  • Experimental support for distributed HDF5 and Zarr I/O.
  • Experimental support for single-CPU/GPU fast-path task execution (skipping the tasking runtime dependency analysis).
  • Experimental implementation of a "bloated" instance prefetching API, which instructs the runtime to create instances encompassing multiple slices of a store ahead of time, potentially reducing intermediate memory usage.
  • full changelog

Known issues

The GPUDirectStorage backend of the HDF5 I/O module (off by default, and enabled with LEGATE_IO_USE_VFD_GDS=1) is not currently working; enabling it will result in a crash. We are working on a fix.

Legate's auto-configuration heuristics will attempt to split CPU cores and system memory evenly across all instantiated OpenMP processors, not accounting for the actual core count and memory limits of each NUMA domain. In cases where the number of OpenMP groups does not evenly divide the number of NUMA domains, this bug may cause unsatisfiable core and memory allocations, resulting in error messages such as:

  • not enough cores in NUMA domain 0 (72 < 284)
  • reservation ('OMP0 proc 1d00000000000005 (worker 8)') cannot be satisfied
  • insufficient memory in NUMA node 4 (102533955584 > 102005473280 bytes) - skipping allocation

These issues should only affect performance if you are actually running computations on the OpenMP cores (rather than using the GPUs for computation). You can always adjust the automatically derived configuration values through LEGATE_CONFIG; see https://docs.nvidia.com/legate/latest/usage.html#resource-allocation.

v24.06.01

10 Sep 20:11
19d55cf

This is a patch release, and includes the following fixes:

This is a closed-source release, governed by the following EULA: https://docs.nvidia.com/legate/24.06/eula.pdf. x86 conda packages with multi-node support (based on UCX) are available at https://anaconda.org/legate/legate-core.

Documentation for this release can be found at https://docs.nvidia.com/legate/24.06/.

v24.06.00

03 Jul 21:45
5dd4902

This release re-implements the Legate API in C++, which significantly reduces the overhead of the control code. This release also introduces the following major features:

  • As a result of the C++ re-implementation of the API, an entire Legate program can now be written in C++ (previously the control code had to be written in Python).
  • The Legate Array API, which extends Legate Stores with support for struct-type and nullable containers, and even containers of variable-length elements (e.g. string containers and sparse array representations).
  • An implementation of STL algorithms based on the Legate API, which allows users to easily express common parallelism patterns without needing to write custom tasks.
  • Support for writing leaf tasks in Python (previously only leaf task implementations in C++ were supported).
  • Integration with Nsight Systems (initial support).

This release bumps the minimum supported CUDA version to 12.0.

This is a closed-source release, governed by the following EULA: https://docs.nvidia.com/legate/24.06/eula.pdf. x86 conda packages with multi-node support (based on UCX) are available at https://anaconda.org/legate/legate-core.

Documentation for this release can be found at https://docs.nvidia.com/legate/24.06/.

v23.11.00

17 Nov 23:49
fd45636

This release focuses on bugfixes and documentation improvements, in particular a formally documented support matrix.

Conda packages for this release are available at https://anaconda.org/legate/legate-core.

What's Changed

🛠️ Improvements

🐛 Bug Fixes

  • Avoid gc infinite loop at runtime destruction time by @manopapad in #842
  • Add missing 12.0 CUDA libraries to env generation script by @manopapad in #850
  • Set Mypy version downloaded in CI by @Jacobfaib in #859
  • Remove numpy from conda build dependencies. by @bdice in #855
  • Control ucx presence in install_info more carefully by @bryevdv in #882

📖 Documentation

New Contributors

Full Changelog: v23.09.00...v23.11.00

v23.09.00

03 Oct 15:24
21ea7b3

This release includes a number of bug fixes for multi-process execution, and quality-of-life improvements to the build system and driver script.

Conda packages for this release are available at https://anaconda.org/legate/legate-core.

What's Changed

🛠️ Improvements

📖 Documentation

  • Update CUDA Toolkit version in documentation by @ipdemes in #822

🐛 Bug Fixes

  • Pre-seed random number generators deterministically, to guard against control replication violations by @ipdemes in #809
  • Enable shard-local future creation for IO by @ipdemes in #835
  • Respect user-supplied PYTHONPATH by @bryevdv in #836
  • Use unordered detach operations by @ipdemes in #823
  • Fix oversubscription support in sharding functors by @ipdemes in #819
  • Respect the type of passed storage in create_store by @manopapad in #834
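
The deterministic pre-seeding fix listed above relies on a general property: if every copy of the top-level program seeds its random number generator with the same value, all copies draw the same sequence and therefore make the same control decisions. A minimal standard-library sketch of that property follows; `SHARED_SEED` and `control_decisions` are illustrative names and not part of Legate.

```python
import random

SHARED_SEED = 1234  # hypothetical value; any constant shared by all processes works


def control_decisions(seed, count=5):
    """Model one process: seed a private generator, then draw `count` values."""
    rng = random.Random(seed)
    return [rng.randint(0, 99) for _ in range(count)]


# Two "processes" seeded identically produce identical sequences.
rank0 = control_decisions(SHARED_SEED)
rank1 = control_decisions(SHARED_SEED)

# A process seeded differently would generally diverge from the others.
rank2 = control_decisions(SHARED_SEED + 1)
```

Using a private `random.Random` instance per model process keeps the example independent of any global generator state.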

New Contributors

Full Changelog: v23.07.00...v23.09.00

v23.07.00

25 Jul 04:51
2b91db4

This release introduces support for resource scoping annotations, which allow parts of the program to be assigned to a subset of the available processors/GPUs. This release also includes more examples of writing Legate libraries, improved logging and safety checks, and a refactoring of legate.core's internals.

Conda packages for this release are available at https://anaconda.org/legate/legate-core.

What's Changed

🚀 New Features

🛠️ Improvements

📖 Documentation

🐛 Bug Fixes

New Contributors

Full Changelog: v23.03.00...v23.07.00

v23.03.00

15 Mar 20:02
3b1b245

This is the beta release of Legate Core.

This release focuses on making it easier for developers to get started building libraries on top of Legate Core, including features like updated API documentation, helper CMake functions for bootstrapping new Legate library projects, and a new "Hello World" library example that demonstrates the use of fundamental Legate API calls.

This release also adds support for using the standard Python interpreter for running Legate programs (in addition to using the custom legate driver script).

Conda packages for this release are available at https://anaconda.org/legate/legate-core.

What's Changed

🐛 Bug Fixes

🚀 New Features

  • Default python interpreter support for Legate by @eddy16112 in #539
  • Build helper functions for legate projects, legate-hello example by @jjwilke in #571

🛠️ Improvements

📖 Documentation

New Contributors

Full Changelog: v23.01.00...v23.03.00