Improve config handling in cpp and python #644

Open
wants to merge 17 commits into
base: branch-25.04
4 changes: 2 additions & 2 deletions cpp/examples/basic_io.cpp
@@ -1,5 +1,5 @@
/*
* Copyright (c) 2021-2024, NVIDIA CORPORATION.
* Copyright (c) 2021-2025, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -124,7 +124,7 @@ int main()
check(a[i] == b[i]);
}
}
kvikio::defaults::thread_pool_nthreads_reset(16);
kvikio::defaults::set_thread_pool_nthreads(16);
{
std::cout << std::endl;
Timer timer;
45 changes: 22 additions & 23 deletions cpp/include/kvikio/defaults.hpp
@@ -99,14 +99,14 @@ class defaults {
[[nodiscard]] static CompatMode compat_mode();

/**
* @brief Reset the value of `kvikio::defaults::compat_mode()`.
* @brief Set the value of `kvikio::defaults::compat_mode()`.
*
* Changing the compatibility mode affects all the new FileHandles whose `compat_mode` argument is
* not explicitly set, but it never affects existing FileHandles.
*
* @param compat_mode Compatibility mode.
*/
static void compat_mode_reset(CompatMode compat_mode);
static void set_compat_mode(CompatMode compat_mode);

/**
* @brief Infer the `AUTO` compatibility mode from the system runtime.
@@ -157,7 +157,7 @@ class defaults {
*
* Note that it is not possible to change the default thread pool. KvikIO will
* always use the same thread pool; however, it is possible to change the number of
* threads in the pool (see `kvikio::default::thread_pool_nthreads_reset()`).
* threads in the pool (see `kvikio::defaults::set_thread_pool_nthreads()`).
*
* @return The default thread pool instance.
*/
@@ -166,81 +166,80 @@ class defaults {
/**
* @brief Get the number of threads in the default thread pool.
*
* Set the default value using `kvikio::default::thread_pool_nthreads_reset()` or by
* Set the default value using `kvikio::defaults::set_thread_pool_nthreads()` or by
* setting the `KVIKIO_NTHREADS` environment variable. If not set, the default value is 1.
*
* @return The number of threads.
*/
[[nodiscard]] static unsigned int thread_pool_nthreads();

/**
* @brief Reset the number of threads in the default thread pool. Waits for all currently running
* @brief Set the number of threads in the default thread pool. Waits for all currently running
* tasks to be completed, then destroys all threads in the pool and creates a new thread pool with
* the new number of threads. Any tasks that were waiting in the queue before the pool was reset
* will then be executed by the new threads. If the pool was paused before resetting it, the new
* pool will be paused as well.
* will then be executed by the new threads.
*
* @param nthreads The number of threads to use.
*/
static void thread_pool_nthreads_reset(unsigned int nthreads);
static void set_thread_pool_nthreads(unsigned int nthreads);

/**
* @brief Get the default task size used for parallel IO operations.
*
* Set the default value using `kvikio::default::task_size_reset()` or by setting
* Set the default value using `kvikio::defaults::set_task_size()` or by setting
* the `KVIKIO_TASK_SIZE` environment variable. If not set, the default value is 4 MiB.
*
* @return The default task size in bytes.
*/
[[nodiscard]] static std::size_t task_size();

/**
* @brief Reset the default task size used for parallel IO operations.
* @brief Set the default task size used for parallel IO operations.
*
* @param nbytes The default task size in bytes.
*/
static void task_size_reset(std::size_t nbytes);
static void set_task_size(std::size_t nbytes);

/**
* @brief Get the default GDS threshold, which is the minimum size to use GDS (in bytes).
*
* To improve the performance of small IO operations, `.pread()` and `.pwrite()` implement a
* shortcut that circumvents the thread pool and uses the POSIX backend directly.
*
* Set the default value using `kvikio::default::gds_threshold_reset()` or by setting the
* Set the default value using `kvikio::defaults::set_gds_threshold()` or by setting the
* `KVIKIO_GDS_THRESHOLD` environment variable. If not set, the default value is 1 MiB.
*
* @return The default GDS threshold size in bytes.
*/
[[nodiscard]] static std::size_t gds_threshold();

/**
* @brief Reset the default GDS threshold, which is the minimum size to use GDS (in bytes).
* @brief Set the default GDS threshold, which is the minimum size to use GDS (in bytes).
* @param nbytes The default GDS threshold size in bytes.
*/
static void gds_threshold_reset(std::size_t nbytes);
static void set_gds_threshold(std::size_t nbytes);

/**
* @brief Get the size of the bounce buffer used to stage data in host memory.
*
* Set the value using `kvikio::default::bounce_buffer_size_reset()` or by setting the
* Set the value using `kvikio::defaults::set_bounce_buffer_size()` or by setting the
* `KVIKIO_BOUNCE_BUFFER_SIZE` environment variable. If not set, the value is 16 MiB.
*
* @return The bounce buffer size in bytes.
*/
[[nodiscard]] static std::size_t bounce_buffer_size();

/**
* @brief Reset the size of the bounce buffer used to stage data in host memory.
* @brief Set the size of the bounce buffer used to stage data in host memory.
*
* @param nbytes The bounce buffer size in bytes.
*/
static void bounce_buffer_size_reset(std::size_t nbytes);
static void set_bounce_buffer_size(std::size_t nbytes);

/**
* @brief Get the maximum number of attempts per remote IO read.
*
* Set the value using `kvikio::default::http_max_attempts_reset()` or by setting
* Set the value using `kvikio::defaults::set_http_max_attempts()` or by setting
* the `KVIKIO_HTTP_MAX_ATTEMPTS` environment variable. If not set, the value is 3.
*
* @return The maximum number of remote IO reads to attempt before raising an
@@ -249,16 +248,16 @@ class defaults {
[[nodiscard]] static std::size_t http_max_attempts();

/**
* @brief Reset the maximum number of attempts per remote IO read.
* @brief Set the maximum number of attempts per remote IO read.
*
* @param attempts The maximum number of attempts to try before raising an error.
*/
static void http_max_attempts_reset(std::size_t attempts);
static void set_http_max_attempts(std::size_t attempts);

/**
* @brief The list of HTTP status codes to retry.
*
* Set the value using `kvikio::default::http_status_codes()` or by setting the
* Set the value using `kvikio::defaults::set_http_status_codes()` or by setting the
* `KVIKIO_HTTP_STATUS_CODES` environment variable. If not set, the default value is
*
* - 429
@@ -272,11 +271,11 @@ class defaults {
[[nodiscard]] static std::vector<int> const& http_status_codes();

/**
* @brief Reset the list of HTTP status codes to retry.
* @brief Set the list of HTTP status codes to retry.
*
* @param status_codes The HTTP status codes to retry.
*/
static void http_status_codes_reset(std::vector<int> status_codes);
static void set_http_status_codes(std::vector<int> status_codes);
};

} // namespace kvikio
14 changes: 7 additions & 7 deletions cpp/src/defaults.cpp
@@ -143,7 +143,7 @@ defaults* defaults::instance()
}
CompatMode defaults::compat_mode() { return instance()->_compat_mode; }

void defaults::compat_mode_reset(CompatMode compat_mode) { instance()->_compat_mode = compat_mode; }
void defaults::set_compat_mode(CompatMode compat_mode) { instance()->_compat_mode = compat_mode; }

CompatMode defaults::infer_compat_mode_if_auto(CompatMode compat_mode) noexcept
{
@@ -169,7 +169,7 @@ BS_thread_pool& defaults::thread_pool() { return instance()->_thread_pool; }

unsigned int defaults::thread_pool_nthreads() { return thread_pool().get_thread_count(); }

void defaults::thread_pool_nthreads_reset(unsigned int nthreads)
void defaults::set_thread_pool_nthreads(unsigned int nthreads)
{
if (nthreads == 0) {
throw std::invalid_argument("number of threads must be a positive integer greater than zero");
@@ -179,7 +179,7 @@ void defaults::thread_pool_nthreads_reset(unsigned int nthreads)

std::size_t defaults::task_size() { return instance()->_task_size; }

void defaults::task_size_reset(std::size_t nbytes)
void defaults::set_task_size(std::size_t nbytes)
{
if (nbytes == 0) {
throw std::invalid_argument("task size must be a positive integer greater than zero");
@@ -189,11 +189,11 @@ void defaults::task_size_reset(std::size_t nbytes)

std::size_t defaults::gds_threshold() { return instance()->_gds_threshold; }

void defaults::gds_threshold_reset(std::size_t nbytes) { instance()->_gds_threshold = nbytes; }
void defaults::set_gds_threshold(std::size_t nbytes) { instance()->_gds_threshold = nbytes; }

std::size_t defaults::bounce_buffer_size() { return instance()->_bounce_buffer_size; }

void defaults::bounce_buffer_size_reset(std::size_t nbytes)
void defaults::set_bounce_buffer_size(std::size_t nbytes)
{
if (nbytes == 0) {
throw std::invalid_argument(
@@ -204,15 +204,15 @@ void defaults::bounce_buffer_size_reset(std::size_t nbytes)

std::size_t defaults::http_max_attempts() { return instance()->_http_max_attempts; }

void defaults::http_max_attempts_reset(std::size_t attempts)
void defaults::set_http_max_attempts(std::size_t attempts)
{
if (attempts == 0) { throw std::invalid_argument("attempts must be a positive integer"); }
instance()->_http_max_attempts = attempts;
}

std::vector<int> const& defaults::http_status_codes() { return instance()->_http_status_codes; }

void defaults::http_status_codes_reset(std::vector<int> status_codes)
void defaults::set_http_status_codes(std::vector<int> status_codes)
{
instance()->_http_status_codes = std::move(status_codes);
}
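
The getter/setter pairs above follow one pattern: a process-wide instance holds the value, the getter returns it, and the `set_*` function validates its argument before storing it. A minimal, self-contained sketch of that pattern follows. The class name `demo_defaults` and its reference-returning `instance()` are hypothetical stand-ins for illustration, not KvikIO's actual implementation.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for a process-wide settings object (not KvikIO's class).
class demo_defaults {
 public:
  // Getter: returns the current task size in bytes.
  [[nodiscard]] static std::size_t task_size() { return instance()._task_size; }

  // Setter: rejects zero before storing, like the validation shown in the diff.
  static void set_task_size(std::size_t nbytes)
  {
    if (nbytes == 0) {
      throw std::invalid_argument("task size must be a positive integer greater than zero");
    }
    instance()._task_size = nbytes;
  }

 private:
  // Function-local static: constructed once, on first use.
  static demo_defaults& instance()
  {
    static demo_defaults inst;
    return inst;
  }

  std::size_t _task_size{4 * 1024 * 1024};  // 4 MiB, the documented default
};
```

Because the setter throws before assigning, a rejected call leaves the previously stored value untouched.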
14 changes: 11 additions & 3 deletions docs/source/api.rst
@@ -31,8 +31,16 @@ Defaults

.. autofunction:: compat_mode

.. autofunction:: compat_mode_reset
.. autofunction:: num_threads

.. autofunction:: get_num_threads
.. autofunction:: task_size

.. autofunction:: num_threads_reset
.. autofunction:: gds_threshold

.. autofunction:: bounce_buffer_size

.. autofunction:: http_status_codes

.. autofunction:: http_max_attempts

.. autofunction:: set
20 changes: 10 additions & 10 deletions docs/source/runtime_settings.rst
@@ -17,37 +17,37 @@ Under ``AUTO``, KvikIO falls back to the compatibility mode:
* when running in Windows Subsystem for Linux (WSL).
* when ``/run/udev`` isn't readable, which typically happens when running inside a docker image not launched with ``--volume /run/udev:/run/udev:ro``.

This setting can also be programmatically controlled by :py:func:`kvikio.defaults.set_compat_mode` and :py:func:`kvikio.defaults.compat_mode_reset`.
This setting can also be programmatically accessed using :py:func:`kvikio.defaults.compat_mode` (getter) and :py:func:`kvikio.defaults.set` (setter).

Thread Pool ``KVIKIO_NTHREADS``
-------------------------------
KvikIO can use multiple threads for IO automatically. Set the environment variable ``KVIKIO_NTHREADS`` to the number of threads in the thread pool. If not set, the default value is 1.

This setting can also be controlled by :py:func:`kvikio.defaults.get_num_threads`, :py:func:`kvikio.defaults.num_threads_reset`, and :py:func:`kvikio.defaults.set_num_threads`.
This setting can also be accessed using :py:func:`kvikio.defaults.num_threads` (getter) and :py:func:`kvikio.defaults.set` (setter).
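
The "environment variable, otherwise a default" behavior described for ``KVIKIO_NTHREADS`` can be sketched as below. This is an illustration only: the helper name `env_or_default` is hypothetical, and the strict digits-only parsing is an assumption rather than KvikIO's actual parsing rule.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical helper: read an unsigned integer setting from the environment,
// returning `fallback` when the variable is not set.
std::size_t env_or_default(char const* name, std::size_t fallback)
{
  char const* raw = std::getenv(name);
  if (raw == nullptr) { return fallback; }

  std::string const text{raw};
  // Assumption: accept only a non-empty run of decimal digits.
  bool const all_digits =
    !text.empty() && std::all_of(text.begin(), text.end(), [](unsigned char c) {
      return std::isdigit(c) != 0;
    });
  if (!all_digits) {
    throw std::invalid_argument(std::string{"invalid value for "} + name + ": " + text);
  }
  return static_cast<std::size_t>(std::stoull(text));
}
```

A caller would write something like `env_or_default("KVIKIO_NTHREADS", 1)` to obtain the documented default of 1 when the variable is absent.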

Task Size ``KVIKIO_TASK_SIZE``
------------------------------
KvikIO splits parallel IO operations into multiple tasks. Set the environment variable ``KVIKIO_TASK_SIZE`` to the maximum task size (in bytes). If not set, the default value is 4194304 (4 MiB).

This setting can also be controlled by :py:func:`kvikio.defaults.task_size`, :py:func:`kvikio.defaults.task_size_reset`, and :py:func:`kvikio.defaults.set_task_size`.
This setting can also be accessed using :py:func:`kvikio.defaults.task_size` (getter) and :py:func:`kvikio.defaults.set` (setter).
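
To make "splits parallel IO operations into multiple tasks" concrete, the sketch below divides a transfer of `total` bytes into (offset, length) pairs of at most `task_size` bytes each. The function name `split_into_tasks` is hypothetical, and this shows only the general chunking idea, not KvikIO's scheduling code.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical helper: cover the byte range [0, total) with consecutive
// (offset, length) chunks, each at most `task_size` bytes long.
std::vector<std::pair<std::size_t, std::size_t>> split_into_tasks(std::size_t total,
                                                                  std::size_t task_size)
{
  if (task_size == 0) {
    throw std::invalid_argument("task size must be a positive integer greater than zero");
  }
  std::vector<std::pair<std::size_t, std::size_t>> tasks;
  std::size_t offset = 0;
  while (offset < total) {
    // The last chunk may be shorter than task_size.
    std::size_t const length = std::min(task_size, total - offset);
    tasks.emplace_back(offset, length);
    offset += length;
  }
  return tasks;
}
```

With the default task size of 4 MiB, a 10 MiB read would yield three chunks under this scheme: two of 4 MiB and one of 2 MiB.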

GDS Threshold ``KVIKIO_GDS_THRESHOLD``
--------------------------------------
To improve the performance of small IO operations, ``.pread()`` and ``.pwrite()`` implement a shortcut that circumvents the thread pool and uses the POSIX backend directly. Set the environment variable ``KVIKIO_GDS_THRESHOLD`` to the minimum size (in bytes) to use GDS. If not set, the default value is 1048576 (1 MiB).

This setting can also be controlled by :py:func:`kvikio.defaults.gds_threshold`, :py:func:`kvikio.defaults.gds_threshold_reset`, and :py:func:`kvikio.defaults.set_gds_threshold`.
This setting can also be accessed using :py:func:`kvikio.defaults.gds_threshold` (getter) and :py:func:`kvikio.defaults.set` (setter).

Size of the Bounce Buffer ``KVIKIO_BOUNCE_BUFFER_SIZE``
-------------------------------------------------------
KvikIO might have to use intermediate host buffers (one per thread) when copying between files and device memory. Set the environment variable ``KVIKIO_BOUNCE_BUFFER_SIZE`` to the size (in bytes) of these "bounce" buffers. If not set, the default value is 16777216 (16 MiB).

This setting can also be controlled by :py:func:`kvikio.defaults.bounce_buffer_size`, :py:func:`kvikio.defaults.bounce_buffer_size_reset`, and :py:func:`kvikio.defaults.set_bounce_buffer_size`.
This setting can also be accessed using :py:func:`kvikio.defaults.bounce_buffer_size` (getter) and :py:func:`kvikio.defaults.set` (setter).

#### HTTP Retries
-----------------
HTTP Retries ``KVIKIO_HTTP_STATUS_CODES``, ``KVIKIO_HTTP_MAX_ATTEMPTS``
------------------------------------------------------------------------

The behavior when a remote IO read returns a error can be controlled through the `KVIKIO_HTTP_STATUS_CODES` and `KVIKIO_HTTP_MAX_ATTEMPTS` environment variables.
The behavior when a remote I/O read returns an error can be controlled through the ``KVIKIO_HTTP_STATUS_CODES`` and ``KVIKIO_HTTP_MAX_ATTEMPTS`` environment variables.

`KVIKIO_HTTP_STATUS_CODES` controls the status codes to retry and can be controlled by :py:func:`kvikio.defaults.http_status_codes`, :py:func:`kvikio.defaults.http_status_codes_reset`, and :py:func:`kvikio.defaults.set_http_status_codes`.
KvikIO will retry a request if any of the HTTP status codes in ``KVIKIO_HTTP_STATUS_CODES`` is received. The default values are ``429, 500, 502, 503, 504``. This setting can also be accessed using :py:func:`kvikio.defaults.http_status_codes` (getter) and :py:func:`kvikio.defaults.set` (setter).

`KVIKIO_HTTP_MAX_ATTEMPTS` controls the maximum number of attempts to make before throwing an exception and can be controlled by :py:func:`kvikio.defaults.http_max_attempts`, :py:func:`kvikio.defaults.http_max_attempts_reset`, and :py:func:`kvikio.defaults.set_http_max_attempts`.
The maximum number of attempts to make before throwing an exception is controlled by ``KVIKIO_HTTP_MAX_ATTEMPTS``. The default value is 3. This setting can also be accessed using :py:func:`kvikio.defaults.http_max_attempts` (getter) and :py:func:`kvikio.defaults.set` (setter).
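
How the two retry settings interact can be sketched as a simple loop: a response whose status code is in the retry list triggers another attempt, and an exception is thrown once the attempt limit is reached. The function `request_with_retries` and its callback parameter are hypothetical. A real client would typically also wait between attempts, which this sketch leaves out.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical helper: call `request` until it returns a status code that is
// not in `retry_codes`, making at most `max_attempts` calls in total.
int request_with_retries(std::function<int()> const& request,
                         std::vector<int> const& retry_codes,
                         std::size_t max_attempts)
{
  if (max_attempts == 0) { throw std::invalid_argument("attempts must be a positive integer"); }
  for (std::size_t attempt = 1; attempt <= max_attempts; ++attempt) {
    int const status = request();
    bool const retryable =
      std::find(retry_codes.begin(), retry_codes.end(), status) != retry_codes.end();
    if (!retryable) { return status; }  // success, or an error that is not retried
  }
  throw std::runtime_error("maximum number of attempts reached");
}
```

Under this scheme a status such as 404 is returned to the caller after a single attempt, because it is not in the default retry list.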