Docs: generated by furiosa-runtime#1456
furiosa-runtime commit: 5b864736005496368a0f6b2dad49760642787215
1 parent 7fe5dd4, commit 25bf5d1
Showing 221 changed files with 35,611 additions and 0 deletions.
1,080 changes: 1,080 additions & 0 deletions
v2025.1.0/en/_modules/furiosa_llm/artifact/builder.html
Large diffs are not rendered by default.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
.. _CloudNativeToolkit: | ||
|
||
#################################### | ||
Cloud Native Toolkit | ||
#################################### | ||
|
||
The FuriosaAI Cloud Native Toolkit is a software stack that enables FuriosaAI NPU products in the Kubernetes and container ecosystem. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
.. _Kubernetes: | ||
|
||
#################################### | ||
Kubernetes Support | ||
#################################### | ||
|
||
The following versions of Kubernetes, Helm, and CRI runtimes are supported: | ||
|
||
* Kubernetes: v1.24.0 or later | ||
* Helm: v3.0.0 or later | ||
* CRI Runtime: `containerd <https://github.com/containerd/containerd>`_ or `CRI-O <https://github.com/cri-o/cri-o>`_ | ||
|
||
.. note:: | ||
|
||
   Docker is officially deprecated as a container runtime in Kubernetes. | ||
   It is recommended to use containerd or CRI-O as the container runtime; | ||
   otherwise, you may face unexpected issues with the device plugin. | ||
   For more information, see the Kubernetes blog post `Don't Panic: Kubernetes and Docker <https://kubernetes.io/blog/2020/12/02/dont-panic-kubernetes-and-docker/>`_. | ||
|
||
|
||
.. toctree:: | ||
:maxdepth: 1 | ||
:caption: Kubernetes Support | ||
|
||
/cloud_native_toolkit/kubernetes/feature_discovery | ||
/cloud_native_toolkit/kubernetes/device_plugin | ||
/cloud_native_toolkit/kubernetes/metrics_exporter | ||
/cloud_native_toolkit/kubernetes/scheduling_npus |
88 changes: 88 additions & 0 deletions
v2025.1.0/en/_sources/cloud_native_toolkit/kubernetes/device_plugin.rst
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,88 @@ | ||
.. _DevicePlugin: | ||
|
||
################################ | ||
Installing Furiosa Device Plugin | ||
################################ | ||
|
||
|
||
Furiosa Device Plugin | ||
================================================================ | ||
The Furiosa device plugin implements the `Kubernetes Device Plugin <https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/>`_ | ||
interface for FuriosaAI NPU devices, and its features are as follows: | ||
|
||
* Discovering Furiosa NPU devices and registering them with the Kubernetes cluster. | ||
* Tracking the health of the devices and reporting it to the Kubernetes cluster. | ||
* Running AI workloads on top of Furiosa NPU devices within a Kubernetes cluster. | ||
|
||
Configuration | ||
---------------------------------------------- | ||
The Furiosa NPU can be integrated into the Kubernetes cluster in various configurations. | ||
A single NPU card can either be exposed as a single resource or partitioned into multiple resources. | ||
Partitioning into multiple resources allows for more granular control. | ||
|
||
The configuration structure is as follows: | ||
|
||
.. code-block:: yaml | ||
   config: | ||
     resourceStrategy: generic | ||
     debugMode: false | ||
     disabledDeviceUUIDListMap: | ||
``resourceStrategy`` defines the resource unit used for NPU scheduling in the cluster. The following table shows the available resource strategies: | ||
|
||
.. list-table:: | ||
:align: center | ||
:widths: 200 200 200 | ||
:header-rows: 1 | ||
|
||
* - Resource Strategy | ||
- Resource Name | ||
- Resource Count Per Card | ||
* - generic | ||
- furiosa.ai/rngd | ||
- 1 | ||
* - single-core | ||
- furiosa.ai/rngd-1core.6gb | ||
- 8 | ||
* - dual-core | ||
- furiosa.ai/rngd-2core.12gb | ||
- 4 | ||
* - quad-core | ||
- furiosa.ai/rngd-4core.24gb | ||
- 2 | ||
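To illustrate how these resource names are consumed, here is a minimal sketch of a Pod that requests one dual-core partition. It assumes the device plugin was deployed with ``resourceStrategy: dual-core``; the Pod name, container name, and image are hypothetical placeholders, not part of the Furiosa documentation.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: npu-example                          # hypothetical Pod name
spec:
  containers:
    - name: app                              # hypothetical container name
      image: example.com/my-npu-app:latest   # hypothetical image
      resources:
        limits:
          # Extended resources are requested in whole units under limits.
          # Resource name taken from the dual-core row of the table above.
          furiosa.ai/rngd-2core.12gb: 1
```

With the ``generic`` strategy, the same Pod would instead request ``furiosa.ai/rngd``, which maps to a whole card.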
|
||
``debugMode`` enables or disables debug mode. The default value is ``false``. | ||
|
||
``disabledDeviceUUIDListMap`` allows disabling specific devices on a per-node basis. It is structured as follows: | ||
|
||
.. code-block:: yaml | ||
   disabledDeviceUUIDListMap: | ||
     node_a: | ||
       - "uuid1" | ||
       - "uuid2" | ||
     node_b: | ||
       - "uuid3" | ||
       - "uuid4" | ||
If ``disabledDeviceUUIDListMap`` is not configured, all devices are enabled by default. | ||
|
||
|
||
Deploying Furiosa Device Plugin with Helm | ||
----------------------------------------- | ||
|
||
The Furiosa device plugin Helm chart is available at https://github.com/furiosa-ai/helm-charts. To customize the deployment, modify ``charts/furiosa-device-plugin/values.yaml``. | ||
|
||
* If ``resourceStrategy`` is not specified, the default value is ``"generic"``. | ||
* If ``debugMode`` is not specified, the default value is ``false``. | ||
* If ``disabledDeviceUUIDListMap`` is not specified, the default value is an empty list ``[]``. | ||
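A values override might look like the following sketch, which mirrors the configuration structure shown earlier in this section. The file name, node name, and UUID are placeholders, and the exact key layout should be checked against the chart's ``values.yaml`` before use.

```yaml
# my-values.yaml (hypothetical file name)
config:
  # Expose each card as eight single-core resources instead of one card.
  resourceStrategy: single-core
  debugMode: true
  disabledDeviceUUIDListMap:
    node_a:          # placeholder node name
      - "uuid1"      # placeholder device UUID to exclude on that node
```

Such a file can be passed to ``helm install`` with the ``-f`` option instead of editing the chart's own ``values.yaml``.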
|
||
You can deploy the Furiosa Device Plugin by running the following commands: | ||
|
||
.. code-block:: sh | ||
helm repo add furiosa https://furiosa-ai.github.io/helm-charts | ||
helm repo update | ||
helm install furiosa-device-plugin furiosa/furiosa-device-plugin -n kube-system | ||
70 changes: 70 additions & 0 deletions
v2025.1.0/en/_sources/cloud_native_toolkit/kubernetes/feature_discovery.rst
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,70 @@ | ||
.. _FeatureDiscovery: | ||
|
||
#################################### | ||
Installing Furiosa Feature Discovery | ||
#################################### | ||
|
||
|
||
Furiosa Feature Discovery and NFD | ||
================================================================ | ||
|
||
The Furiosa Feature Discovery automatically labels Kubernetes nodes with information | ||
about FuriosaAI NPU properties, such as the NPU family, count, and driver versions. | ||
Using these labels, you can schedule your Kubernetes workloads based on specific NPU requirements. | ||
|
||
Furiosa Feature Discovery leverages NFD (Node Feature Discovery), a tool that detects | ||
hardware features and labels Kubernetes nodes. It is recommended to use NFD and | ||
Furiosa Feature Discovery to ensure that the Cloud Native Toolkit is deployed only on nodes | ||
equipped with FuriosaAI NPUs. | ||
|
||
|
||
Labels | ||
----------------------------- | ||
|
||
The following table lists the labels that Furiosa Feature Discovery attaches and their meanings. | ||
|
||
.. list-table:: Labels | ||
:align: center | ||
:header-rows: 1 | ||
:widths: 130 160 260 | ||
|
||
* - Label | ||
- Value | ||
- Description | ||
* - furiosa.ai/npu.count | ||
- n | ||
- Number of NPU devices | ||
* - furiosa.ai/npu.family | ||
- warboy, rngd | ||
- Chip family | ||
* - furiosa.ai/npu.product | ||
- warboy, rngd, rngd-s, rngd-max | ||
- Chip product name | ||
* - furiosa.ai/npu.driver.version | ||
- x.y.z | ||
- NPU device driver version | ||
* - furiosa.ai/npu.driver.version.major | ||
- x | ||
- NPU device driver version major part | ||
* - furiosa.ai/npu.driver.version.minor | ||
- y | ||
- NPU device driver version minor part | ||
* - furiosa.ai/npu.driver.version.patch | ||
- z | ||
- NPU device driver version patch part | ||
* - furiosa.ai/npu.driver.version.metadata | ||
- abcxyz | ||
- NPU device driver version metadata | ||
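As an illustration of scheduling on these labels, the following sketch pins a Pod to nodes whose NPUs belong to the ``rngd`` family by using a standard Kubernetes ``nodeSelector``. The Pod name, container name, and image are hypothetical placeholders.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: rngd-only-example                    # hypothetical Pod name
spec:
  # Only nodes carrying this label (attached by Furiosa Feature Discovery)
  # are eligible to run the Pod.
  nodeSelector:
    furiosa.ai/npu.family: rngd
  containers:
    - name: app                              # hypothetical container name
      image: example.com/my-npu-app:latest   # hypothetical image
```

The other labels in the table, such as ``furiosa.ai/npu.product``, can be used in the same way.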
|
||
|
||
Deploying Furiosa Feature Discovery with Helm | ||
---------------------------------------------- | ||
With the Helm chart, you can install Furiosa Feature Discovery and NFD into your Kubernetes cluster. | ||
The Furiosa Feature Discovery Helm chart is available at https://github.com/furiosa-ai/helm-charts. To customize the deployment, modify ``charts/furiosa-feature-discovery/values.yaml``. | ||
The following commands show how to install them: | ||
|
||
.. code-block:: sh | ||
helm repo add furiosa https://furiosa-ai.github.io/helm-charts | ||
helm repo update | ||
helm install furiosa-feature-discovery furiosa/furiosa-feature-discovery -n kube-system |
135 changes: 135 additions & 0 deletions
v2025.1.0/en/_sources/cloud_native_toolkit/kubernetes/metrics_exporter.rst
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,135 @@ | ||
.. _MetricsExporter: | ||
|
||
################################### | ||
Installing Furiosa Metrics Exporter | ||
################################### | ||
|
||
|
||
Furiosa Metrics Exporter | ||
================================================================ | ||
The Furiosa metrics exporter exposes a collection of metrics related to | ||
FuriosaAI NPU devices in `Prometheus <https://prometheus.io/>`_ format. | ||
In a Kubernetes cluster, you can scrape the metrics provided by furiosa-metrics-exporter | ||
using Prometheus and visualize them with a Grafana dashboard. | ||
This can be easily set up using the `Prometheus Chart <https://github.com/prometheus-community/helm-charts/tree/main/charts/prometheus>`_ | ||
and `Grafana <https://github.com/grafana/helm-charts/tree/main/charts/grafana>`_ | ||
Helm charts, along with the furiosa-metrics-exporter Helm chart. | ||
|
||
|
||
Metrics | ||
----------------------------------- | ||
The exporter is composed of a chain of collectors; each collector is responsible | ||
for collecting specific metrics from the Furiosa NPU devices. | ||
The following table shows the available collectors and metrics: | ||
|
||
|
||
.. list-table:: NPU Metrics | ||
:align: center | ||
:widths: 100 100 100 100 200 | ||
:header-rows: 1 | ||
|
||
* - Collector Name | ||
- Metric | ||
- Type | ||
- Metric Labels | ||
- Description | ||
* - Liveness | ||
- furiosa_npu_alive | ||
- gauge | ||
- arch, core, device, uuid, kubernetes_node_name | ||
- The liveness of the Furiosa NPU device. | ||
* - Temperature | ||
- furiosa_npu_hw_temperature | ||
- guage | ||
- arch, core, device, uuid, kubernetes_node_name, label | ||
- The temperature of the Furiosa NPU device. | ||
* - Power | ||
- furiosa_npu_hw_power | ||
- gauge | ||
- arch, core, device, uuid, kubernetes_node_name, label | ||
- The power consumption of the Furiosa NPU device. | ||
* - Core Utilization | ||
- furiosa_npu_core_utilization | ||
- gauge | ||
- arch, core, device, uuid, kubernetes_node_name | ||
- The core utilization of the Furiosa NPU device. | ||
|
||
All metrics share common metric labels such as arch, core, device, kubernetes_node_name, and uuid. | ||
The following table describes the common metric labels: | ||
|
||
.. list-table:: Common NPU Metrics Label | ||
:align: center | ||
:widths: 100 300 | ||
:header-rows: 1 | ||
|
||
* - Common Metric Label | ||
- Description | ||
* - arch | ||
- The architecture of the Furiosa NPU device. e.g. warboy, rngd | ||
* - core | ||
- The core index or core range of the Furiosa NPU device. e.g. 0, 1, 2, 3, 4, 5, 6, 7, 0-1, 2-3, 0-3, 4-5, 6-7, 4-7, 0-7 | ||
* - device | ||
- The device name of the Furiosa NPU device. e.g. npu0 | ||
* - kubernetes_node_name | ||
- The name of the Kubernetes node where the exporter is running. This label can be missing if the exporter runs directly on the host machine or in a standalone container outside Kubernetes. | ||
* - uuid | ||
- The UUID of the Furiosa NPU device. | ||
|
||
The metric label ``label`` describes additional attributes specific to each metric. | ||
This approach helps avoid having too many metric definitions and effectively aggregates metrics that share common characteristics. | ||
|
||
.. list-table:: NPU Metrics Type | ||
:align: center | ||
:widths: 100 120 200 | ||
:header-rows: 1 | ||
|
||
* - Metric Type | ||
- Label Attribute | ||
- Description | ||
* - Temperature | ||
- peak | ||
- The highest temperature observed from SoC sensors | ||
* - Temperature | ||
- ambient | ||
- The temperature observed from sensors attached to the board | ||
* - Power | ||
- rms | ||
- Root Mean Square (RMS) value of the power consumed by the device, providing an average power consumption metric over a period of time. | ||
|
||
|
||
The following shows an example of the exported metrics: | ||
|
||
.. code-block:: sh | ||
#liveness | ||
furiosa_npu_alive{arch="rngd",core="0-7",device="npu0",kubernetes_node_name="node",uuid="uuid"} 1 | ||
#temperature | ||
furiosa_npu_hw_temperature{arch="rngd",core="0-7",device="npu0",kubernetes_node_name="node",label="peak",uuid="uuid"} 39 | ||
furiosa_npu_hw_temperature{arch="rngd",core="0-7",device="npu0",kubernetes_node_name="node",label="ambient",uuid="uuid"} 35 | ||
#power | ||
furiosa_npu_hw_power{arch="rngd",core="0-7",device="npu0",kubernetes_node_name="node",label="rms",uuid="uuid"} 4795000 | ||
#core utilization | ||
furiosa_npu_core_utilization{arch="rngd",core="0",device="npu0",kubernetes_node_name="node",uuid="uuid"} 90 | ||
furiosa_npu_core_utilization{arch="rngd",core="1",device="npu0",kubernetes_node_name="node",uuid="uuid"} 90 | ||
furiosa_npu_core_utilization{arch="rngd",core="2",device="npu0",kubernetes_node_name="node",uuid="uuid"} 90 | ||
furiosa_npu_core_utilization{arch="rngd",core="3",device="npu0",kubernetes_node_name="node",uuid="uuid"} 90 | ||
furiosa_npu_core_utilization{arch="rngd",core="4",device="npu0",kubernetes_node_name="node",uuid="uuid"} 90 | ||
furiosa_npu_core_utilization{arch="rngd",core="5",device="npu0",kubernetes_node_name="node",uuid="uuid"} 90 | ||
furiosa_npu_core_utilization{arch="rngd",core="6",device="npu0",kubernetes_node_name="node",uuid="uuid"} 90 | ||
furiosa_npu_core_utilization{arch="rngd",core="7",device="npu0",kubernetes_node_name="node",uuid="uuid"} 90 | ||
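Once Prometheus scrapes these metrics, they can be used in alerting rules. The following is a minimal sketch of a Prometheus rule file that fires when the peak SoC temperature stays above a threshold. The group name, alert name, threshold, and duration are illustrative assumptions, and the temperature unit is not stated in this document, so the threshold should be adjusted to the actual unit and hardware limits.

```yaml
groups:
  - name: furiosa-npu-example                  # hypothetical group name
    rules:
      - alert: FuriosaNpuHighPeakTemperature   # hypothetical alert name
        # Selects only the "peak" series of the temperature metric.
        # 80 is an arbitrary illustrative threshold, not a vendor recommendation.
        expr: furiosa_npu_hw_temperature{label="peak"} > 80
        for: 5m                                # condition must hold for 5 minutes
        labels:
          severity: warning
        annotations:
          summary: "High peak temperature on {{ $labels.device }} (node {{ $labels.kubernetes_node_name }})"
```

Filtering on ``label="peak"`` shows why a single metric with a ``label`` attribute is convenient: one metric definition covers both the peak and ambient readings.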
Deploying Furiosa Metrics Exporter with Helm | ||
--------------------------------------------------------- | ||
The Furiosa metrics exporter Helm chart is available at https://github.com/furiosa-ai/helm-charts. To customize the deployment, modify ``charts/furiosa-metrics-exporter/values.yaml``. | ||
For example, the chart automatically creates a Service object with Prometheus annotations to enable metric scraping. You can modify ``values.yaml`` to change the port or disable the Prometheus annotations if needed. | ||
You can deploy the Furiosa Metrics Exporter by running the following commands: | ||
|
||
.. code-block:: sh | ||
helm repo add furiosa https://furiosa-ai.github.io/helm-charts | ||
helm repo update | ||
helm install furiosa-metrics-exporter furiosa/furiosa-metrics-exporter -n kube-system | ||