Update Admission Control docs for RACv2 (#19368)
* Update Admission Control docs for RACv2

Fixes DOC-11670
rmloveland authored Feb 14, 2025
1 parent 8e35769 commit b9938cd
Showing 4 changed files with 34 additions and 25 deletions.
30 changes: 24 additions & 6 deletions src/current/v25.1/admission-control.md
@@ -42,25 +42,43 @@ Admission control can help if your cluster has degraded performance due to the f

Almost all database operations that use CPU or perform storage IO are controlled by the admission control system. From a user's perspective, specific operations that are affected by admission control include:

- - [General SQL queries]({% link {{ page.version.version }}/selection-queries.md %}) have their CPU usage subject to admission control, as well as storage IO for writes to [leaseholder replicas]({% link {{ page.version.version }}/architecture/replication-layer.md %}#leases).
+ - [General SQL queries]({% link {{ page.version.version }}/selection-queries.md %}) have their CPU usage subject to admission control, as well as storage IO for writes to [leaseholder replicas]({% link {{ page.version.version }}/architecture/replication-layer.md %}#leases) and [follower replicas](#replication-admission-control).
- [Bulk data imports]({% link {{ page.version.version }}/import-into.md %}).
- [`COPY`]({% link {{ page.version.version }}/copy.md %}) statements.
- [Deletes]({% link {{ page.version.version }}/delete-data.md %}) (including deletes initiated by [row-level TTL jobs]({% link {{ page.version.version }}/row-level-ttl.md %}); the [selection queries]({% link {{ page.version.version }}/selection-queries.md %}) performed by TTL jobs are also subject to CPU admission control).
- [Backups]({% link {{ page.version.version }}/backup-and-restore-overview.md %}).
- [Schema changes]({% link {{ page.version.version }}/online-schema-changes.md %}), including index and column backfills (on both the [leaseholder replica]({% link {{ page.version.version }}/architecture/replication-layer.md %}#leases) and [follower replicas]({% link {{ page.version.version }}/architecture/replication-layer.md %}#raft)).
- - [Follower replication work]({% link {{ page.version.version }}/architecture/replication-layer.md %}#raft).
+ - [Follower replication work](#replication-admission-control).
- [Raft log entries being written to disk]({% link {{ page.version.version }}/architecture/replication-layer.md %}#raft).
- [Changefeeds]({% link {{ page.version.version }}/create-and-configure-changefeeds.md %}).
- [Intent resolution]({% link {{ page.version.version }}/architecture/transaction-layer.md %}#write-intents).

The following operations are not subject to admission control:

- By default, SQL writes are not subject to admission control on [follower replicas]({% link {{ page.version.version }}/architecture/replication-layer.md %}#raft), unless those writes occur in transactions that are assigned a Quality of Service (QoS) level as described in [Set quality of service level for a session](#set-quality-of-service-level-for-a-session). For writes on follower replicas to be subject to admission control, the session must set `default_transaction_quality_of_service=background`, as in the sketch below.
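
For example, a minimal sketch of opting a session into the `background` QoS level so that its writes on follower replicas are paced by admission control (the session variable is the one named above; exact quoting may vary by client):

```sql
-- Lower this session's quality-of-service level to background so that
-- its writes are subject to admission control on follower replicas.
SET default_transaction_quality_of_service = 'background';
```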

{{site.data.alerts.callout_info}}
Admission control is beneficial when overall cluster health is good but some nodes are experiencing overload. If you see these overload scenarios on many nodes in the cluster, that typically means the cluster needs more resources.
{{site.data.alerts.end}}

### Replication admission control

The admission control subsystem paces all work done at the [replication layer]({% link {{ page.version.version }}/architecture/replication-layer.md %}#raft) to avoid cluster overload. This includes user-facing writes from SQL statements, as well as background (elastic) replication work.

{% include_cached new-in.html version="v25.1" %} The pacing of catch-up writes is controlled at the replication layer to avoid overloading slow or [newly restarted nodes]({% link {{ page.version.version }}/eventlog.md %}#node_restart) with replication flows. This pacing does not slow down user-facing SQL writes; it only reduces the impact of background operations.

At a high level, replication admission control works by:

- Pacing regular SQL writes at the rate of replica quorum. (**New in v25.1**)
- Pacing background (elastic) replication at the rate of the slowest replica.

This has the following effects:

1. Does not overload slow or restarted nodes with replication flows. (**New in v25.1**)
2. Isolates performance between regular and elastic traffic.
3. Paces regular writes at quorum speed. (**New in v25.1**)
4. Paces elastic writes at the slowest replica's speed.

For example, prior to CockroachDB v25.1, when a follower replica was disconnected from its leader (for example, because its node went down and came back), the leader would flood the follower with all of the Raft entries it had missed as soon as the follower returned. In v25.1 and later, the leader paces the entries it sends to the follower. As a result, baseline cluster QPS (queries per second) and latency should not change substantially during perturbations such as [node restarts]({% link {{ page.version.version }}/eventlog.md %}#node_restart).

To monitor the behavior of replication admission control, refer to [UI Overload Dashboard > Replication Admission Control]({% link {{ page.version.version }}/ui-overload-dashboard.md %}#admission-queueing-delay-p99-replication-admission-control).
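
A minimal sketch of inspecting which traffic classes replication admission control applies to, assuming the `kvadmission.flow_control.mode` [cluster setting]({% link {{ page.version.version }}/cluster-settings.md %}) (treat the setting name and its values as assumptions to verify against your version's cluster settings reference):

```sql
-- Assumed setting: kvadmission.flow_control.mode reports whether
-- replication admission control paces only elastic (background)
-- traffic ('apply_to_elastic') or regular traffic as well
-- ('apply_to_all').
SHOW CLUSTER SETTING kvadmission.flow_control.mode;
```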

## Enable and disable admission control

Admission control is enabled by default. To enable or disable admission control, use the following [cluster settings]({% link {{ page.version.version }}/cluster-settings.md %}):
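
For example, a minimal sketch of toggling admission control for KV work, assuming the `admission.kv.enabled` setting (verify the exact setting names against the cluster settings reference):

```sql
-- Admission control is enabled by default. This statement turns
-- KV-level admission control back on if it was previously disabled;
-- assumes the admission.kv.enabled setting.
SET CLUSTER SETTING admission.kv.enabled = true;
```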
4 changes: 2 additions & 2 deletions src/current/v25.1/critical-log-messages.md
@@ -198,7 +198,7 @@ toc: true
- **Action**: Check for details of metrics in the message.
- **Related metrics**:
- `requests.slow.latch`: Number of requests that have been stuck for a long time acquiring latches. Latches moderate access to the KV keyspace for the purpose of evaluating and replicating commands. A slow latch acquisition attempt is often caused by another request holding and not releasing its latches in a timely manner. This in turn can either be caused by a long delay in evaluation (for example, under severe system overload) or by delays at the replication layer. This gauge registering a nonzero value usually indicates a serious problem and should be investigated.
- - `requests.slow.raft`: Number of requests that have been stuck for a long time in the replication layer. An (evaluated) request has to pass through the replication layer, notably the quota pool and raft. If it fails to do so within a highly permissive duration, the gauge is incremented (and decremented again once the request is either applied or returns an error).
+ - `requests.slow.raft`: Number of requests that have been stuck for a long time in the replication layer. An (evaluated) request has to pass through the replication layer. If it fails to do so within a highly permissive duration, the gauge is incremented (and decremented again once the request is either applied or returns an error).
- `requests.slow.lease`: Number of requests that have been stuck for a long time acquiring a lease. This gauge registering a nonzero value usually indicates range or replica unavailability, and should be investigated. Often, you may also notice `requests.slow.raft` register a nonzero value, indicating that the lease requests are not getting a timely response from the replication layer.
- `requests.slow.distsender`: Number of range-bound RPCs currently stuck or retrying for a long time. Note that this is not a good signal for KV health. The remote side of the RPCs tracked here may experience contention, so it is easy to make this metric emit nonzero values by leaving a transaction open for a long time and contending with it from a second transaction, as in the sketch after this list.
- `liveness.heartbeatfailures`: Number of failed node liveness heartbeats from this node.
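
A minimal sketch of the contention scenario just described (the `accounts` table and its schema are hypothetical):

```sql
-- Session 1: open a transaction, write an intent, and leave the
-- transaction open so the intent is held.
BEGIN;
UPDATE accounts SET balance = balance + 1 WHERE id = 1;

-- Session 2 (separate connection): contend on the same row. This
-- statement blocks until session 1 commits or rolls back, and such
-- long-running contention can drive requests.slow.distsender upward.
UPDATE accounts SET balance = balance - 1 WHERE id = 1;
```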
@@ -234,4 +234,4 @@ toc: true
- [Logging Best Practices]({% link {{ page.version.version }}/logging-best-practices.md %})
- [Troubleshoot Self-Hosted Setup]({% link {{ page.version.version }}/cluster-setup-troubleshooting.md %})
- [Common Errors and Solutions]({% link {{ page.version.version }}/common-errors.md %})
- [Essential Metrics for CockroachDB Self-Hosted Deployments]({% link {{ page.version.version }}/essential-metrics-self-hosted.md %})
9 changes: 8 additions & 1 deletion src/current/v25.1/ui-overload-dashboard.md
@@ -86,11 +86,18 @@ This graph shows the 99th percentile latency of requests waiting in the [admissi

## Admission Queueing Delay p99 – Replication Admission Control

- This graph shows the 99th percentile latency of requests waiting in the replication [admission control]({% link {{ page.version.version }}/admission-control.md %}) queue, as tracked by the `kvadmission.flow_controller.regular_wait_duration-p99` and the `kvadmission.flow_controller.elastic_wait_duration-p99` metrics. There are separate lines for regular flow token wait time and elastic (background) flow token wait time. These metrics are indicative of store overload on replicas.
+ This graph shows the 99th percentile latency of requests waiting in the replication [admission control]({% link {{ page.version.version }}/admission-control.md %}) queue, as tracked by the following metrics:

- `kvflowcontrol.eval_wait.regular.duration-p99`
- `kvflowcontrol.eval_wait.elastic.duration-p99`

There are separate lines for regular flow token wait time and elastic (background) flow token wait time. These metrics are indicative of store overload on replicas.

- In the node view, the graph shows the regular flow token wait time and the elastic flow token wait time on the selected node.
- In the cluster view, the graph shows the regular flow token wait time and the elastic flow token wait time across all nodes in the cluster.

For more information about how replication admission control works, refer to [Replication admission control]({% link {{ page.version.version }}/admission-control.md %}#replication-admission-control).
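
Outside the DB Console, a minimal sketch of spot-checking these metrics from SQL, assuming they are exported through the `crdb_internal.node_metrics` virtual table (which reports values for the node you are connected to):

```sql
-- List the replication admission control wait-duration metrics,
-- including the regular and elastic p99 values graphed above.
SELECT name, value
  FROM crdb_internal.node_metrics
 WHERE name LIKE 'kvflowcontrol.eval_wait.%';
```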

## Blocked Replication Streams

This graph shows the blocked replication streams per node in replication [admission control]({% link {{ page.version.version }}/admission-control.md %}), separated by admission priority {regular, elastic}, as tracked by the `kvadmission.flow_controller.regular_blocked_stream_count` and the `kvadmission.flow_controller.elastic_blocked_stream_count` metrics. There are separate lines for blocked regular streams and blocked elastic (background) streams.
16 changes: 0 additions & 16 deletions src/current/v25.1/ui-replication-dashboard.md
@@ -171,22 +171,6 @@ Metric | Description
-------|------------
`{node}` | The rate of `ReplicaUnavailableError` events that have occurred per aggregated interval of time on that node since the `cockroach` process started.

## Paused Follower

<img src="{{ 'images/v24.2/ui_replica_paused_follower.png' | relative_url }}" alt="DB Console Paused Follower" style="border:1px solid #eee;max-width:100%" />

The **Paused Follower** graph displays the number of nonessential replicas in the cluster that have replication paused. A value of `0` indicates that a node is replicating as normal, while a value of `1` indicates that replication has been paused for the listed node.

- In the node view, the graph shows whether replication has been paused, for the selected node.

- In the cluster view, the graph shows each node in the cluster and indicates whether replication has been paused for each node.

On hovering over the graph, the value for the following metric is displayed:

Metric | Description
-------|------------
`{node}` | Whether replication is paused on that node. A value of `0` indicates that the node is replicating as normal, while a value of `1` indicates that replication has been paused for the listed node.

## Replicate Queue Actions: Successes

<img src="{{ 'images/v24.2/ui_replica_queue_actions_successes.png' | relative_url }}" alt="DB Console Replicate Queue Actions: Successes" style="border:1px solid #eee;max-width:100%" />
