docs: Runconfig documentation #186

Merged
merged 5 commits into from
Apr 30, 2025
233 changes: 233 additions & 0 deletions docs/runtime/runconfig.md
@@ -0,0 +1,233 @@
# Runtime Configuration

`RunConfig` defines runtime behavior and options for agents in the ADK. It
controls speech and streaming settings, function calling, artifact saving, and
limits on LLM calls.

When constructing an agent run, you can pass a `RunConfig` to customize how the
agent interacts with models, handles audio, and streams responses. By default,
no streaming is enabled and inputs aren’t retained as artifacts. Use `RunConfig`
to override these defaults.

## Class Definition

The `RunConfig` class is a Pydantic model that enforces strict validation of
configuration parameters.

```python
class RunConfig(BaseModel):
    """Configs for runtime behavior of agents."""

    model_config = ConfigDict(
        extra='forbid',
    )

    speech_config: Optional[types.SpeechConfig] = None
    response_modalities: Optional[list[str]] = None
    save_input_blobs_as_artifacts: bool = False
    support_cfc: bool = False
    streaming_mode: StreamingMode = StreamingMode.NONE
    output_audio_transcription: Optional[types.AudioTranscriptionConfig] = None
    max_llm_calls: int = 500
```

## Runtime Parameters

| Parameter | Type | Default | Description |
| :------------------------------ | :------------------------------------------- | :--------------------- | :--------------------------------------------------------------------------------------------------------- |
| `speech_config` | `Optional[types.SpeechConfig]` | `None` | Configures speech synthesis (voice, language) via nested `types.SpeechConfig`. |
| `response_modalities`           | `Optional[list[str]]`                        | `None`                 | List of desired output modalities (e.g., `["TEXT", "AUDIO"]`). When left as `None`, the agent responds with AUDIO. |
| `save_input_blobs_as_artifacts` | `bool` | `False` | If `True`, saves input blobs (e.g., uploaded files) as run artifacts for debugging/auditing. |
| `support_cfc` | `bool` | `False` | Enables Compositional Function Calling. Requires `streaming_mode=SSE` and uses the LIVE API. **Experimental.** |
| `streaming_mode` | `StreamingMode` | `StreamingMode.NONE` | Sets the streaming behavior: `NONE` (default), `SSE` (server-sent events), or `BIDI` (bidirectional). |
| `output_audio_transcription` | `Optional[types.AudioTranscriptionConfig]` | `None` | Configures transcription of generated audio output via `types.AudioTranscriptionConfig`. |
| `max_llm_calls` | `int` | `500` | Limits total LLM calls per run. `0` or negative means unlimited (warned); `sys.maxsize` raises `ValueError`. |

### `speech_config`

Speech configuration settings for live agents with audio capabilities. The
`SpeechConfig` class has the following structure:

```python
class SpeechConfig(_common.BaseModel):
    """The speech generation configuration."""

    voice_config: Optional[VoiceConfig] = Field(
        default=None,
        description="""The configuration for the speaker to use.""",
    )
    language_code: Optional[str] = Field(
        default=None,
        description="""Language code (ISO 639. e.g. en-US) for the speech synthesization.
        Only available for Live API.""",
    )
```

The `voice_config` parameter uses the `VoiceConfig` class:

```python
class VoiceConfig(_common.BaseModel):
    """The configuration for the voice to use."""

    prebuilt_voice_config: Optional[PrebuiltVoiceConfig] = Field(
        default=None,
        description="""The configuration for the speaker to use.""",
    )
```

And `PrebuiltVoiceConfig` has the following structure:

```python
class PrebuiltVoiceConfig(_common.BaseModel):
    """The configuration for the prebuilt speaker to use."""

    voice_name: Optional[str] = Field(
        default=None,
        description="""The name of the prebuilt voice to use.""",
    )
```

These nested configuration classes allow you to specify:

* `voice_config`: The voice to use, selected by `voice_name` in the nested `PrebuiltVoiceConfig`
* `language_code`: ISO 639 language code (e.g., "en-US") for speech synthesis; only available for the Live API

When implementing voice-enabled agents, configure these parameters to control
how your agent sounds when speaking.

### `response_modalities`

Defines the output modalities for the agent. The field itself defaults to
`None`; when it is not set, the agent responds with AUDIO. Response modalities
determine the channels (e.g., text, audio) through which the agent communicates
with users.

### `save_input_blobs_as_artifacts`

When enabled, input blobs will be saved as artifacts during agent execution.
This is useful for debugging and audit purposes, allowing developers to review
the exact data received by agents.

### `support_cfc`

Enables Compositional Function Calling (CFC) support. Only applicable when using
`StreamingMode.SSE`. When enabled, the LIVE API is invoked, because it is the
only API that supports CFC.

!!! warning

    The `support_cfc` feature is experimental and its API or behavior might
    change in future releases.
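
The SSE precondition can be expressed as a small guard. The following is a minimal standard-library sketch, not ADK code: `RunSettings` and `cfc_is_applicable` are hypothetical names, and the `StreamingMode` enum is a local stand-in whose member values are illustrative only.

```python
from dataclasses import dataclass
from enum import Enum


class StreamingMode(Enum):
    """Local stand-in for the ADK enum; member values are illustrative."""
    NONE = "none"
    SSE = "sse"
    BIDI = "bidi"


@dataclass
class RunSettings:
    """Hypothetical stand-in for the two RunConfig fields involved."""
    support_cfc: bool = False
    streaming_mode: StreamingMode = StreamingMode.NONE


def cfc_is_applicable(settings: RunSettings) -> bool:
    """Return True only when CFC is requested together with SSE streaming."""
    return settings.support_cfc and settings.streaming_mode is StreamingMode.SSE
```

A guard like this makes it easy to spot a configuration that sets `support_cfc=True` but leaves the streaming mode at `NONE`, where the flag would not apply.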

### `streaming_mode`

Configures the streaming behavior of the agent. Possible values:

* `StreamingMode.NONE`: No streaming; responses delivered as complete units
* `StreamingMode.SSE`: Server-Sent Events streaming; one-way streaming from server to client
* `StreamingMode.BIDI`: Bidirectional streaming; simultaneous communication in both directions

Streaming modes affect both performance and user experience. SSE streaming lets users see partial responses as they're generated, while BIDI streaming enables real-time interactive experiences.
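
To make the difference between `NONE` and `SSE` concrete, here is a standard-library sketch of the two delivery styles. It does not use the ADK: `deliver` is a hypothetical helper, the enum is a local stand-in with illustrative values, and `BIDI` is left out because it needs a two-way channel.

```python
from enum import Enum
from typing import Iterable, Iterator


class StreamingMode(Enum):
    """Local stand-in for the ADK enum; member values are illustrative."""
    NONE = "none"
    SSE = "sse"


def deliver(chunks: Iterable[str], mode: StreamingMode) -> Iterator[str]:
    """Yield one complete response for NONE, or each partial chunk for SSE."""
    if mode is StreamingMode.NONE:
        # Non-streaming: the caller sees nothing until the response is complete.
        yield "".join(chunks)
    else:
        # SSE-style: partial output is forwarded as soon as it is produced.
        yield from chunks


chunks = ["Hel", "lo, ", "world"]
print(list(deliver(chunks, StreamingMode.NONE)))  # ['Hello, world']
print(list(deliver(chunks, StreamingMode.SSE)))   # ['Hel', 'lo, ', 'world']
```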

### `output_audio_transcription`

Configuration for transcribing audio outputs from live agents with audio
response capability. This enables automatic transcription of audio responses for
accessibility, record-keeping, and multi-modal applications.

### `max_llm_calls`

Sets a limit on the total number of LLM calls for a given agent run.

* Values greater than 0 and less than `sys.maxsize`: Enforces a bound on LLM calls
* Values less than or equal to 0: Allows unbounded LLM calls *(not recommended for production)*

This parameter prevents excessive API usage and potential runaway processes.
Since LLM calls often incur costs and consume resources, setting appropriate
limits is crucial.
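
As an illustration of what such a bound means in practice, the sketch below counts calls and fails once the limit is exceeded. `LlmCallBudget` is a hypothetical class written for this page, and raising `RuntimeError` is an assumption; the ADK's own enforcement may differ.

```python
class LlmCallBudget:
    """Hypothetical per-run counter; not part of the ADK API."""

    def __init__(self, max_llm_calls: int = 500) -> None:
        self.max_llm_calls = max_llm_calls
        self.calls_made = 0

    def record_call(self) -> None:
        """Count one LLM call and fail once the bound is exceeded."""
        self.calls_made += 1
        # A limit of 0 or less is treated as unbounded, mirroring the rule above.
        if 0 < self.max_llm_calls < self.calls_made:
            raise RuntimeError(
                f"LLM call limit of {self.max_llm_calls} exceeded"
            )
```

With `LlmCallBudget(2)`, the first two calls to `record_call()` succeed and the third raises.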

## Validation Rules

As a Pydantic model, `RunConfig` automatically validates parameter types.
In addition, it includes specific validation logic for the `max_llm_calls`
parameter:

1. If set to `sys.maxsize`, a `ValueError` is raised to prevent integer overflow issues
2. If less than or equal to 0, a warning is logged about potential unlimited LLM calls
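
The two rules can be sketched as a plain function. This is a standard-library approximation for illustration, not the validator shipped with `RunConfig`; the function name and message wording are assumptions.

```python
import logging
import sys

logger = logging.getLogger(__name__)


def validate_max_llm_calls(value: int) -> int:
    """Apply the two documented rules and return the value unchanged."""
    if value == sys.maxsize:
        # Rule 1: sys.maxsize is rejected outright.
        raise ValueError(f"max_llm_calls should be less than {sys.maxsize}.")
    if value <= 0:
        # Rule 2: non-positive values are accepted, but a warning is logged.
        logger.warning("max_llm_calls is <= 0; LLM calls will be unbounded.")
    return value
```

Note that a non-positive value is not an error: it passes validation and only produces a log entry.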

## Examples

### Basic runtime configuration

```python
# RunConfig and StreamingMode are defined in the google-adk package.
from google.adk.agents.run_config import RunConfig, StreamingMode

config = RunConfig(
    streaming_mode=StreamingMode.NONE,
    max_llm_calls=100,
)
```

This configuration creates a non-streaming agent with a limit of 100 LLM calls,
suitable for simple task-oriented agents where complete responses are
preferable.

### Enabling streaming

```python
# RunConfig and StreamingMode are defined in the google-adk package.
from google.adk.agents.run_config import RunConfig, StreamingMode

config = RunConfig(
    streaming_mode=StreamingMode.SSE,
    max_llm_calls=200,
)
```

Using SSE streaming allows users to see responses as they're generated,
providing a more responsive feel for chatbots and assistants.

### Enabling speech support

```python
# RunConfig and StreamingMode are defined in the google-adk package.
from google.adk.agents.run_config import RunConfig, StreamingMode
from google.genai import types

config = RunConfig(
    speech_config=types.SpeechConfig(
        language_code="en-US",
        voice_config=types.VoiceConfig(
            prebuilt_voice_config=types.PrebuiltVoiceConfig(
                voice_name="Kore"
            )
        ),
    ),
    response_modalities=["AUDIO", "TEXT"],
    save_input_blobs_as_artifacts=True,
    support_cfc=True,
    streaming_mode=StreamingMode.SSE,
    max_llm_calls=1000,
)
```

This comprehensive example configures an agent with:

* Speech capabilities using the "Kore" voice (US English)
* Both audio and text output modalities
* Artifact saving for input blobs (useful for debugging)
* Experimental CFC support enabled
* SSE streaming for responsive interaction
* A limit of 1000 LLM calls

### Enabling Experimental CFC Support

```python
# RunConfig and StreamingMode are defined in the google-adk package.
from google.adk.agents.run_config import RunConfig, StreamingMode

config = RunConfig(
    streaming_mode=StreamingMode.SSE,
    support_cfc=True,
    max_llm_calls=150,
)
```

With Compositional Function Calling enabled, the agent can dynamically execute
functions based on model outputs, which is useful for applications that require
complex workflows.
15 changes: 8 additions & 7 deletions mkdocs.yml
@@ -43,16 +43,16 @@ theme:
features:
- content.code.annotate
- content.code.copy
- content.tabs.link
- content.code.select
- content.tabs.link
- navigation.footer
- navigation.indexes
- navigation.instant
- navigation.instant.progress
- navigation.path
- navigation.top
- navigation.tracking
- navigation.indexes
- toc.follow
- navigation.footer

# Extensions
markdown_extensions:
@@ -126,6 +126,9 @@ nav:
- MCP tools: tools/mcp-tools.md
- OpenAPI tools: tools/openapi-tools.md
- Authentication: tools/authentication.md
- Running Agents:
- Agent Runtime: runtime/index.md
- Runtime Config: runtime/runconfig.md
- Deploy:
- deploy/index.md
- Agent Engine: deploy/agent-engine.md
@@ -136,14 +139,12 @@ nav:
- Session: sessions/session.md
- State: sessions/state.md
- Memory: sessions/memory.md
- Artifacts:
- artifacts/index.md
- Callbacks:
- callbacks/index.md
- Types of callbacks: callbacks/types-of-callbacks.md
- Callback patterns: callbacks/design-patterns-and-best-practices.md
- Runtime:
- runtime/index.md
- Artifacts:
- artifacts/index.md
- Events:
- events/index.md
- Context: