Document async_py_requirements added in ExecutionConfig for ExecutionMode.AIRFLOW_ASYNC (astronomer#1545)

related: astronomer#1533 
related: astronomer#1544
pankajkoti authored Feb 19, 2025
1 parent 6ddc3c2 commit bbcc9e3
Showing 2 changed files with 8 additions and 2 deletions.
1 change: 1 addition & 0 deletions docs/configuration/execution-config.rst
@@ -12,3 +12,4 @@ The ``ExecutionConfig`` class takes the following arguments:
- ``dbt_executable_path``: The path to the dbt executable for dag generation. Defaults to dbt if available on the path.
- ``dbt_project_path``: Configures the dbt project location accessible at runtime for dag execution. This is the project path in a docker container for ``ExecutionMode.DOCKER`` or ``ExecutionMode.KUBERNETES``. Mutually exclusive with ``ProjectConfig.dbt_project_path``.
- ``virtualenv_dir`` (new in v1.6): Directory path to locate the (cached) virtual env that should be used for execution when execution mode is set to ``ExecutionMode.VIRTUALENV``.
- ``async_py_requirements`` (new in v1.9): A list of Python packages to install when ``ExecutionMode.AIRFLOW_ASYNC`` (Experimental) is used. This parameter is required only when ``enable_setup_async_task`` and ``enable_teardown_async_task`` are set to ``True``. Example value: ``["dbt-postgres==1.9.0"]``.
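As a hedged sketch of how this parameter might be passed (the ``dbt-postgres`` package here is only an example; use the adapter matching your profile type):

```python
from cosmos import ExecutionConfig
from cosmos.constants import ExecutionMode

# Sketch only: the adapter package must match your target database/profile.
execution_config = ExecutionConfig(
    execution_mode=ExecutionMode.AIRFLOW_ASYNC,
    # Installed into the virtualenv created for the setup/teardown async tasks
    async_py_requirements=["dbt-postgres==1.9.0"],
)
```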
9 changes: 7 additions & 2 deletions docs/getting_started/execution-modes.rst
@@ -304,15 +304,20 @@ You may observe that the compile task takes a bit longer to run due to the latency of uploading the compiled SQLs
remotely (e.g. for the classic ``jaffle_shop`` dbt project, upon compiling it produces about 31 files measuring about 124KB in total, but on a local
machine it took approximately 25 seconds for the task to compile & upload the compiled SQLs to the remote path).
However, it is still a win, as it is a one-time overhead and the subsequent tasks run asynchronously, utilising
Airflow's deferrable operators and supplying them with those compiled SQLs. With this setup task, model tasks no longer require dbt
to be available or installed, eliminating the need to install dbt adapters in the same environment as the Airflow
installation. However, the virtual environment created during execution of the ``SetupAsyncOperator`` must install
the necessary dbt adapter for the setup task to function correctly. This can be achieved by specifying the required
dbt adapter in the ``async_py_requirements`` parameter within the ``ExecutionConfig`` of your ``DbtDag`` or ``DbtTaskGroup``.

Note that the ``airflow_async`` execution mode is currently released as **Experimental** and has the following limitations:

1. **Airflow 2.8 or higher required**: This mode relies on Airflow's `Object Storage <https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/objectstorage.html>`__ feature, introduced in Airflow 2.8, to store and retrieve compiled SQLs.
2. **Limited to dbt models**: Only dbt resource type models are run asynchronously using Airflow deferrable operators. Other resource types are executed synchronously, similar to the local execution mode.
3. **BigQuery support only**: This mode only supports BigQuery as the target database. If a different target is specified, Cosmos will throw an error indicating the target database is unsupported in this mode.
4. **ProfileMapping parameter required**: You need to specify the ``ProfileMapping`` parameter in the ``ProfileConfig`` for your DAG. Refer to the example DAG below for details on setting this parameter.
5. **location parameter required**: You must specify the location of the BigQuery dataset in the ``operator_args`` of the ``DbtDag`` or ``DbtTaskGroup``. The example DAG below provides guidance on this.
6. **async_py_requirements parameter required**: If you're using the default approach of having a setup task, you must specify the necessary dbt adapter Python requirements based on your profile type for the async execution mode in the ``ExecutionConfig`` of your ``DbtDag`` or ``DbtTaskGroup``. The example DAG below provides guidance on this.
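The requirements from the list above can be sketched together as follows; all connection IDs, paths, dataset, and project names are illustrative assumptions, not values from this commit:

```python
from cosmos import DbtDag, ExecutionConfig, ProfileConfig, ProjectConfig
from cosmos.constants import ExecutionMode
from cosmos.profiles import GoogleCloudServiceAccountFileProfileMapping

dag = DbtDag(
    dag_id="my_async_dbt_dag",  # hypothetical name
    project_config=ProjectConfig("/usr/local/airflow/dbt/jaffle_shop"),
    profile_config=ProfileConfig(
        profile_name="default",
        target_name="dev",
        # A ProfileMapping is required for ExecutionMode.AIRFLOW_ASYNC
        profile_mapping=GoogleCloudServiceAccountFileProfileMapping(
            conn_id="gcp_conn",
            profile_args={"project": "my_project", "dataset": "my_dataset"},
        ),
    ),
    execution_config=ExecutionConfig(
        execution_mode=ExecutionMode.AIRFLOW_ASYNC,
        # dbt adapter installed into the setup task's virtualenv
        async_py_requirements=["dbt-bigquery"],
    ),
    # The BigQuery dataset location must be supplied via operator_args
    operator_args={"location": "US"},
    schedule_interval=None,
)
```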

To start leveraging the async execution mode, which currently supports only BigQuery profile type targets, install Cosmos with the additional dependencies below:

