# End-to-end regression tests

Regression tests can be run either in Docker, using Docker Compose to orchestrate the containers, or locally.

## Prerequisites

It is recommended to clean the `regtests/output` directory before running tests. This can be done by running:

```shell
rm -rf ./regtests/output && mkdir -p ./regtests/output && chmod -R 777 ./regtests/output
```

## Run Tests With Docker Compose

Tests can be run with Docker Compose using the provided `./regtests/docker-compose.yml` file, as follows:

```shell
./gradlew :polaris-quarkus-server:assemble -Dquarkus.container-image.build=true
docker compose -f ./regtests/docker-compose.yml up --build --exit-code-from regtest
```

In this setup, a Polaris container is started in a Docker Compose group, using the image previously built by the Gradle build. A second container, which includes a Spark SQL shell, then runs the tests. The overall exit code is the same as the exit code of the Spark container.
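When you are done, the Compose group can be torn down with the standard Docker Compose command, which stops and removes the containers so that the next run starts clean:

```shell
docker compose -f ./regtests/docker-compose.yml down
```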

This is the flow used in CI; running it locally before pushing to GitHub helps ensure that no environmental factors affect the outcome of the tests.

Important: if you are also using minikube, for example to test the Helm chart, you may need to unset the Docker environment variables that point to the Minikube Docker daemon; otherwise, the image will be built by the Minikube Docker daemon and will not be available to the local Docker daemon. Before building the image and running the tests, run:

```shell
eval $(minikube -p minikube docker-env --unset)
```
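To confirm that the unset worked, check that no Minikube-related Docker variables remain in your environment:

```shell
# Should print nothing once the Minikube Docker environment is no longer active.
env | grep -E 'DOCKER_(HOST|CERT_PATH|TLS_VERIFY)'
```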

## Run Tests Locally

Regression tests can be run locally as well, using the test harness.

In this setup, a Polaris server must be running on `localhost:8181` before running tests. The simplest way to do this is to run the Polaris server in a separate terminal window:

```shell
./gradlew run
```

Note: the regression tests expect Polaris to run with certain options, e.g. with support for `FILE` storage, default realm `POLARIS`, and root credentials `root:secret`; if you run the above command, this will be the case. If you run Polaris in a different way, make sure that it is configured appropriately.
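To quickly verify that the server is up before starting the tests, you can check that the port responds; this is a minimal reachability check only and does not validate the configuration described above:

```shell
# Prints an HTTP status code if something is listening on localhost:8181;
# fails with a connection error otherwise.
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8181/
```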

Running the test harness will automatically run the idempotent setup script. From the root of the project, just run:

```shell
env POLARIS_HOST=localhost ./regtests/run.sh
```

To run the tests in verbose mode, with test stdout printed to the console, set the `VERBOSE` environment variable to 1. You can also run only a subset of tests by specifying the test directories as arguments to `run.sh`. For example, to run only the `t_spark_sql` tests in verbose mode:

```shell
env VERBOSE=1 POLARIS_HOST=localhost ./regtests/run.sh t_spark_sql/src/spark_sql_basic.sh
```
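Multiple tests can be selected in a single invocation by listing several arguments; a sketch, assuming both suites exist in your checkout:

```shell
env VERBOSE=1 POLARIS_HOST=localhost ./regtests/run.sh t_hello_world t_spark_sql
```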

## Run with Cloud resources

Several tests require access to cloud resources, such as S3 or GCS. To run these tests, you must export the appropriate environment variables prior to running the tests. Each cloud can be enabled independently. Create a `.env` file that contains the following variables:

```
# AWS variables
AWS_TEST_ENABLED=true
AWS_ACCESS_KEY_ID=<your_access_key>
AWS_SECRET_ACCESS_KEY=<your_secret_key>
AWS_STORAGE_BUCKET=<your_s3_bucket>
AWS_ROLE_ARN=<iam_role_with_access_to_bucket>
AWS_TEST_BASE=s3://<your_s3_bucket>/<any_path>

# GCP variables
GCS_TEST_ENABLED=true
GCS_TEST_BASE=gs://<your_gcs_bucket>
GOOGLE_APPLICATION_CREDENTIALS=/tmp/credentials/<your_credentials.json>

# Azure variables
AZURE_TEST_ENABLED=true
AZURE_TENANT_ID=<your_tenant_id>
AZURE_DFS_TEST_BASE=abfss://<container-name>@<storage-account-name>.dfs.core.windows.net/<any_path>
AZURE_BLOB_TEST_BASE=abfss://<container-name>@<storage-account-name>.blob.core.windows.net/<any_path>
```

`GOOGLE_APPLICATION_CREDENTIALS` must be mounted into the container volumes. Copy your credentials file into the credentials folder, then specify the name of the file in your `.env` file. Do not change the path: `/tmp/credentials` is the folder in the container where the credentials file will be mounted.
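For example, assuming your key file is named `my-gcp-credentials.json` (a hypothetical name) and that the credentials folder lives under `regtests`:

```shell
# Hypothetical file name; the folder location is an assumption based on the text above.
cp /path/to/my-gcp-credentials.json ./regtests/credentials/
# Then reference it in .env:
# GOOGLE_APPLICATION_CREDENTIALS=/tmp/credentials/my-gcp-credentials.json
```

When running the tests locally rather than through Docker Compose, the same variables can be exported from the `.env` file into your shell; a minimal sketch, assuming a POSIX shell and that you run it from the directory containing `.env`:

```shell
set -a          # auto-export every variable assigned from here on
source .env
set +a
env POLARIS_HOST=localhost ./regtests/run.sh
```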

## Fixing a failed test due to incorrect expected output

If a test fails due to incorrect expected output, the test harness will generate a script to help you compare the actual output with the expected output. The script will be located in the output directory and will have the same name as the test, with the extension `.fixdiffs.sh`.

For example, if the test `t_hello_world` fails, the script to compare the actual and expected output will be located at `output/t_hello_world/hello_world.sh.fixdiffs.sh`:

```
Tue Apr 23 06:32:23 UTC 2024: Running all tests
Tue Apr 23 06:32:23 UTC 2024: Starting test t_hello_world:hello_world.sh
Tue Apr 23 06:32:23 UTC 2024: Test run concluded for t_hello_world:hello_world.sh
Tue Apr 23 06:32:23 UTC 2024: Test FAILED: t_hello_world:hello_world.sh
Tue Apr 23 06:32:23 UTC 2024: To compare and fix diffs: /tmp/polaris-regtests/t_hello_world/hello_world.sh.fixdiffs.sh
Tue Apr 23 06:32:23 UTC 2024: Starting test t_spark_sql:spark_sql_basic.sh
Tue Apr 23 06:32:32 UTC 2024: Test run concluded for t_spark_sql:spark_sql_basic.sh
Tue Apr 23 06:32:32 UTC 2024: Test SUCCEEDED: t_spark_sql:spark_sql_basic.sh
```

Simply execute the specified `fixdiffs.sh` file, which will in turn run `meld` and fix the ref file:

```shell
/tmp/polaris-regtests/t_hello_world/hello_world.sh.fixdiffs.sh
```

Then commit the changes to the ref file.
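A typical follow-up, with illustrative paths (the exact location of the ref files depends on the test):

```shell
git diff                         # review the regenerated expected output
git add regtests/t_hello_world   # illustrative path to the fixed ref file(s)
git commit -m "Fix expected output for t_hello_world"
```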

## Run a spark-sql interactive shell

With a Polaris server running, you can run a `spark-sql` interactive shell to test. From the root of the project:

```shell
env POLARIS_HOST=localhost ./regtests/run_spark_sql.sh
```

Some SQL commands that you can try:

```sql
create database db1;
show databases;
create table db1.table1 (id int, name string);
insert into db1.table1 values (1, 'a');
select * from db1.table1;
```

Other commands are available in the `regtests/t_spark_sql/src` directory.

## Python Tests

Python tests are based on pytest. They rely on a Python Polaris client, which is generated from the OpenAPI spec. The client can be generated using two commands:

```shell
# generate the management api client
docker run --rm \
  -v ${PWD}:/local openapitools/openapi-generator-cli generate \
  -i /local/spec/polaris-management-service.yml \
  -g python \
  -o /local/regtests/client/python --additional-properties=packageName=polaris.management --additional-properties=apiNamePrefix=polaris

# generate the iceberg rest client
docker run --rm \
  -v ${PWD}:/local openapitools/openapi-generator-cli generate \
  -i /local/spec/rest-catalog-open-api.yaml \
  -g python \
  -o /local/regtests/client/python --additional-properties=packageName=polaris.catalog --additional-properties=apiNameSuffix="" --additional-properties=apiNamePrefix=Iceberg
```

The tests rely on Python 3.8 or higher. `pyenv` can be used to install a suitable version and map it to the local directory:

```shell
pyenv install 3.8
pyenv local 3.8
```

Once you've done that, you can run `setup.sh` to generate a Python virtual environment (installed at `~/polaris-venv`) and download all of the test dependencies into it. From there, `run.sh` will be able to execute any pytest present.
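If you prefer to invoke pytest directly instead of going through `run.sh`, activating the virtual environment is enough; the test path below is hypothetical:

```shell
source ~/polaris-venv/bin/activate
pytest regtests/t_python/   # hypothetical path; point at any pytest suite
```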

To debug, set up IntelliJ to point at your virtual environment so it can find your test dependencies (see https://www.jetbrains.com/help/idea/configuring-python-sdk.html), then run the test in your IDE.

All of the above is handled automatically when running the regression tests from the Docker image.