Results

https://fivetran.com/blog/warehouse-benchmark

Design

This benchmark is based on TPC-DS, a standard data warehouse benchmark that makes heavy use of joins, aggregations, and subqueries. The TPC-DS queries have been modified somewhat to improve portability across implementations and to eliminate obscure SQL features such as grouping sets. We generated 1 TB of data; the largest fact table contains about 4 billion rows. We used the following warehouse configurations:

Warehouse	Configuration	Cost / Hour
Redshift	5x ra3.4xlarge	$16.30
Snowflake	Large	$16.00
Presto	4x n2-highmem-32	$8.02
BigQuery	Flat-rate 500 slots	$13.70
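
Because the configurations differ in hourly price, it can help to convert a runtime into a dollar figure when comparing warehouses. The sketch below is only an illustration: the hourly prices are copied from the table above, while the function name `cost_of_run` and the 10-second runtime are hypothetical and not part of the benchmark scripts or results. It assumes cost scales linearly with elapsed time, which ignores minimum billing increments.

```python
# Hourly prices copied from the configuration table above.
COST_PER_HOUR = {
    "Redshift": 16.30,
    "Snowflake": 16.00,
    "Presto": 8.02,
    "BigQuery": 13.70,
}


def cost_of_run(warehouse: str, seconds: float) -> float:
    """Return the dollar cost of occupying `warehouse` for `seconds`.

    Assumes linear, per-second pricing (a simplification).
    """
    return COST_PER_HOUR[warehouse] * seconds / 3600.0


# Hypothetical runtime for illustration only; not a measured result.
EXAMPLE_SECONDS = 10.0

for name in COST_PER_HOUR:
    cost = cost_of_run(name, EXAMPLE_SECONDS)
    print(f"{name}: ${cost:.4f} for a {EXAMPLE_SECONDS:.0f}-second query")
```

Substituting measured runtimes for `EXAMPLE_SECONDS` gives a rough cost-per-query comparison alongside the raw timings.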

Usage

These scripts are intended to be copied and pasted manually into separate terminals rather than run unattended. You can skip steps 1-4 because gs://fivetran-benchmark and s3://fivetran-benchmark are already populated.

Latest commit: georgewfraser, "2022 tweaks", Dec 24, 2022 (4de66ee). History: 259 commits.

Name	Last commit message	Last commit date
microsoft_sql	update azure sql	Sep 5, 2018
query	Presto runs	Feb 17, 2020
viz	2022 tweaks	Dec 24, 2022
.gitignore	Presto runs	Feb 17, 2020
001-LaunchDataproc.sh	Rename files so they show up in the right order on Github	Sep 17, 2020
002-GenerateData.sh	Rename files so they show up in the right order on Github	Sep 17, 2020
003-GenerateGs.sh	Rename files so they show up in the right order on Github	Sep 17, 2020
004-CopyToS3.sh	2022 tweaks	Dec 24, 2022
006-LaunchPresto.sh	Rename files so they show up in the right order on Github	Sep 17, 2020
007-ConnectPresto.sh	Rename files so they show up in the right order on Github	Sep 17, 2020
008-ConnectToEc2Instance.sh	Rename files so they show up in the right order on Github	Sep 17, 2020
200-PopulateRedshift.sh	2022 tweaks	Dec 24, 2022
201-BenchmarkRedshift.sh	2022 tweaks	Dec 24, 2022
202-RedshiftTiming.sh	2022 tweaks	Dec 24, 2022
300-PopulateSnowflake.sh	Switch to 1000 scale	Feb 20, 2020
301-BenchmarkSnowflake.sh	Snowflake-100	Feb 19, 2020
302-SnowflakeTiming.sql	Snowflake-100	Feb 19, 2020
400-PopulateBigQuery.sh	2022 tweaks	Dec 24, 2022
401-BenchmarkBigQuery.sh	2022 tweaks	Dec 24, 2022
500-BenchmarkAzure.sh	Trying to use azure copy command	Jun 10, 2020
500-PopulateAzure.sql	Trying to use azure copy command	Jun 10, 2020
502-AzureTiming.sql	Trying to use azure copy command	Jun 10, 2020
600-GenPopulateDatabricks.js	Add config	Feb 25, 2020
601-PopulateDatabricks.sql	Add config	Feb 25, 2020
602-BenchmarkDatabricks.sh	Databricks config	Feb 25, 2020
AzureQueryRunner.sh	Azure commands	Sep 5, 2018
ForwardHttp.sh	Larger nodes, lower CPU ratio	Feb 26, 2020
ForwardProfiler.sh	Name is random now	Dec 2, 2017
MicrosoftTools.sh	Microsoft tool install script	Aug 31, 2018
README.md	Correct link to results (fixers #8)	Oct 17, 2020
Warmup.sql	BQ benchmarks	Jun 30, 2017
