Feature/trino on eks #441

Closed
wants to merge 34 commits into from
4a28757
first push trino
youngjeong46 Jul 21, 2023
a7ac9e9
Merge branch 'main' into feature/trino-on-eks
youngjeong46 Jul 21, 2023
4cd8a73
Merge branch 'main' into feature/trino-on-eks
youngjeong46 Jul 27, 2023
1d83793
Merge remote-tracking branch 'origin/main' into feature/trino-on-eks
youngjeong46 Aug 1, 2023
f2b55cd
added trino with S3/glue first take
youngjeong46 Aug 1, 2023
116da60
Merge branch 'main' into feature/trino-on-eks
youngjeong46 Sep 28, 2023
94f8293
Merge remote-tracking branch 'origin/main' into feature/trino-on-eks
youngjeong46 Oct 31, 2023
e390fb2
working Iceberg/Hive
youngjeong46 Nov 21, 2023
65ee1c6
pre-commit fix
youngjeong46 Nov 21, 2023
9c56549
Merge branch 'main' into feature/trino-on-eks
youngjeong46 Nov 21, 2023
d457291
Merge branch 'main' into feature/trino-on-eks
youngjeong46 Nov 29, 2023
ed7bafb
Merge branch 'main' into feature/trino-on-eks
youngjeong46 Jan 29, 2024
9ff0044
working Trino with Hive or Iceberg
youngjeong46 Jan 31, 2024
8005e5b
revise trino iceberg connector config to include ingress
youngjeong46 Jan 31, 2024
a8dcff9
Website changes with Iceberg example
ashwinikumar-sa Feb 6, 2024
e83fec7
Added screenshot images
ashwinikumar-sa Feb 6, 2024
acb3fbf
Updated trino.md
ashwinikumar-sa Feb 6, 2024
2598cf1
Updated trino-iceberg.yaml for deploying Trino worker on EC2 Spot
ashwinikumar-sa Feb 11, 2024
f80c53e
Update node-pool.yaml to launch Karpenter nodes in single AZ from gen…
ashwinikumar-sa Feb 11, 2024
aee54d3
version upgrade for terraform eks, Karpenter, and combined hive and i…
youngjeong46 Feb 14, 2024
74a8680
Merge branch 'main' into feature/trino-on-eks
youngjeong46 Feb 14, 2024
0aa69dd
fixed cleanup script
youngjeong46 Feb 14, 2024
9ae75be
Updated hive-setup.sh to specify region in AWS CLI commands
ashwinikumar-sa Feb 16, 2024
e128e33
Update hive-cleanup.sh to specify region in AWS CLI commands
ashwinikumar-sa Feb 16, 2024
58d5e60
Updates trino.md with Fault-tolerant execution example
ashwinikumar-sa Feb 17, 2024
25da3b4
Trino screenshots for fault-tolerance example
ashwinikumar-sa Feb 17, 2024
f57fef6
Update trino.md
ashwinikumar-sa Feb 17, 2024
daaafb9
Update trino.md
ashwinikumar-sa Feb 17, 2024
d4060f2
rearranging some website instructions
youngjeong46 Feb 18, 2024
85b00b5
pre-commit fixes
youngjeong46 Feb 19, 2024
5d9a570
min tf version update + pre-commit fix for EKS
youngjeong46 Feb 19, 2024
ab1b63b
Merge branch 'main' into feature/trino-on-eks
youngjeong46 Feb 19, 2024
7ae856a
rev per PR comments
youngjeong46 Feb 20, 2024
a4915f2
pre-commit fix
youngjeong46 Feb 20, 2024
82 changes: 82 additions & 0 deletions distributed-databases/trino/README.md
@@ -0,0 +1,82 @@
# Trino on EKS
Check out the [documentation website](https://awslabs.github.io/data-on-eks/docs/distributed-databases/trino) to deploy this pattern and run sample tests.

<!-- BEGINNING OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
## Requirements

| Name | Version |
|------|---------|
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 1.3.0 |
| <a name="requirement_aws"></a> [aws](#requirement\_aws) | >= 3.72 |
| <a name="requirement_helm"></a> [helm](#requirement\_helm) | >= 2.4.1 |
| <a name="requirement_kubectl"></a> [kubectl](#requirement\_kubectl) | >= 1.14 |
| <a name="requirement_kubernetes"></a> [kubernetes](#requirement\_kubernetes) | >= 2.10 |
| <a name="requirement_random"></a> [random](#requirement\_random) | 3.4.3 |

## Providers

| Name | Version |
|------|---------|
| <a name="provider_aws"></a> [aws](#provider\_aws) | >= 3.72 |
| <a name="provider_aws.ecr"></a> [aws.ecr](#provider\_aws.ecr) | >= 3.72 |
| <a name="provider_kubectl"></a> [kubectl](#provider\_kubectl) | >= 1.14 |
| <a name="provider_random"></a> [random](#provider\_random) | 3.4.3 |

## Modules

| Name | Source | Version |
|------|--------|---------|
| <a name="module_amp_ingest_irsa"></a> [amp\_ingest\_irsa](#module\_amp\_ingest\_irsa) | aws-ia/eks-blueprints-addon/aws | ~> 1.0 |
| <a name="module_ebs_csi_driver_irsa"></a> [ebs\_csi\_driver\_irsa](#module\_ebs\_csi\_driver\_irsa) | terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks | ~> 5.14 |
| <a name="module_eks"></a> [eks](#module\_eks) | terraform-aws-modules/eks/aws | ~> 20.0 |
| <a name="module_eks_aws_auth"></a> [eks\_aws\_auth](#module\_eks\_aws\_auth) | terraform-aws-modules/eks/aws//modules/aws-auth | ~> 20.0 |
| <a name="module_eks_blueprints_addons"></a> [eks\_blueprints\_addons](#module\_eks\_blueprints\_addons) | aws-ia/eks-blueprints-addons/aws | ~> 1.13 |
| <a name="module_s3_bucket"></a> [s3\_bucket](#module\_s3\_bucket) | terraform-aws-modules/s3-bucket/aws | ~> 3.0 |
| <a name="module_trino_addon"></a> [trino\_addon](#module\_trino\_addon) | aws-ia/eks-blueprints-addon/aws | ~> 1.1.1 |
| <a name="module_trino_exchange_bucket"></a> [trino\_exchange\_bucket](#module\_trino\_exchange\_bucket) | terraform-aws-modules/s3-bucket/aws | ~> 3.0 |
| <a name="module_trino_s3_bucket"></a> [trino\_s3\_bucket](#module\_trino\_s3\_bucket) | terraform-aws-modules/s3-bucket/aws | ~> 3.0 |
| <a name="module_vpc"></a> [vpc](#module\_vpc) | terraform-aws-modules/vpc/aws | ~> 5.0 |

## Resources

| Name | Type |
|------|------|
| [aws_iam_policy.grafana](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_policy) | resource |
| [aws_iam_policy.trino_exchange_bucket_policy](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_policy) | resource |
| [aws_iam_policy.trino_s3_bucket_policy](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_policy) | resource |
| [aws_prometheus_workspace.amp](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/prometheus_workspace) | resource |
| [aws_secretsmanager_secret.grafana](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/secretsmanager_secret) | resource |
| [aws_secretsmanager_secret_version.grafana](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/secretsmanager_secret_version) | resource |
| [kubectl_manifest.karpenter_resources](https://registry.terraform.io/providers/gavinbunney/kubectl/latest/docs/resources/manifest) | resource |
| [random_password.grafana](https://registry.terraform.io/providers/hashicorp/random/3.4.3/docs/resources/password) | resource |
| [aws_availability_zones.available](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/availability_zones) | data source |
| [aws_caller_identity.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/caller_identity) | data source |
| [aws_ecrpublic_authorization_token.token](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/ecrpublic_authorization_token) | data source |
| [aws_eks_cluster_auth.this](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/eks_cluster_auth) | data source |
| [aws_iam_policy.glue_full_access](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy) | data source |
| [aws_iam_policy_document.grafana](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy_document) | data source |
| [aws_iam_policy_document.trino_exchange_access](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy_document) | data source |
| [aws_iam_policy_document.trino_s3_access](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy_document) | data source |
| [aws_partition.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/partition) | data source |
| [aws_secretsmanager_secret_version.admin_password_version](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/secretsmanager_secret_version) | data source |
| [kubectl_path_documents.karpenter_resources](https://registry.terraform.io/providers/gavinbunney/kubectl/latest/docs/data-sources/path_documents) | data source |

## Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_eks_cluster_version"></a> [eks\_cluster\_version](#input\_eks\_cluster\_version) | EKS Cluster version | `string` | `"1.29"` | no |
| <a name="input_enable_amazon_prometheus"></a> [enable\_amazon\_prometheus](#input\_enable\_amazon\_prometheus) | Enable AWS Managed Prometheus service | `bool` | `false` | no |
| <a name="input_name"></a> [name](#input\_name) | Name of the VPC and EKS Cluster | `string` | `"trino-on-eks"` | no |
| <a name="input_namespace"></a> [namespace](#input\_namespace) | Namespace for Trino | `string` | `"trino"` | no |
| <a name="input_region"></a> [region](#input\_region) | Region | `string` | `"us-west-2"` | no |
| <a name="input_trino_sa"></a> [trino\_sa](#input\_trino\_sa) | Service Account name for Trino | `string` | `"trino-sa"` | no |
| <a name="input_vpc_cidr"></a> [vpc\_cidr](#input\_vpc\_cidr) | VPC CIDR | `string` | `"10.1.0.0/16"` | no |

## Outputs

| Name | Description |
|------|-------------|
| <a name="output_configure_kubectl"></a> [configure\_kubectl](#output\_configure\_kubectl) | Configure kubectl: make sure you're logged in with the correct AWS profile and run the following command to update your kubeconfig |
| <a name="output_data_bucket"></a> [data\_bucket](#output\_data\_bucket) | Name of the S3 bucket to use for example Data |
<!-- END OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
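The `configure_kubectl` output above resolves to an `aws eks update-kubeconfig` invocation. A minimal sketch of building that command from the module's default inputs (region `us-west-2`, cluster name `trino-on-eks`); in practice, run the exact string printed by `terraform output configure_kubectl`:

```shell
# Assemble the kubeconfig update command from the default input values shown
# in the Inputs table; override REGION/CLUSTER_NAME if you changed the inputs.
REGION="us-west-2"
CLUSTER_NAME="trino-on-eks"
KUBECONFIG_CMD="aws eks update-kubeconfig --region ${REGION} --name ${CLUSTER_NAME}"
echo "${KUBECONFIG_CMD}"
```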
198 changes: 198 additions & 0 deletions distributed-databases/trino/addons.tf
@@ -0,0 +1,198 @@
#---------------------------------------------------------------
# IRSA for EBS CSI Driver
#---------------------------------------------------------------
module "ebs_csi_driver_irsa" {
  source                = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
  version               = "~> 5.14"
  role_name             = format("%s-%s", local.name, "ebs-csi-driver")
  attach_ebs_csi_policy = true
  oidc_providers = {
    main = {
      provider_arn               = module.eks.oidc_provider_arn
      namespace_service_accounts = ["kube-system:ebs-csi-controller-sa"]
    }
  }
  tags = local.tags
}

#---------------------------------------------------------------
# Grafana Admin credentials resources
#---------------------------------------------------------------
data "aws_secretsmanager_secret_version" "admin_password_version" {
  secret_id  = aws_secretsmanager_secret.grafana.id
  depends_on = [aws_secretsmanager_secret_version.grafana]
}

resource "random_password" "grafana" {
  length           = 16
  special          = true
  override_special = "@_"
}

#tfsec:ignore:aws-ssm-secret-use-customer-key
resource "aws_secretsmanager_secret" "grafana" {
  name_prefix             = "${local.name}-grafana-"
  recovery_window_in_days = 0 # Set to zero for this example to force delete during Terraform destroy
}

resource "aws_secretsmanager_secret_version" "grafana" {
  secret_id     = aws_secretsmanager_secret.grafana.id
  secret_string = random_password.grafana.result
}
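The three resources above follow a generate-store-read-back pattern: `random_password` creates the Grafana admin password, the secret and secret version store it in Secrets Manager, and the data source reads it back for the Helm chart. A hedged HCL sketch of how the secret's name could be surfaced so the password can be fetched after apply — the output name `grafana_secret_name` is an assumption; this PR's `outputs.tf` is not shown in the diff:

```hcl
# Hypothetical output (not part of this diff): expose the secret's name so the
# admin password can be retrieved with
# `aws secretsmanager get-secret-value --secret-id <name>` after apply.
output "grafana_secret_name" {
  description = "Secrets Manager secret holding the Grafana admin password"
  value       = aws_secretsmanager_secret.grafana.name
}
```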

#---------------------------------------------------------------
# EKS Blueprints Addons
#---------------------------------------------------------------
module "eks_blueprints_addons" {
  source  = "aws-ia/eks-blueprints-addons/aws"
  version = "~> 1.13"

  cluster_name      = module.eks.cluster_name
  cluster_endpoint  = module.eks.cluster_endpoint
  cluster_version   = module.eks.cluster_version
  oidc_provider_arn = module.eks.oidc_provider_arn

  #---------------------------------------
  # Amazon EKS Managed Add-ons
  #---------------------------------------
  eks_addons = {
    aws-ebs-csi-driver = {
      service_account_role_arn = module.ebs_csi_driver_irsa.iam_role_arn
    }
    coredns = {
      preserve = true
    }
    vpc-cni = {
      preserve = true
    }
    kube-proxy = {
      preserve = true
    }
  }

  #---------------------------------------
  # Kubernetes Add-ons
  #---------------------------------------
  #---------------------------------------------------------------
  # CoreDNS Autoscaler helps to scale for large EKS Clusters
  # Further tuning for CoreDNS is to leverage NodeLocal DNSCache -> https://kubernetes.io/docs/tasks/administer-cluster/nodelocaldns/
  #---------------------------------------------------------------
  enable_cluster_proportional_autoscaler = true
  cluster_proportional_autoscaler = {
    values = [templatefile("${path.module}/helm-values/coredns-autoscaler-values.yaml", {
      target = "deployment/coredns"
    })]
    description = "Cluster Proportional Autoscaler for CoreDNS Service"
  }

  #---------------------------------------
  # Metrics Server
  #---------------------------------------
  enable_metrics_server = true
  metrics_server = {
    values = [templatefile("${path.module}/helm-values/metrics-server-values.yaml", {})]
  }

  #---------------------------------------
  # Karpenter Autoscaler for EKS Cluster
  #---------------------------------------
  enable_karpenter                  = true
  karpenter_enable_spot_termination = true
  karpenter_node = {
    iam_role_use_name_prefix = false
    iam_role_name            = "${local.name}-karpenter-node"
    iam_role_additional_policies = {
      AmazonSSMManagedInstanceCore = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
    }
  }
  karpenter = {
    chart_version       = "v0.34.0"
    repository_username = data.aws_ecrpublic_authorization_token.token.user_name
    repository_password = data.aws_ecrpublic_authorization_token.token.password
  }

  #---------------------------------------
  # CloudWatch metrics for EKS
  #---------------------------------------
  enable_aws_cloudwatch_metrics = true
  aws_cloudwatch_metrics = {
    values = [templatefile("${path.module}/helm-values/aws-cloudwatch-metrics-values.yaml", {})]
  }

  #---------------------------------------
  # Adding AWS Load Balancer Controller
  #---------------------------------------
  enable_aws_load_balancer_controller = true

  #---------------------------------------
  # AWS for FluentBit - DaemonSet
  #---------------------------------------
  enable_aws_for_fluentbit = true
  aws_for_fluentbit_cw_log_group = {
    use_name_prefix   = false
    name              = "/${local.name}/aws-fluentbit-logs" # Add-on creates this log group
    retention_in_days = 30
  }
  aws_for_fluentbit = {
    s3_bucket_arns = [
      module.s3_bucket.s3_bucket_arn,
      "${module.s3_bucket.s3_bucket_arn}/*"
    ]
    values = [templatefile("${path.module}/helm-values/aws-for-fluentbit-values.yaml", {
      region               = local.region,
      cloudwatch_log_group = "/${local.name}/aws-fluentbit-logs"
      s3_bucket_name       = module.s3_bucket.s3_bucket_id
      cluster_name         = module.eks.cluster_name
    })]
  }

  #---------------------------------------
  # Prometheus and Grafana stack
  #---------------------------------------
  #---------------------------------------------------------------
  # Install the monitoring stack with Prometheus and Grafana
  # 1- Grafana port-forward `kubectl port-forward svc/kube-prometheus-stack-grafana 8080:80 -n kube-prometheus-stack`
  # 2- Grafana Admin user: admin
  # 3- Get admin user password: `aws secretsmanager get-secret-value --secret-id <output.grafana_secret_name> --region $AWS_REGION --query "SecretString" --output text`
  #---------------------------------------------------------------
  enable_kube_prometheus_stack = true
  kube_prometheus_stack = {
    values = [
      var.enable_amazon_prometheus ? templatefile("${path.module}/helm-values/kube-prometheus-amp-enable.yaml", {
        region              = local.region
        amp_sa              = local.amp_ingest_service_account
        amp_irsa            = module.amp_ingest_irsa[0].iam_role_arn
        amp_remotewrite_url = "https://aps-workspaces.${local.region}.amazonaws.com/workspaces/${aws_prometheus_workspace.amp[0].id}/api/v1/remote_write"
        amp_url             = "https://aps-workspaces.${local.region}.amazonaws.com/workspaces/${aws_prometheus_workspace.amp[0].id}"
      }) : templatefile("${path.module}/helm-values/kube-prometheus.yaml", {})
    ]
    chart_version = "48.1.1"
    set_sensitive = [
      {
        name  = "grafana.adminPassword"
        value = data.aws_secretsmanager_secret_version.admin_password_version.secret_string
      }
    ]
  }

  tags = local.tags
}

#---------------------------------------
# Karpenter Provisioners
#---------------------------------------
data "kubectl_path_documents" "karpenter_resources" {
  pattern = "${path.module}/karpenter-resources/node-*.yaml"
  vars = {
    azs            = local.region
    eks_cluster_id = module.eks.cluster_name
  }
}

resource "kubectl_manifest" "karpenter_resources" {
  for_each  = toset(data.kubectl_path_documents.karpenter_resources.documents)
  yaml_body = each.value

  depends_on = [module.eks_blueprints_addons]
}
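The `kubectl_path_documents` data source templates every `karpenter-resources/node-*.yaml` file with the `azs` and `eks_cluster_id` variables and applies each rendered document. A hedged YAML sketch of what such a file might contain — the actual manifests are not shown in this diff, and the NodePool name, node class name, zone suffix (`${azs}a`), and limits below are all assumptions, informed only by the commit message about launching Karpenter nodes in a single AZ and the Karpenter `v0.34.0` (v1beta1 API) chart pinned above:

```yaml
# Hypothetical karpenter-resources/node-pool.yaml; ${azs} and ${eks_cluster_id}
# are the template vars injected by the kubectl_path_documents data source.
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: trino-nodepool # assumed name
spec:
  template:
    spec:
      requirements:
        # Pin to a single AZ derived from the region var, per the PR's
        # "single AZ" commit; the "a" suffix is an assumption.
        - key: topology.kubernetes.io/zone
          operator: In
          values: ["${azs}a"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
      nodeClassRef:
        name: trino-node-class # assumed EC2NodeClass name
  limits:
    cpu: 1000 # assumed ceiling
```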