Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Databricks E2E] Changes to zip REST API and Integrate with ADF. #1041

Open
wants to merge 21 commits into
base: e2e/databricks/parking-sensors-V1
Choose a base branch
from

Conversation

bsherwin
Copy link
Contributor

Type of PR

  • Documentation changes
  • Code changes
  • Test changes

Purpose

This PR is to integrate the REST API into the Azure Data Factory process. We removed the binary ZIP package for deployment and build that as part of the deployment process.

In addition to modifying the Linked Services to use the new REST endpoints, we also modified the Databricks activity to use the ddo_cluster that was set up during the deployment. This allows the cluster to be shared rather than spinning up a cluster for each databricks task to "standardize" and "transform" the data.

Author pre-publish checklist

  • No PII in logs
  • Made corresponding changes to the documentation

Validation steps

  • Run a deployment from the devcontainer
  • In ADF, navigate to the Linked Services and validate that the Databricks linked service correctly references the ddo_cluster
  • Validate that the ADF pipeline "Copy Data" activities correctly point to the correct REST Endpoints
  • Validate that the ADF pipeline "Databricks" activities correctly point to the correct binaries and are correctly set to a Wheel file.

Issues Closed or Referenced

@bsherwin
Copy link
Contributor Author

@ydaponte pointed out that since we are using the existing cluster, we don't need the wheel as an appended library in the ADF Pipeline. I removed it as unnecessary.

Copy link
Collaborator

@ydaponte ydaponte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggesting some changes - there are 2 things that won't work. The shebang is missing on your app service bash script and the CI pipeline deploy-adf-job is going to fail because is referring to an old field of the databricks linked service definition. Thanks for the great work so far!

e2e_samples/parking_sensors/README.md Outdated Show resolved Hide resolved
e2e_samples/parking_sensors/README.md Show resolved Hide resolved
e2e_samples/parking_sensors/README.md Show resolved Hide resolved
e2e_samples/parking_sensors/src/ddo_transform/Dockerfile Outdated Show resolved Hide resolved
e2e_samples/parking_sensors/src/ddo_transform/Dockerfile Outdated Show resolved Hide resolved
@ydaponte ydaponte removed the request for review from thesqlpro January 23, 2025 13:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants