Commit

Merge pull request #139 from databrickslabs/feature/v0.0.9
Feature/v0.0.9
pohlposition authored Feb 11, 2025
2 parents 95867fd + c1d2458 commit a234724
Showing 113 changed files with 6,534 additions and 1,859 deletions.
1 change: 0 additions & 1 deletion .coveragerc
@@ -8,7 +8,6 @@ omit =
src/install.py
src/uninstall.py
src/config.py
-src/cli.py

[report]
exclude_lines =
2 changes: 1 addition & 1 deletion .flake8
@@ -1,6 +1,6 @@
[flake8]
ignore = BLK100,E402,W503
-exclude = .git,__pycache__,docs/source/conf.py,old,build,dist,dist,.eggs
+exclude = .git,__pycache__,docs/source/conf.py,old,build,dist,dist,.eggs,integration_tests/notebooks/*/*.py,demo/notebooks/*/*.py,.venv
builtins = dlt,dbutils,spark,display,log_integration_test,pyspark.dbutils
max-line-length = 120
per-file-ignores =
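
Review note on the new `exclude` entries: the two notebook patterns are shell-style globs. The sketch below approximates which files they cover using Python's `fnmatch`. The file paths are hypothetical examples, not paths taken from this commit, and flake8's real matching also normalizes paths before comparing, so treat this as an approximation rather than flake8's exact behavior.

```python
from fnmatch import fnmatch

# The two glob patterns added to `exclude` in this hunk.
NOTEBOOK_PATTERNS = [
    "integration_tests/notebooks/*/*.py",
    "demo/notebooks/*/*.py",
]


def is_excluded(path, patterns=NOTEBOOK_PATTERNS):
    """Return True if `path` matches any of the glob patterns."""
    return any(fnmatch(path, pattern) for pattern in patterns)


# Hypothetical paths, used only to illustrate the patterns.
candidates = [
    "integration_tests/notebooks/some_runner/init.py",
    "demo/notebooks/some_demo/setup.py",
    "src/dataflow_pipeline.py",
]

for candidate in candidates:
    print(candidate, is_excluded(candidate))
```

Only files under the two notebook trees match; ordinary source files stay linted.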
26 changes: 23 additions & 3 deletions .github/workflows/onpush.yml
@@ -33,10 +33,30 @@ jobs:

- name: Install coverage
run: pip install coverage


- name: Install psutil
run: pip install psutil

- name: Lint
run: flake8
run: flake8

- name: set spark local
run: export SPARK_LOCAL_IP=127.0.0.1

- name: set spark executor memory
run: export SPARK_EXECUTOR_MEMORY=8g

- name: set spark driver memory
run: export SPARK_DRIVER_MEMORY=8g

- name: set javaopts
run: export JAVA_OPTS="-Xmx10g -XX:+UseG1GC"

- name: Print System Information
run: |
python -c "import psutil; import os;
print(f'Physical Memory: {psutil.virtual_memory().total / 1e9:.2f} GB'); print(f'CPU Cores: {os.cpu_count()}')"
- name: Run Unit Tests
run: python -m coverage run
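
Review note on the four `export` steps above: in GitHub Actions each `run` step starts its own shell, so a variable exported in one step is not visible in later steps such as "Run Unit Tests". Values that should carry over are normally appended to the file named by `$GITHUB_ENV`, or set under an `env:` key. The Python sketch below imitates that behavior locally with plain subprocesses. The `run_step` helper and the temporary env file are illustrative assumptions that stand in for the runner, not part of this workflow.

```python
import os
import subprocess
import tempfile


def run_step(script, env):
    """Run one 'step' in a fresh shell, as a CI runner would, and return stdout."""
    result = subprocess.run(
        ["sh", "-c", script], env=env, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


base_env = {"PATH": os.environ.get("PATH", "/usr/bin:/bin")}

# Step 1 exports a variable; step 2 runs in a new shell and does not see it.
run_step("export SPARK_LOCAL_IP=127.0.0.1", base_env)
seen_after_export = run_step('echo "${SPARK_LOCAL_IP:-unset}"', base_env)

# Emulate the env-file mechanism: step 1 appends KEY=VALUE to a file,
# and the "runner" loads that file into the environment of later steps.
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as env_file:
    env_file_path = env_file.name
run_step(f'echo "SPARK_LOCAL_IP=127.0.0.1" >> "{env_file_path}"', base_env)
with open(env_file_path) as handle:
    persisted = dict(line.strip().split("=", 1) for line in handle if "=" in line)
os.remove(env_file_path)
seen_after_env_file = run_step(
    'echo "${SPARK_LOCAL_IP:-unset}"', {**base_env, **persisted}
)

print("after export step:", seen_after_export)
print("after env-file step:", seen_after_env_file)
```

If the Spark and Java settings are meant to reach the unit tests, writing them to `$GITHUB_ENV` or declaring them under `env:` on the test step would be the usual way to do it.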

@@ -51,4 +71,4 @@ jobs:
name: codecov-umbrella
path_to_write_report: ./coverage/codecov_report.txt
verbose: true


8 changes: 6 additions & 2 deletions .gitignore
@@ -150,8 +150,12 @@ deployment-merged.yaml
#IDE
.idea/
.vscode/

.databricks
.databricks-login.json
demo/conf/onboarding.json
integration_tests/conf/onboarding.json
integration_tests/conf/onboarding*.json
demo/conf/onboarding*.json
integration_test_output*.csv
databricks.yml
oboarding_job_details.json

3 changes: 3 additions & 0 deletions .gitmodules
@@ -4,3 +4,6 @@
[submodule "docs/themes/hugo-theme-learn"]
path = docs/themes/hugo-theme-learn
url = https://github.com/matcornic/hugo-theme-learn.git
[submodule "docs/themes/hugo-theme-relearn"]
path = docs/themes/hugo-theme-relearn
url = https://github.com/McShelby/hugo-theme-relearn.git
10 changes: 7 additions & 3 deletions CHANGELOG.md
@@ -1,18 +1,22 @@
# Changelog
## [v.0.0.8]
- Added dlt append_flow api support: [PR](https://github.com/databrickslabs/dlt-meta/pull/58)
## [v.0.0.9]
- Added apply_changes_from_snapshot api support in bronze layer: [PR](https://github.com/databrickslabs/dlt-meta/pull/124)
- Added dlt append_flow api support for silver layer: [PR](https://github.com/databrickslabs/dlt-meta/pull/63)
- Added support for file metadata columns for autoloader: [PR](https://github.com/databrickslabs/dlt-meta/pull/56)
- Added support for Bring your own custom transformation: [Issue](https://github.com/databrickslabs/dlt-meta/issues/68)
- Added support to Unify PyPI releases with GitHub OIDC: [PR](https://github.com/databrickslabs/dlt-meta/pull/62)
- Added demo for append_flow and file_metadata options: [Issue](https://github.com/databrickslabs/dlt-meta/issues/74)
- Added Demo for silver fanout architecture: [PR](https://github.com/databrickslabs/dlt-meta/pull/83)
- Added documentation in docs site for new features: [PR](https://github.com/databrickslabs/dlt-meta/pull/64)
- Added hugo-theme-relearn theme: [PR](https://github.com/databrickslabs/dlt-meta/pull/132)
- Added unit tests to showcase silver layer fanout examples: [PR](https://github.com/databrickslabs/dlt-meta/pull/67)
- Added liquid cluster support: [PR](https://github.com/databrickslabs/dlt-meta/pull/136)
- Added support for UC Volume + Serverless support for CLI, Integration tests and Demos: [PR](https://github.com/databrickslabs/dlt-meta/pull/105)
- Added Chaining bronze/silver pipelines into single DLT: [PR](https://github.com/databrickslabs/dlt-meta/pull/130)
- Fixed issue for No such file or directory: '/demo': [Issue](https://github.com/databrickslabs/dlt-meta/issues/59)
- Fixed DLT-META CLI onboard command issue for Azure: databricks.sdk.errors.platform.ResourceAlreadyExists: [Issue](https://github.com/databrickslabs/dlt-meta/issues/51)
- Fixed issue: changed dbfs.create to mkdirs for CLI: [PR](https://github.com/databrickslabs/dlt-meta/pull/53)
- Fixed issue: DLT-META CLI should use pypi lib instead of whl: [PR](https://github.com/databrickslabs/dlt-meta/pull/79)
- Fixed issue: onboarding with multiple partition columns errors out: [PR](https://github.com/databrickslabs/dlt-meta/pull/134)

## [v.0.0.7]
- Added dlt-meta cli documentation and readme with browser support: [PR](https://github.com/databrickslabs/dlt-meta/pull/45)
16 changes: 15 additions & 1 deletion README.md
@@ -16,7 +16,7 @@
<img src="https://img.shields.io/badge/DOCS-PASSING-green?style=for-the-badge" alt="Documentation Status"/>
</a>
<a href="https://pypi.org/project/dlt-meta/">
-<img src="https://img.shields.io/badge/PYPI-v%200.0.8-green?style=for-the-badge" alt="Latest Python Release"/>
+<img src="https://img.shields.io/badge/PYPI-v%200.0.9-green?style=for-the-badge" alt="Latest Python Release"/>
</a>
<a href="https://github.com/databrickslabs/dlt-meta/actions/workflows/onpush.yml">
<img src="https://img.shields.io/github/workflow/status/databrickslabs/dlt-meta/build/main?style=for-the-badge"
@@ -68,6 +68,20 @@ In practice, a single generic DLT pipeline reads the Dataflowspec and uses it to

![DLT-META Stages](./docs/static/images/dlt-meta_stages.png)

## DLT-META DLT Features support
| Features | DLT-META Support |
| ------------- | ------------- |
| Input data sources | Autoloader, Delta, Eventhub, Kafka |
| Medallion architecture layers | Bronze, Silver |
| Custom transformations | Bronze and Silver layers accept custom functions |
| Data Quality Expectations support | Bronze, Silver layer |
| Quarantine table support | Bronze layer |
| [apply_changes](https://docs.databricks.com/en/delta-live-tables/python-ref.html#cdc) API support | Bronze, Silver layer |
| [apply_changes_from_snapshot](https://docs.databricks.com/en/delta-live-tables/python-ref.html#change-data-capture-from-database-snapshots-with-python-in-delta-live-tables) API support | Bronze layer|
| Liquid cluster support | Bronze, Bronze Quarantine, Silver tables|
| [DLT-META CLI](https://databrickslabs.github.io/dlt-meta/getting_started/dltmeta_cli/) | ```databricks labs dlt-meta onboard```, ```databricks labs dlt-meta deploy``` |
| Bronze and Silver pipeline chaining | Deploy dlt-meta pipeline with ```layer=bronze_silver``` option using Direct publishing mode |

## Getting Started

Refer to the [Getting Started](https://databrickslabs.github.io/dlt-meta/getting_started)