feat(core): Implement null-safe equality operator #3663

Merged: f4t4nt merged 12 commits into main from nishant-null-safe-equality on Jan 15, 2025

f4t4nt commented Jan 10, 2025

Adds support for null-safe equality comparisons (<=>) in Daft, similar to MySQL's implementation. This operator treats NULL values specially:

  • NULL <=> NULL returns TRUE
  • NULL <=> non-NULL returns FALSE
  • non-NULL <=> non-NULL behaves like regular equals

Includes comprehensive test coverage for both expression-based and SQL-based usage.
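
The three cases listed in the description can be sketched in plain Python, using `None` to stand in for NULL. This is only an illustration of the semantics, not Daft's implementation; the function names `eq` and `eq_null_safe` are hypothetical and chosen for this example.

```python
from typing import Any, Optional


def eq(lhs: Any, rhs: Any) -> Optional[bool]:
    """Regular SQL-style equality: a NULL operand yields NULL (None)."""
    if lhs is None or rhs is None:
        return None
    return lhs == rhs


def eq_null_safe(lhs: Any, rhs: Any) -> bool:
    """Null-safe equality (<=>): always returns a boolean, never NULL."""
    if lhs is None or rhs is None:
        # TRUE only when both sides are NULL, FALSE when exactly one is.
        return lhs is None and rhs is None
    # Both sides are non-NULL: behave like regular equality.
    return lhs == rhs


if __name__ == "__main__":
    for a, b in [(None, None), (None, 1), (1, None), (1, 1), (1, 2)]:
        print(f"{a!r} = {b!r}: {eq(a, b)!r}   {a!r} <=> {b!r}: {eq_null_safe(a, b)!r}")
```

The difference from regular equality is confined to the rows where at least one operand is NULL: `eq` propagates NULL there, while `eq_null_safe` always produces a definite TRUE or FALSE.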

@github-actions github-actions bot added the feat label Jan 10, 2025
@f4t4nt f4t4nt requested a review from kevinzwang January 10, 2025 23:59

codspeed-hq bot commented Jan 11, 2025

CodSpeed Performance Report

Merging #3663 will degrade performance by 40.33%

Comparing nishant-null-safe-equality (631412c) with main (feab49a)

Summary

❌ 1 regressions
✅ 26 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

| Benchmark | main | nishant-null-safe-equality | Change |
| --- | --- | --- | --- |
| test_iter_rows_first_row[100 Small Files] | 148.5 ms | 248.9 ms | -40.33% |
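
The reported change can look inconsistent with the timings, since 248.9 ms is well over 40% more than 148.5 ms. The figure is consistent with a change computed as `main / branch - 1`; that formula is an assumption inferred from the numbers in this report, not something the report states. A quick check, with variable names chosen only for this example:

```python
# Timings copied from the benchmark breakdown in this report.
main_ms = 148.5
branch_ms = 248.9

# Assumed formula: relative change measured against the new (branch) time.
change = main_ms / branch_ms - 1

# The same regression expressed as extra wall time relative to main.
slowdown = branch_ms / main_ms - 1

print(f"change (main / branch - 1): {change:.2%}")
print(f"extra time relative to main: {slowdown:.2%}")
```

Under that assumption the result agrees with the reported -40.33% to within the rounding of the displayed timings, and it corresponds to the benchmark taking roughly two-thirds longer than on main.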


codecov bot commented Jan 11, 2025

Codecov Report

Attention: Patch coverage is 64.10256% with 154 lines in your changes missing coverage. Please review.

Project coverage is 77.76%. Comparing base (c932ec9) to head (631412c).
Report is 8 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/daft-core/src/array/ops/comparison.rs | 64.32% | 147 Missing ⚠️ |
| src/daft-dsl/src/expr/mod.rs | 20.00% | 4 Missing ⚠️ |
| src/daft-stats/src/column_stats/comparison.rs | 0.00% | 3 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3663      +/-   ##
==========================================
- Coverage   78.06%   77.76%   -0.31%     
==========================================
  Files         728      728              
  Lines       89967    90346     +379     
==========================================
+ Hits        70236    70259      +23     
- Misses      19731    20087     +356     

| Files with missing lines | Coverage Δ |
| --- | --- |
| daft/expressions/expressions.py | 93.54% <100.00%> (+0.02%) ⬆️ |
| ...onnect/src/translation/expr/unresolved_function.rs | 74.46% <100.00%> (+0.27%) ⬆️ |
| src/daft-core/src/series/ops/comparison.rs | 95.00% <ø> (ø) |
| src/daft-dsl/src/python.rs | 91.16% <100.00%> (+0.06%) ⬆️ |
| src/daft-functions/src/list/list_fill.rs | 96.40% <ø> (-0.06%) ⬇️ |
| src/daft-sql/src/planner.rs | 75.18% <100.00%> (+1.29%) ⬆️ |
| src/daft-table/src/lib.rs | 83.74% <100.00%> (+0.02%) ⬆️ |
| src/daft-stats/src/column_stats/comparison.rs | 72.41% <0.00%> (-1.93%) ⬇️ |
| src/daft-dsl/src/expr/mod.rs | 75.97% <20.00%> (-0.35%) ⬇️ |
| src/daft-core/src/array/ops/comparison.rs | 73.87% <64.32%> (-2.72%) ⬇️ |

... and 18 files with indirect coverage changes

@f4t4nt f4t4nt marked this pull request as ready for review January 11, 2025 01:50

f4t4nt commented Jan 11, 2025

Test coverage is low because I could not exercise the FixedSizeBinary path properly (see src/daft-core/src/array/ops/comparison.rs:2117; that print statement is never hit). I also could not test lit <=> col, so the

match (self.len(), rhs.len()) {
...
(1, r_size) => {

clauses are untested as well. The same gap exists for regular equality, so I have not seriously tried to fix it yet. All current tests pass.
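
For readers unfamiliar with the shape of that match, the following is a rough Python sketch of length-based broadcasting for a null-safe comparison. It assumes the untested `(1, r_size)` arm corresponds to a single left-hand value (such as a literal) being compared against every element of the right-hand column; the helper names and the error message are hypothetical, and none of this is taken from the Rust source.

```python
from typing import Any, List


def _null_safe_eq(a: Any, b: Any) -> bool:
    """Element-wise null-safe equality, with None standing in for NULL."""
    if a is None or b is None:
        return a is None and b is None
    return a == b


def null_safe_eq_columns(lhs: List[Any], rhs: List[Any]) -> List[bool]:
    """Compare two columns, broadcasting a length-1 side across the other."""
    l_size, r_size = len(lhs), len(rhs)
    if l_size == r_size:
        # Same length: compare position by position.
        return [_null_safe_eq(a, b) for a, b in zip(lhs, rhs)]
    if r_size == 1:
        # col <=> lit: broadcast the single right-hand value.
        return [_null_safe_eq(a, rhs[0]) for a in lhs]
    if l_size == 1:
        # lit <=> col: broadcast the single left-hand value.
        return [_null_safe_eq(lhs[0], b) for b in rhs]
    raise ValueError(f"cannot compare columns of length {l_size} and {r_size}")
```

A test for the `lit <=> col` direction would need to reach the third branch, for example `null_safe_eq_columns([None], [1, None, 2])`, where only the middle element compares as equal.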

@kevinzwang kevinzwang requested review from desmondcheongzx and removed request for kevinzwang January 14, 2025 01:40

@desmondcheongzx desmondcheongzx left a comment

Looks mostly good to me! Some comments

src/daft-core/src/array/ops/binary.rs (outdated, resolved)
src/daft-functions/src/list/list_fill.rs (outdated, resolved)
daft/expressions/expressions.py (outdated, resolved)
src/daft-core/src/array/ops/comparison.rs (outdated, resolved)
tests/expressions/test_null_safe_equals.py (outdated, resolved)
tests/expressions/test_null_safe_equals.py (resolved)
src/daft-core/src/array/ops/comparison.rs (outdated, resolved)
src/daft-core/src/array/ops/comparison.rs (outdated, resolved)
src/daft-core/src/array/ops/comparison.rs (outdated, resolved)
src/daft-core/src/array/ops/comparison.rs (outdated, resolved)
Co-authored-by: Desmond Cheong <desmondcheongzx@gmail.com>

@desmondcheongzx desmondcheongzx left a comment

Looks great! Thanks for implementing this!

@f4t4nt f4t4nt merged commit 5702720 into main Jan 15, 2025
41 of 43 checks passed
@f4t4nt f4t4nt deleted the nishant-null-safe-equality branch January 15, 2025 19:17