-
Notifications
You must be signed in to change notification settings - Fork 344
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(flow): get time window from datafusion logical plan #5570
base: main
Are you sure you want to change the base?
Conversation
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughThis pull request introduces a new variant in the flow module’s error handling to encapsulate Arrow library errors, updating the corresponding status mapping. It enhances test coverage by adding a new test case for evaluating time window lower bounds. Additionally, a new module named Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant RR as RecordingRules
participant QE as QueryEngine
participant QC as QueryContext
User->>RR: Call sql_to_df_plan(query, optimize)
RR->>QE: Parse SQL query
QE-->>RR: Return LogicalPlan
RR-->>User: Return LogicalPlan
User->>RR: Call find_plan_time_window_lower_bound(plan, current)
RR->>RR: Locate time window expression
RR->>QE: Execute binary search for lower bound
QE-->>RR: Return lower bound timestamp
RR-->>User: Return Optional Timestamp
Possibly related PRs
Suggested reviewers
Poem
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.
Comments suppressed due to low confidence (1)
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
src/flow/src/recording_rules.rs (3)
48-70
: Validate SQL parse failures further.Consider appending more detailed context or user-friendly hints when parsing fails—e.g., logging the exact failing SQL to assist in debugging parse errors in production.
72-198
: Handle multiple time window expressions more gracefully.Right now, the code returns on the first matching group expression referencing the timestamp column. If there's more than one matching expression, only the first is considered. Although this may be the intended behavior today, consider clarifying or adding a fallback strategy for future expansions that might handle multiple time window expressions in a single plan.
308-391
: Single-row evaluation overhead.
eval_ts_to_ts
creates a single-row record batch on each call. If many calls are needed (e.g., wide searches or repeated usage), this can be somewhat inefficient. Caching an in-memory query context or batching multiple timestamps might improve performance.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
src/flow/src/error.rs
(3 hunks)src/flow/src/expr/utils.rs
(1 hunks)src/flow/src/lib.rs
(1 hunks)src/flow/src/recording_rules.rs
(1 hunks)src/flow/src/test_utils.rs
(1 hunks)
🔇 Additional comments (9)
src/flow/src/recording_rules.rs (3)
1-14
: License header check.No issues found in the license headers. Good job including Apache 2.0 references and disclaimers.
200-306
: Exponential + binary search performance consideration.The exponential fallback and subsequent binary search for finding the lower bound is correct for monotonic time expressions. However, for extremely large time offsets, you could exceed typical time ranges or quickly jump to None. This is acceptable if you expect moderate ranges, but if large intervals become common, consider bounding the search or using a more direct approach (e.g., specialized domain knowledge for particular date bin intervals).
445-556
: Solid test coverage.Great use of table-driven tests that exercise different group by expressions and time windows. This approach is easy to extend or adapt for future expansions.
src/flow/src/lib.rs (1)
36-36
: Module import looks good.The new
recording_rules
module integrates seamlessly with the existing crate structure.src/flow/src/test_utils.rs (1)
89-90
: Time index schema update.Marking the "ts" column as a time index aligns with the logic in
find_plan_time_window_lower_bound
, enabling correct detection of time-window-based queries.src/flow/src/error.rs (3)
19-19
: LGTM!The import of
ArrowError
is correctly added to support the new error variant.
160-167
: LGTM!The new
Arrow
variant is well-structured and follows the same pattern as the existingDatafusion
variant, with appropriate fields for error source, context, and location.
243-243
: LGTM!The
Arrow
variant is correctly integrated into thestatus_code
method, mapping toStatusCode::Internal
which is appropriate for this type of error.src/flow/src/expr/utils.rs (1)
241-241
: LGTM!The test case modification maintains the existing test structure and coverage.
727403c
to
4d131f9
Compare
I hereby agree to the terms of the GreptimeDB CLA.
Refer to a related PR or issue link (optional)
What's changed and what's your intention?
as title, get time window from datafusion logical plan, exposing a
find_plan_time_window_lower_bound
for adjust time winodw of certain query, this is part of a larger feature and split for better reviewabilityPR Checklist
Please convert it to a draft if some of the following conditions are not met.
Summary by CodeRabbit
New Features
Tests
Chores