-
Notifications
You must be signed in to change notification settings - Fork 535
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: optimize ratelimit logs #2875
Conversation
|
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
📝 WalkthroughWalkthroughThis pull request introduces SQL migrations to add reversible index management on two ClickHouse tables. Two new indexes— Changes
Sequence Diagram(s)sequenceDiagram
participant C as Client
participant S as Ratelimit Service
participant DB as Database
C->>S: Request ratelimit logs with filters
S->>S: Build filter conditions and construct CTE (filtered_ratelimits)
S->>DB: Execute SQL query with LEFT JOIN on metrics data
DB-->>S: Return filtered results
S-->>C: Respond with ratelimit logs
Suggested reviewers
✨ Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Thank you for following the naming conventions for pull request titles! 🙏 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
internal/clickhouse/src/ratelimits.ts (1)
309-326
: LGTM! Well-structured CTE with optimized filtering.The CTE effectively organizes the filtering logic and should work well with the new indexes mentioned in the PR summary (
idx_workspace_time
andidx_request_id
).Consider adding a comment explaining the cursor-based pagination logic for better maintainability:
+ -- Cursor-based pagination using time and request_id as composite cursor AND (({cursorTime: Nullable(UInt64)} IS NULL AND {cursorRequestId: Nullable(String)} IS NULL) OR (time, request_id) < ({cursorTime: Nullable(UInt64)}, {cursorRequestId: Nullable(String)}))
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
internal/clickhouse/schema/048_raw_ratelimits_metrics_indexes_v1.sql
(1 hunks)internal/clickhouse/schema/049_raw_api_metrics_ratelimit_indexes_v1.sql
(1 hunks)internal/clickhouse/src/ratelimits.ts
(8 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (15)
- GitHub Check: Test Packages / Test ./packages/hono
- GitHub Check: Test Packages / Test ./packages/cache
- GitHub Check: Test Packages / Test ./packages/api
- GitHub Check: Test Packages / Test ./internal/clickhouse
- GitHub Check: Test Packages / Test ./internal/resend
- GitHub Check: Test Packages / Test ./internal/keys
- GitHub Check: Test Packages / Test ./internal/id
- GitHub Check: Test Packages / Test ./internal/hash
- GitHub Check: Test Packages / Test ./internal/encryption
- GitHub Check: Test Packages / Test ./internal/billing
- GitHub Check: Build / Build
- GitHub Check: Test API / API Test Local
- GitHub Check: Test Agent Local / test_agent_local
- GitHub Check: autofix
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (10)
internal/clickhouse/schema/049_raw_api_metrics_ratelimit_indexes_v1.sql (3)
1-4
: Goose Up Migration for API Metrics Table Indexes
The migration commands are correctly encapsulated with the goose marker for upgrades. The composite indexidx_workspace_time
on(workspace_id, time)
using theminmax
type with a granularity of 1 appears correctly defined. Please verify that your query patterns onmetrics.raw_api_requests_v1
actually benefit from this composite index.
5-7
: Additional Index for API Metrics Table
The addition of the single-column indexidx_request_id
onrequest_id
is defined appropriately. Reusing similar index types across tables helps maintain consistency, but ensure that this index aligns with the workload and query filtering requirements.
8-13
: Rollback Commands for API Metrics Table Indexes
The "goose down" section provides clear and reversible commands that drop the newly added indexes. It is important to double-check these commands in a staging environment to ensure that dropping these indexes does not unexpectedly degrade query performance.internal/clickhouse/schema/048_raw_ratelimits_metrics_indexes_v1.sql (3)
1-4
: Goose Up Migration for Ratelimits Table Indexes
This up migration correctly adds the composite indexidx_workspace_time
on(workspace_id, time)
for theratelimits.raw_ratelimits_v1
table. It follows a similar pattern to the metrics table, which promotes consistency across the codebase. Confirm that this index optimizes the queries related to ratelimit logs' performance.
5-7
: Additional Index for Ratelimits Table
The single-column indexidx_request_id
on therequest_id
column is properly defined with the minmax strategy and a granularity of 1. Having parallel indexing strategies across different tables can simplify maintenance. Verify that this change provides tangible query performance improvements.
8-13
: Rollback Migration for Ratelimits Table Indexes
The rollback commands for dropping the indexes seem straightforward and complete. It is a good practice to validate the rollback process in a test environment to ensure that the removal of indexes does not interrupt any critical query paths.internal/clickhouse/src/ratelimits.ts (4)
42-47
: LGTM! Type definitions are well-formatted.The type definitions have been reformatted for better readability while maintaining the same functionality.
270-284
: LGTM! Status condition construction is simplified.The status condition construction is now more readable and maintainable with clear null checks and boolean conversions.
285-302
: LGTM! Identifier conditions are well-structured.The identifier conditions are now more readable with clear null checks and proper operator handling.
346-363
: Verify metrics table join performance.The LEFT JOIN with the metrics table looks good, but consider these points:
- The join is only on
request_id
without additional time-based conditions- The subquery filters on
workspace_id
andtime
which should help performancePlease verify that:
- The join performance is acceptable with large datasets
- No duplicate records are produced when multiple metrics exist for the same request_id
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
internal/clickhouse/src/ratelimits.ts (3)
257-270
: Consider simplifying the status condition construction.The status condition construction can be simplified by using array methods more effectively.
- const statusCondition = !hasStatusFilters - ? "TRUE" - : args.status - ?.map((filter, index) => { - if (filter.operator === "is") { - const paramName = `statusValue_${index}`; - paramSchemaExtension[paramName] = z.boolean(); - parameters[paramName] = filter.value === "passed"; - return `passed = {${paramName}: Boolean}`; - } - return null; - }) - .filter(Boolean) - .join(" OR ") || "TRUE"; + const statusCondition = !hasStatusFilters + ? "TRUE" + : args.status + ?.filter(filter => filter.operator === "is") + .map((filter, index) => { + const paramName = `statusValue_${index}`; + paramSchemaExtension[paramName] = z.boolean(); + parameters[paramName] = filter.value === "passed"; + return `passed = {${paramName}: Boolean}`; + }) + .join(" OR ") || "TRUE";
272-289
: Consider using LIKE for better performance in identifier filtering.The
position
function might not be as performant asLIKE
for string pattern matching in ClickHouse.switch (p.operator) { case "is": return `identifier = {${paramName}: String}`; case "contains": - return `position({${paramName}: String}, identifier) > 0`; + return `identifier LIKE concat('%', {${paramName}: String}, '%')`; default: return null; }
332-349
: Consider optimizing the metrics subquery.The metrics subquery could benefit from the new indexes. Also, consider adding workspace_id to the JOIN condition for additional safety.
LEFT JOIN ( SELECT request_id, + workspace_id, host, method, path, request_headers, request_body, response_status, response_headers, response_body, service_latency, user_agent, colo FROM metrics.raw_api_requests_v1 WHERE workspace_id = {workspaceId: String} AND time BETWEEN {startTime: UInt64} AND {endTime: UInt64} - ) m ON fr.request_id = m.request_id + ) m ON fr.request_id = m.request_id AND fr.workspace_id = m.workspace_id
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
internal/clickhouse/src/ratelimits.ts
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (16)
- GitHub Check: Test Packages / Test ./packages/rbac
- GitHub Check: Test Packages / Test ./packages/nextjs
- GitHub Check: Test Packages / Test ./packages/cache
- GitHub Check: Test Packages / Test ./packages/api
- GitHub Check: Test Packages / Test ./internal/clickhouse
- GitHub Check: Test Packages / Test ./internal/resend
- GitHub Check: Test Packages / Test ./internal/keys
- GitHub Check: Test Packages / Test ./internal/id
- GitHub Check: Test Packages / Test ./internal/hash
- GitHub Check: Test Packages / Test ./internal/encryption
- GitHub Check: Test Packages / Test ./internal/billing
- GitHub Check: Test API / API Test Local
- GitHub Check: Test Agent Local / test_agent_local
- GitHub Check: Build / Build
- GitHub Check: autofix
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (3)
internal/clickhouse/src/ratelimits.ts (3)
253-255
: LGTM! Clear and descriptive boolean flags.The introduction of boolean flags improves code readability by making the filter presence checks explicit.
295-312
: LGTM! Well-structured CTE for improved readability.The introduction of the CTE improves query organization and maintainability. The filtering conditions are properly aligned with the new indexes on
workspace_id
andtime
.
350-351
: Verify index usage for ORDER BY clause.The ORDER BY clause on
time
andrequest_id
should leverage the new indexes.
What does this PR do?
Fixes # (issue)
If there is not an issue for this, please create one first. This is used to tracking purposes and also helps use understand why this PR exists
Type of change
How should this be tested?
Checklist
Required
pnpm build
pnpm fmt
console.logs
git pull origin main
Appreciated
Summary by CodeRabbit
New Features
Refactor