Fix empty batch #106
Conversation
…t-frame load failure causing empty batch (temp patch; needs to be integrated with sampler)
Walkthrough

The pull request introduces logging functionality to the `no_batching_fn` method.
Actionable comments posted: 0
🧹 Nitpick comments (1)
dreem/datasets/base_dataset.py (1)
Lines 117-119: Enhance warning message with more context.

The warning message could be more informative by including the frame index and video details.
```diff
-    print(
-        "Warning: Single frame batch; removing to avoid empty batch possibility with failed frame loading"
-    )
+    print(
+        f"Warning: Removing single frame batch (frame {frame_chunk[0]}) "
+        f"to avoid empty batch possibility with failed frame loading"
+    )
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)

- dreem/datasets/base_dataset.py (2 hunks)
- dreem/inference/post_processing.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: Tests (windows-2022, Python 3.9)
- GitHub Check: Tests (ubuntu-22.04, Python 3.9)
- GitHub Check: Lint
🔇 Additional comments (3)
dreem/datasets/base_dataset.py (2)
Line 147: LGTM! The added blank line improves code readability.

Lines 113-124: Verify data integrity after chunk removal.

While this workaround prevents empty batch issues, removing chunks could potentially impact the training process by:
- Reducing the total amount of training data
- Creating gaps in sequential frame processing
- Potentially affecting model performance on edge cases
Let's analyze the potential impact:
✅ Verification successful
The chunk removal implementation is safe and well-controlled
The code specifically targets edge cases where:
- Only single-frame chunks are removed
- Only affects frames at clip boundaries
- Maintains dataset integrity by synchronizing frame and label removal
- Provides warning visibility for monitoring
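One way to make the sync property above explicit is a small post-removal sanity check. This is only a sketch: `check_chunks_in_sync` is a hypothetical helper, and `chunked_frame_idx`, `label_idx`, and `clip_len` stand in for the dataset's actual attributes.

```python
def check_chunks_in_sync(chunked_frame_idx, label_idx, clip_len):
    """Post-removal sanity checks (hypothetical helper, not part of the PR)."""
    # Frame chunks and label entries must be removed together.
    assert len(chunked_frame_idx) == len(label_idx)
    # Single-frame boundary chunks should no longer be present.
    assert all(len(chunk) > 1 for chunk in chunked_frame_idx)
    # No chunk should exceed the configured clip length.
    assert all(len(chunk) <= clip_len for chunk in chunked_frame_idx)


# Example: a trailing single-frame chunk has already been removed.
check_chunks_in_sync([[0, 1, 2, 3], [4, 5, 6, 7]], [0, 0], clip_len=4)
```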
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```bash
#!/bin/bash
# Description: Analyze potential data loss from chunk removal
# Look for patterns that might indicate frequent chunk removals

# Search for frame processing patterns
rg -A 5 "clip_length|chunked_frame_idx"

# Look for related test cases
fd -e py -x grep -l "test.*chunk" {}
```

Length of output: 40834
Script:
```bash
#!/bin/bash
# Search for test cases covering chunk removal and related error handling
ast-grep --pattern 'def test_$_($$$):
  $$$
  remove_idx
  $$$
  $$$'

# Look for test files containing assertions about chunk sizes
rg -B 3 -A 3 "assert.*chunk" tests/

# Check for any logging or warning messages related to chunk removal
rg -B 2 "Warning.*batch"
```

Length of output: 1051
dreem/inference/post_processing.py (1)
Lines 163-164: LGTM! The formatting change improves code readability without affecting functionality.
Codecov Report

Attention: Patch coverage is …

Additional details and impacted files

```
@@            Coverage Diff             @@
##             main     #106      +/-   ##
==========================================
- Coverage   73.74%   73.42%   -0.32%     
==========================================
  Files          36       36              
  Lines        2967     3165     +198     
==========================================
+ Hits         2188     2324     +136     
- Misses        779      841      +62     
```

☔ View full report in Codecov by Sentry.
…read in. The largest batch that can be removed must be at most 10% of clip_len. Bug fix where the loop broke early when collecting idxs to remove; remove in reverse order to prevent list indices from shifting.
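A minimal sketch of the removal logic this commit message describes: collect every removable index without breaking out of the loop, cap removable chunks at 10% of `clip_len`, and delete in reverse order. Variable names other than `chunked_frame_idx` and `clip_len` are hypothetical, and the example data is illustrative only.

```python
clip_len = 32
max_removable = int(0.1 * clip_len)  # chunks this short (or shorter) may be dropped

# Example: two short chunks at clip boundaries that failed to read in fully.
chunked_frame_idx = [list(range(0, 32)), [32, 33], list(range(34, 66)), [66]]
label_idx = [0, 0, 1, 1]

# Collect every removable index; note the loop never breaks early.
remove_idx = [
    i for i, chunk in enumerate(chunked_frame_idx) if len(chunk) <= max_removable
]

# Delete in reverse order so earlier indices are not shifted by the deletions.
for i in reversed(remove_idx):
    print(f"Warning: removing short chunk of {len(chunked_frame_idx[i])} frame(s)")
    del chunked_frame_idx[i]
    del label_idx[i]

assert len(chunked_frame_idx) == len(label_idx) == 2
```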
Summary by CodeRabbit

- Bug Fixes
- Style

Changes involve the `no_batching_fn` method.