-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix missing instance frame #108
Conversation
WalkthroughThe pull request improves the robustness of the Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant SleapDataset
participant Logger
Client->>SleapDataset: get_instances(label_idx, frame_idx)
Note over SleapDataset: Determine frame range from last label frame index
SleapDataset->>SleapDataset: Check if frame index is valid
alt Valid Frame
SleapDataset->>SleapDataset: Adjust frame index (subtract skipped_frame_ct)
SleapDataset->>Client: Return frame instances
else Missing Frame
SleapDataset->>Logger: Log warning about missing frame
SleapDataset->>Client: Return empty or adjusted instance list
end
Possibly related PRs
Poem
Tip 🌐 Web search-backed reviews and chat
✨ Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
dreem/datasets/sleap_dataset.py (2)
146-152
: LGTM! Consider adding docstring forskipped_frame_ct
.The change to use
labels[-1].frame_idx + 1
instead oflen(labels)
is a robust fix for handling missing frames. The addition ofskipped_frame_ct
helps track missing frames across batches.Add a docstring comment for the
skipped_frame_ct
attribute:self.frame_idx = [ torch.arange(labels[-1].frame_idx + 1) for labels in self.labels ] +# Counter for tracking the number of frames skipped due to missing instances +# for each label file self.skipped_frame_ct = [0 for labels in self.labels]
209-217
: Improve error handling and documentation.The frame index adjustment logic is correct, but the error handling and documentation could be enhanced.
Consider these improvements:
- # if slp is missing instances in some frames, frame_ind will be smaller than lf.frame_idx + # Adjust frame index by subtracting skipped frames to maintain correct indexing lf = video[frame_ind - self.skipped_frame_ct[label_idx]] if frame_ind < lf.frame_idx: logger.warning( - f"Frame index {frame_ind} is trying to access frame {lf.frame_idx} of the slp file {video_name}. " - f"This likely means there are no labelled instances in this frame. Skipping frame." + f"Frame index mismatch in {video_name}: " + f"Expected frame {frame_ind} but found frame {lf.frame_idx}. " + f"No labeled instances found. Skipping frame and incrementing skip counter." ) + # Increment skip counter to maintain correct indexing in subsequent frames self.skipped_frame_ct[label_idx] += 1 continue
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
dreem/datasets/sleap_dataset.py
(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: Tests (windows-2022, Python 3.9)
- GitHub Check: Tests (ubuntu-22.04, Python 3.9)
- GitHub Check: Lint
🔇 Additional comments (1)
dreem/datasets/sleap_dataset.py (1)
209-211
: Verify edge case: First frame missing.Please verify the behavior when the first frame is missing. The current implementation assumes
frame_ind < lf.frame_idx
catches missing frames, but we should ensure it works correctly whenframe_ind = 0
.Run this test to verify the edge case:
✅ Verification successful
I'll generate a new shell script to display the relevant code snippet from
dreem/datasets/sleap_dataset.py
(lines 200-220) so that we can verify the edge case handling whenframe_ind
is 0.
Edge Case Handling for Missing First Frame Verified
The code snippet confirms that when
frame_ind
is 0 and the correspondinglf.frame_idx
is greater than 0 (as expected when the first frame is missing), the conditionif frame_ind < lf.frame_idx:
will be met. In this case, the warning is logged, the skip counter is incremented, and processing continues appropriately.
- The condition correctly catches a missing first frame when
frame_ind == 0
.- The logic ensures proper warning and skip behavior when the frame is missing.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check for instances where the first frame (index 0) is missing # Look for frame_idx values in .slp files to verify handling of missing first frame ast-grep --pattern 'frame_idx = 0' | grep -B 2 -A 2Length of output: 136
Script:
#!/bin/bash # Display code snippet for dreem/datasets/sleap_dataset.py in the range covering the edge case (lines 200-220) sed -n '200,220p' dreem/datasets/sleap_dataset.pyLength of output: 918
This fix is known to cause bugs in certain situations e.g. if the frame index of the first frame in a slp file is not 0. The underlying issue has been resolved by #109. The code related to this patch has also been removed as it is redundant |
We are currently not checking for missing frames in the input .slp file to ensure alignment between expected number of frames from the video, and LabeledFrame count. We assume that frame_ind from chunked frames matches lf.frame_idx but it does not. The issue is that we index the labels based on enumerating the length of the slp Labels() object. We need to ensure the frame_ind in SleapDataset matches lf.frame_idx
This line highlights the source of the issue (len(labels)): self.frame_idx = [torch.arange(len(labels)) for labels in self.labels]
Fix: First, enumerate frame_idx using the lf.frame_idx value of the last frame in the slp file. This ensures the indexing stays correct till the end of the video. Then, since there will now be missing frame ids in slp, check that the chunk’s frame index is equal to lf.frame_idx, and if not, skip the frame. But, we need to keep track of how many frames we’ve skipped, otherwise even if frame_ind = lf.frame_idx, trying to index labels[frame_ind] will overshoot the correct frame! Also, since frames can be skipped across batches, the skip counter must be part of the SleapDataset state so that it persists throughout the entire video! (Rather than being reset every batch)
Summary by CodeRabbit