-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use sleap-io
video backend
#79
base: aadi/batch-inference-eval
Are you sure you want to change the base?
Conversation
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughThe changes involve refactoring how video files and configurations are managed across various modules. Key modifications include transitioning from list-based to dictionary-based video reader management, enhancing error handling for dataset retrieval, and shifting to batch processing for model evaluations. Additionally, input data normalization and improved logging practices were introduced to enhance clarity and robustness in handling datasets and model outputs. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant Config
participant Model
participant Dataset
User->>Config: Request batch configuration
Config->>Model: Load hyperparameters from CSV
Model-->>User: Confirm hyperparameters loaded
User->>Dataset: Request dataset
Dataset-->>User: Return dataset or None
User->>Model: Start evaluation with dataset
Model-->>User: Return evaluation results
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caution
Inline review comments failed to post
Actionable comments posted: 4
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (6)
- dreem/datasets/sleap_dataset.py (4 hunks)
- dreem/inference/eval.py (1 hunks)
- dreem/inference/track.py (2 hunks)
- dreem/io/config.py (6 hunks)
- dreem/models/embedding.py (1 hunks)
- dreem/models/gtr_runner.py (2 hunks)
Additional context used
Ruff
dreem/inference/eval.py
30-30: Use
key in dict
instead ofkey in dict.keys()
Remove
.keys()
(SIM118)
56-56: f-string without any placeholders
Remove extraneous
f
prefix(F541)
dreem/inference/track.py
100-100: Use
key in dict
instead ofkey in dict.keys()
Remove
.keys()
(SIM118)
dreem/datasets/sleap_dataset.py
163-163: Local variable
e
is assigned to but never usedRemove assignment to unused variable
e
(F841)
dreem/io/config.py
200-200: Undefined name
SleapDataset
(F821)
200-200: Undefined name
MicroscopyDataset
(F821)
200-200: Undefined name
CellTrackingDataset
(F821)
261-261: Undefined name
SleapDataset
(F821)
261-261: Undefined name
MicroscopyDataset
(F821)
261-261: Undefined name
CellTrackingDataset
(F821)
Additional comments not posted (7)
dreem/inference/track.py (1)
137-144
: Enhance logging for tracking and results saving.The logging statements provide valuable insights into the tracking process and results saving. Ensuring that these logs are informative and clear is crucial for debugging and user understanding.
The logging statements are well-placed and provide necessary information about the process flow.
dreem/models/gtr_runner.py (1)
Line range hint
304-325
: Simplify video name extraction and improve data storage logic.The changes to simplify the extraction of video names and the handling of ground truth track IDs are well-thought-out. These modifications enhance clarity and maintainability of the code. However, ensure that the new method of extracting
gt_track_id
is thoroughly tested to prevent any data integrity issues.The changes are appropriate and improve the code quality. Ensure thorough testing to confirm the functionality.
dreem/datasets/sleap_dataset.py (2)
109-109
: Refactor: Initializevid_readers
dictionary.The initialization of
self.vid_readers
as an empty dictionary is a good practice as it prepares the class to manage video readers efficiently. This change aligns with the shift from a list-based to a dictionary-based management system, enhancing performance and memory management.
370-370
: Proper Resource Management in Destructor.The update to the destructor method to iterate over
self.vid_readers
and close each reader is crucial for preventing resource leaks. This ensures that all video readers are properly closed before the object is garbage collected, aligning with best practices for resource management.dreem/models/embedding.py (1)
326-326
: Normalization Added to_learned_temp_embedding
.The addition of normalization to the
times
tensor by dividing it by its maximum value is a crucial enhancement. This ensures that the tensor is scaled between 0 and 1, which can prevent issues related to varying scales in the input data. This change is likely to improve the robustness and stability of the model's learning process.dreem/io/config.py (2)
Line range hint
200-257
: Enhanced Robustness inget_dataset
.The update to allow the
get_dataset
method to returnNone
if the dataset is empty is a significant improvement in error handling. This change ensures that the application can gracefully handle cases where no valid data is available, preventing potential errors in downstream processing. The logging of a warning when the dataset is empty is also a good practice, as it informs the user or developer about the issue.Tools
Ruff
261-261: Undefined name
SleapDataset
(F821)
261-261: Undefined name
MicroscopyDataset
(F821)
261-261: Undefined name
CellTrackingDataset
(F821)
Line range hint
261-313
: Improved Error Handling inget_dataloader
.The modifications to the
get_dataloader
method to handle cases where the dataset might beNone
or empty enhance the robustness of data handling. The addition of warnings when the dataset isNone
or empty is crucial for informing the user about potential issues, preventing errors during data loading. This change aligns with best practices in error handling and user feedback.
Comments failed to post (4)
dreem/inference/eval.py (2)
46-57: Ensure consistent logging and information display.
The logging statements are helpful for debugging and understanding the flow of batch processing. However, the use of an f-string without placeholders in line 56 is unnecessary and should be corrected.
Remove the extraneous
f
prefix from the logging statement to clean up the code:- logger.info(f"Using the following tracker:") + logger.info("Using the following tracker:")Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.hparams = {} checkpoint = eval_cfg.cfg.ckpt_path logger.info(f"Testing model saved at {checkpoint}") model = GTRRunner.load_from_checkpoint(checkpoint) model.tracker_cfg = eval_cfg.cfg.tracker model.tracker = Tracker(**model.tracker_cfg) logger.info("Using the following tracker:")
Tools
Ruff
56-56: f-string without any placeholders
Remove extraneous
f
prefix(F541)
29-44: Refactor batch configuration handling and improve error messages.
The changes to handle batch configurations and improve error handling are significant. The use of
cfg.keys()
should be simplified tokey in cfg
as suggested by static analysis. Additionally, the error message can be made more user-friendly by providing a clearer prompt when thePOD_INDEX
is not found.Apply these changes to improve code readability and error handling:
- if "batch_config" in cfg.keys(): + if "batch_config" in cfg: - input(f"{e}. Assuming single run!\nPlease input task index to run:") + input(f"Environment variable POD_INDEX not found. Assuming single run!\nPlease input task index to run:")Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.# update with parameters for batch train job if "batch_config" in cfg: try: index = int(os.environ["POD_INDEX"]) except KeyError as e: index = int( input("Environment variable POD_INDEX not found. Assuming single run!\nPlease input task index to run:") ) hparams_df = pd.read_csv(cfg.batch_config) hparams = hparams_df.iloc[index].to_dict() _ = hparams.pop("Unnamed: 0", None) if eval_cfg.set_hparams(hparams): logger.info("Updated the following hparams to the following values") logger.info(hparams)
Tools
Ruff
30-30: Use
key in dict
instead ofkey in dict.keys()
Remove
.keys()
(SIM118)
dreem/inference/track.py (1)
99-114: Refactor batch configuration handling and improve error messages.
The changes to handle batch configurations and improve error handling are significant. The use of
cfg.keys()
should be simplified tokey in cfg
as suggested by static analysis. Additionally, the error message can be made more user-friendly by providing a clearer prompt when thePOD_INDEX
is not found.Apply these changes to improve code readability and error handling:
- if "batch_config" in cfg.keys(): + if "batch_config" in cfg: - input(f"{e}. Assuming single run!\nPlease input task index to run:") + input(f"Environment variable POD_INDEX not found. Assuming single run!\nPlease input task index to run:")Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.# update with parameters for batch train job if "batch_config" in cfg: try: index = int(os.environ["POD_INDEX"]) except KeyError as e: index = int( input("Environment variable POD_INDEX not found. Assuming single run!\nPlease input task index to run:") ) hparams_df = pd.read_csv(cfg.batch_config) hparams = hparams_df.iloc[index].to_dict() _ = hparams.pop("Unnamed: 0", None) if pred_cfg.set_hparams(hparams): logger.info("Updated the following hparams to the following values") logger.info(hparams)
Tools
Ruff
100-100: Use
key in dict
instead ofkey in dict.keys()
Remove
.keys()
(SIM118)
dreem/datasets/sleap_dataset.py (1)
162-167: Error Handling Improvement in
get_instances
.The addition of a
try-except
block to handleFileNotFoundError
when accessing video frames is a significant improvement. This change ensures that the application can gracefully handle missing files and continue operation by attempting to load the video again. However, the local variablee
is assigned but never used, which is flagged by static analysis tools.Consider removing the assignment to
e
since it's not used, or log the exception for debugging purposes:- except FileNotFoundError as e: + except FileNotFoundError: + logger.error("File not found: {video_name}")Committable suggestion was skipped due to low confidence.
Tools
Ruff
163-163: Local variable
e
is assigned to but never usedRemove assignment to unused variable
e
(F841)
- Add support for specifying multiple train/val directories in a list, for data paths in config files. Currently only single directory is supported
…olab/dreem into aadi/sleap-io-video-backend
Here we switch to using sleap_io as the video backend for animal data
Summary by CodeRabbit
New Features
Bug Fixes
Refactor
Style