Questions about speed? #7

Thanks for your interesting work!
However, I found that running 3d_prompt_proposal.py and main.py is very slow even for a single scene on ScanNet (with ViT-Base, it takes more than an hour on one V100). Is this normal?
Thanks for your interest in our work. Generally, running 3d_prompt_proposal.py on a single ScanNet scene takes 10~15 minutes. So the bottleneck is likely your hardware, and not just the graphics card: it may mainly be the CPU or, especially, the disk. Since our code repeatedly reads many small files (the RGB frames) from disk, I recommend storing the RGB frames and other data on an SSD and making sure your CPU is not the limiting factor. You can also try using fewer frames by increasing the frame gap, though this may degrade the segmentation performance. Hope this is helpful!
Thanks! I’ve stored the color and depth files as your code specifies. The key factor may be the number of frames per ScanNet scene, which ranges from around 1,000 to more than 5,000. I can process around 1,000 frames in roughly 15 minutes, but it takes about an hour to process 5,000 frames. Given the significant overlap between image frames, adjusting the frame gap based on the total number of frames might help. Unfortunately, there is no evaluation code for computing mIoU, so the loss in segmentation quality cannot be measured directly. Do you have any insights on how to select the optimal number of frames?
Hi, skipping 5 frames should still produce a fair result; skipping 10 or 20 would somewhat degrade the performance. As for how to evaluate the result, currently I can only suggest inspecting the result visualizations yourself. In my experience, skipping 5 frames also produces a visually good result.
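For reference, here is a minimal sketch (not code from the repository) of how one might subsample a scene's RGB frames with a frame gap before running the prompt-proposal step. The `color/<frame_id>.jpg` layout, the `select_frames` helper, and the example paths are assumptions for illustration.

```python
# Minimal sketch (assumed layout, not the repository's own code): pick every
# N-th RGB frame of a ScanNet scene so fewer small files are read from disk.
import os

def select_frames(color_dir, frame_gap=5):
    """Return every `frame_gap`-th frame path, sorted by frame index."""
    frames = sorted(
        (f for f in os.listdir(color_dir) if f.endswith(".jpg")),
        key=lambda name: int(os.path.splitext(name)[0]),
    )
    # A larger gap means fewer frames to read (faster), at the cost of
    # some segmentation quality, as discussed above.
    return [os.path.join(color_dir, f) for f in frames[::frame_gap]]

if __name__ == "__main__":
    selected = select_frames("scans/scene0000_00/color", frame_gap=5)
    print(f"Selected {len(selected)} frames")
```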
Thanks for your reply! As for visualization, I’m unable to visualize every scene and review it in detail, which makes it challenging to identify the key factors. Additionally, I am working on leveraging CLIP and SAM to obtain standard semantic segmentation results. Your aggregation method has truly been an inspiration. Thanks again!