Parallel Transform Propagation #17840
Getting this up now that it seems to function; marking as draft because it still needs cleanup.
FYI, I'm merging #17815. The methods there are probably worth benchmarking here.
I'd love a before/after Tracy histogram, BTW. I've seen enough tests on Discord to be convinced that this is markedly (3x or so) faster, but it would be lovely to show in the release notes.
Nice work. Overall this makes sense to me, but I'm going to give it another pass tomorrow and try to understand the logic a little better.
Looks good, just had a few questions. Overall thanks a ton for doing this work, it really helps!
Alright, I am finally going to be able to give this a look. Right out of the gate I find the increase in run variance concerning, but I'm in favor of merging and leaving that as a follow-up if we can't easily identify the cause.
If you are referring to my comment, that was about Bevy's existing system, which swings wildly by ±1 ms. Variance (timing distribution) within a run is consistent for both. Variance between runs in this PR seems better than on main. My guess is that this is caused by E-cores vs. P-cores: the old implementation usually ended up spending most of its time on a single core, whereas the new one is spread across all available cores in the thread pool. I think this explains why inter-run variance has improved, since the work is spread across a mix of P- and E-cores.
The CI failure was not caused by this PR; according to Discord, it seems to be some sort of Mesa failure.
Objective
Solution
Testing