[shortfin] Implement async alloc/dealloc of buffers. (#507)
* Device allocations are now asynchronous: alloc/dealloc are queue ordered.
* Program invocations asynchronously deallocate function call results when they can. When a result cannot be deallocated asynchronously, a small Tracy zone `SyncImportTimelineResource` is emitted for that result.
* Adds the `ProgramInvocation.assume_no_alias` instance boolean to disable the assumption that allows async deallocation to work.
* Adds the global `ProgramInvocation.global_no_alias` property to control this process-wide.

This is a very fiddly optimization which requires (especially in multi-device cases) a number of things to line up. Tested on amdgpu and CPU with a number of sample workloads (with logging enabled and visually confirmed).

See #980 for detailed analysis and further work required.
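To make the relationship between the two flags concrete, here is a minimal toy model in Python. It is not shortfin's API: `ToyInvocation` and `release_result` are hypothetical names, and the idea that the instance flag starts from the process-wide value is an assumption, not something the commit message states.

```python
class ToyInvocation:
    """Toy stand-in for illustration only; NOT shortfin's ProgramInvocation."""

    # Hypothetical process-wide switch, mirroring the idea of `global_no_alias`.
    global_no_alias = True

    def __init__(self):
        # Assumption: each invocation starts from the process-wide value
        # and can then be overridden per instance.
        self.assume_no_alias = ToyInvocation.global_no_alias

    def release_result(self, name):
        """Return which deallocation path a result would take in this toy model."""
        if self.assume_no_alias:
            # Results are assumed not to alias other buffers, so the
            # deallocation can be queue ordered (asynchronous).
            return "async"
        # Without the assumption, fall back to a synchronous path.
        return "sync"


inv = ToyInvocation()
print(inv.release_result("result0"))
inv.assume_no_alias = False  # opt out for this one invocation
print(inv.release_result("result0"))
```

In this sketch the per-instance flag is the fine-grained control and the class-level value is the coarse one, which is one plausible reading of "instance boolean" versus "process-wide" in the message above.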
1 parent 888a98a, commit b299af3
Showing 14 changed files with 569 additions and 134 deletions.