forked from nod-ai/shark-ai
[shortfin-sd] Reusable service command buffers (nod-ai#963)
This PR non-trivially restructures the SDXL inference process to interface with reusable command buffers.

At service startup, we attach to each fiber a set of reusable command buffers (a set of empty device arrays to be reused throughout the inference process). Each exec request is assigned a fiber (now the "meta_fiber", equipped with command buffers); we then pop an appropriate command buffer from the meta_fiber, mount it on the request, and return it to the meta_fiber for reuse.

Each meta_fiber is equipped with an explicit number of command buffers. If the number of InferenceExecProcesses assigned to the fiber exceeds this count, a new command buffer is initialized ad hoc. This _shouldn't_ happen if the service is managed properly, as we shouldn't oversubscribe fibers too heavily (thundering herd).

These changes also simplify the request structure throughout the inference processor, which now operates on a _single_ request rather than a bundle. This greatly simplifies the work required to enable larger batch sizes and reduces how much program I/O we need to handle during execution.

We also remove all `await device` usage where it wasn't required. It should only be used after VAE decode, where we are done with the GPU and need to use the result contents of a device->host transfer.

fyi @daveliddell

---------

Co-authored-by: Ean Garvey <ean.garvey@amd.com>
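For readers unfamiliar with the pattern, the pop/mount/return cycle described above can be sketched roughly as follows. This is a minimal, illustrative sketch in plain Python and not the shortfin implementation: `CommandBuffer`, `MetaFiber`, `pop_command_buffer`, `return_command_buffer`, and `run_request` are hypothetical names, the request is modeled as a plain dict, and keeping ad hoc buffers in the pool after use is an assumption the commit message does not confirm.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class CommandBuffer:
    """Hypothetical stand-in for a set of preallocated, reusable device arrays."""
    index: int
    adhoc: bool = False  # True if created on demand because the pool was empty


class MetaFiber:
    """Hypothetical fiber wrapper that owns a fixed-size pool of command buffers."""

    def __init__(self, num_buffers: int):
        # Allocate the pool once, at (simulated) service startup.
        self._free = deque(CommandBuffer(i) for i in range(num_buffers))
        self._created = num_buffers

    @property
    def available(self) -> int:
        return len(self._free)

    def pop_command_buffer(self) -> CommandBuffer:
        if self._free:
            return self._free.popleft()
        # Pool exhausted: the fiber is oversubscribed, so create one ad hoc.
        cb = CommandBuffer(self._created, adhoc=True)
        self._created += 1
        return cb

    def return_command_buffer(self, cb: CommandBuffer) -> None:
        # Assumption: ad hoc buffers are kept for reuse rather than discarded.
        self._free.append(cb)


def run_request(meta_fiber: MetaFiber, request: dict) -> int:
    """Mount a command buffer on a single request, 'run' it, then return the buffer."""
    cb = meta_fiber.pop_command_buffer()
    try:
        request["command_buffer"] = cb
        # Real inference work would go here; we just report which buffer was used.
        return cb.index
    finally:
        request["command_buffer"] = None
        meta_fiber.return_command_buffer(cb)
```

In this sketch, a fiber created with `MetaFiber(2)` serves two concurrent requests from its pool and only allocates a third buffer if a third request arrives before either of the first two has returned its buffer.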
Showing 5 changed files with 908 additions and 378 deletions.