
Commit

update figure names
zhypku committed Dec 5, 2024
1 parent c37fa99 commit 18ad118
Showing 3 changed files with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions docs/Prefill-decoding_Disaggregation.md
@@ -5,7 +5,7 @@ Prefill-decoding disaggregation is a technique that computes the prefill and dec
We find Llumnix well-suited for implementing P-D disaggregation, because this technique is inherently a special request scheduling policy and fits well in Llumnix's modeling for request scheduling. Specifically, P-D disaggregation can be decomposed into two rules (shown below): (1) a special dispatching rule, i.e., P-instances-only; and (2) a special migration rule, i.e., migrate to D instances after one step. Llumnix provides an implementation of P-D disaggregation following this principle.

<div align=center>
-<img src="./pdd_1.png" align="center" width=80%/>
+<img src="./pdd_rationale.png" align="center" width=80%/>
</div>

## Benefits
@@ -17,7 +17,7 @@ Implementing P-D disaggregation in Llumnix has the following benefits.
3. **Seamlessly integrates with Llumnix's native scheduling capabilities**. In the P-D disaggregation scheme, we still have scheduling decisions to make: which P instance to dispatch, which D instance to migrate. Llumnix's scheduling policies are readily available for them. Moreover, the migration between D instances is still helpful, e.g., for load balancing. The graph below shows the three scheduling behaviors and how Llumnix combines them.

<div align=center>
-<img src="./pdd_2.png" align="center" width=80%/>
+<img src="./pdd_design.png" align="center" width=80%/>
</div>

## Supported Features
File renamed without changes
File renamed without changes
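
Note on the documented design: the two rules quoted in the diff context (dispatch to P instances only, then migrate to a D instance after one step) can be illustrated with a minimal sketch. This is not Llumnix code. The `Instance` class, the `dispatch` and `migrate_after_prefill` functions, and the least-loaded selection policy are all hypothetical names and assumptions chosen for illustration; Llumnix's actual scheduling policies may differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Instance:
    """Hypothetical serving instance; not a Llumnix class."""
    name: str
    role: str  # "prefill" or "decode"
    requests: List[str] = field(default_factory=list)

    @property
    def load(self) -> int:
        # Assumed load metric: number of requests currently held.
        return len(self.requests)


def least_loaded(instances: List[Instance], role: str) -> Instance:
    """Pick the least-loaded instance of a role (assumed stand-in policy)."""
    candidates = [inst for inst in instances if inst.role == role]
    if not candidates:
        raise ValueError(f"no instance with role {role!r}")
    return min(candidates, key=lambda inst: inst.load)


def dispatch(instances: List[Instance], request_id: str) -> Instance:
    """Rule 1: new requests are dispatched to P instances only."""
    target = least_loaded(instances, "prefill")
    target.requests.append(request_id)
    return target


def migrate_after_prefill(
    instances: List[Instance], source: Instance, request_id: str
) -> Instance:
    """Rule 2: after one step (the prefill), move the request to a D instance."""
    source.requests.remove(request_id)
    target = least_loaded(instances, "decode")
    target.requests.append(request_id)
    return target


if __name__ == "__main__":
    cluster = [
        Instance("p0", "prefill"),
        Instance("d0", "decode"),
        Instance("d1", "decode"),
    ]
    p = dispatch(cluster, "req-1")
    d = migrate_after_prefill(cluster, p, "req-1")
    print(p.name, "->", d.name)
```

Both rules reuse the same selection helper, which mirrors the point made in the Benefits section: choosing a P instance and choosing a D instance remain ordinary scheduling decisions, so an existing policy can serve both.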
