From 4d06b844ad0ca0b6cbcd9b51900dff9ce881e0e0 Mon Sep 17 00:00:00 2001
From: Yuka Ikarashi
Date: Sun, 10 Nov 2024 15:01:50 -0500
Subject: [PATCH] update readmes

---
 examples/README.md       |  3 +-
 examples/quiz1/README.md |  4 +--
 examples/quiz2/README.md | 73 ++++++++++++++++++++++++++++++++++++++--
 3 files changed, 74 insertions(+), 6 deletions(-)

diff --git a/examples/README.md b/examples/README.md
index e8da2502..e8c908c4 100644
--- a/examples/README.md
+++ b/examples/README.md
@@ -9,4 +9,5 @@ If you are new to Exo, we recommend going through the examples in the following
 
 3. [RVM](./rvm_conv1d/README.md): This example illustrates how to use Exo to define and target a new hardware accelerator entirely in the user code.
 
-4. Quizzes ([quiz1](./quiz1/README.md), [quiz2](./quiz2/README.md), [quiz3](./quiz3/README.md)) contain common scheduling mistakes in Exo and solutions to fix them. The best way to learn a programming language is to debug it.
+4. Quizzes ([quiz1](./quiz1/README.md), [quiz2](./quiz2/README.md), [quiz3](./quiz3/README.md)) contain common scheduling mistakes in Exo and solutions to fix them. The best way to learn a programming language is by debugging code.
+
diff --git a/examples/quiz1/README.md b/examples/quiz1/README.md
index e10bbf4b..b233a9b3 100644
--- a/examples/quiz1/README.md
+++ b/examples/quiz1/README.md
@@ -46,7 +46,7 @@ def double(N: size, inp: f32[N] @ DRAM, out: f32[N] @ DRAM):
 
 ## Solution
 
-Before calling `replace_all(p, avx_instrs)`, you need to set the buffer memory types to AVX2, because `replace_all` is memory-aware and will only replace code chunks with instructions that have matching memory annotations.
+Before calling `replace_all(p, avx_instrs)`, you need to set buffer memory annotations to AVX2, because `replace_all` is memory-aware and will only replace code chunks with instructions that have matching memory annotations.
 
 Add the following code before the call to `replace_all`:
 
@@ -56,4 +56,4 @@ Add the following code before the call to `replace_all`:
         p = set_memory(p, f"{name}_vec", AVX2)
 ```
 
-This will ensure that the memory types are correctly set to AVX2 before replacing the code with vector intrinsics.
+This will ensure that the memory annotations are correctly set to AVX2 before replacing the code with vector intrinsics.
diff --git a/examples/quiz2/README.md b/examples/quiz2/README.md
index 85b6527b..24490030 100644
--- a/examples/quiz2/README.md
+++ b/examples/quiz2/README.md
@@ -1,6 +1,6 @@
 # Quiz2!
 
-Loop fissioning and debugging via printing cursors.
+This quiz is about loop fission bugs and debugging via printing cursors.
 
 ## Incorrect output (compiler error)
 As written, the schedule has a bug which attempts to incorrectly fission a loop.
@@ -64,8 +64,8 @@ def scaled_add_scheduled(N: size, a: f32[N] @ DRAM, b: f32[N] @ DRAM,
 
 ## Solution
 
-Have to
-`print(vector_assign.after())` after line 37
+To understand the bug, let's first try printing right before the error.
+Put `print(vector_assign.after())` after line 37.
 ```
     for io in seq(0, N / 8):
         vec: R[8] @ DRAM
@@ -73,6 +73,73 @@ Have to
             vec_1: R @ DRAM
             vec_1 = 2
             [GAP - After]
+            ...
 ```
+This is showing the code is trying to fission at `[GAP - After]` location, which is unsafe because the `vec_1: R` allocation is in the `ii` loop and before the fissioning point, which means if `vec_1` is used after the fission point that'll be an error.
 
 
+Change
+```python
+    for i in range(num_vectors):
+        vector_reg = p.find(f"vec: _ #{i}")
+        p = expand_dim(p, vector_reg, 8, "ii")
+        p = lift_alloc(p, vector_reg)
+
+        vector_assign = p.find(f"vec = _ #{i}")
+        p = fission(p, vector_assign.after())
+```
+
+to
+```python
+    for i in range(num_vectors):
+        vector_reg = p.find(f"vec: _ #{i}")
+        p = expand_dim(p, vector_reg, 8, "ii")
+        p = lift_alloc(p, vector_reg)
+
+    for i in range(num_vectors):
+        vector_assign = p.find(f"vec = _ #{i}")
+        p = fission(p, vector_assign.after())
+```
+
+So that you lift all the allocations out of the loop before fissioning.
+
+Here is the rewritten version with improved clarity and GitHub Markdown syntax:
+
+## Solution
+
+To understand the bug, let's first try printing right before the error. Add the following line after line 37:
+
+```python
+print(vector_assign.after())
+```
+
+This will output:
+
+```
+    for io in seq(0, N / 8):
+        vec: R[8] @ DRAM
+        for ii in seq(0, 8):
+            vec_1: R @ DRAM
+            vec_1 = 2
+            [GAP - After]
+            ...
+```
+
+The code is attempting to perform fission at the `[GAP - After]` location.
+However, this is unsafe because the `vec_1: R` allocation is within the `ii` loop and before the fission point.
+If `vec_1` is used after the fission point, the code will no longer be a valid Exo.
+
+To fix this issue, modify the code as follows:
+
+```python
+    for i in range(num_vectors):
+        vector_reg = p.find(f"vec: _ #{i}")
+        p = expand_dim(p, vector_reg, 8, "ii")
+        p = lift_alloc(p, vector_reg)
+
+    for i in range(num_vectors):
+        vector_assign = p.find(f"vec = _ #{i}")
+        p = fission(p, vector_assign.after())
+```
+
+By separating the allocation lifting and fission operations into two separate loops, you ensure that all the allocations are lifted out of the loop before performing fission. This resolves the issue of unsafe fission due to the allocation being within the loop.