#18332: Fix BN hang for FPU Kernel #18634

Merged
merged 2 commits into from
Mar 7, 2025

Conversation

VirdhatchaniKN
Contributor

@VirdhatchaniKN VirdhatchaniKN commented Mar 5, 2025

Ticket

#18332

Problem description

The batch norm kernel hangs for larger shapes because the buffer size is set to 2 while the kernel reserves freq tiles. When freq exceeds 2, the kernel tries to reserve more tiles than the buffer can hold and hangs: the unpack and math threads stall at the copy_tile call while the pack thread is stuck at cb_reserve_back.
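
To make the failure mode concrete, here is a minimal host-side sketch of the capacity problem. It is a toy model, not the tt-metal circular buffer API: `ToyCircularBuffer`, `try_reserve`, and `reserve_all_without_pop` are hypothetical names, and the model assumes tiles are reserved one at a time with no pops in between. A `false` result marks the point where a blocking reserve such as cb_reserve_back would wait forever.

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-in for a circular buffer. Hypothetical; NOT the tt-metal CB API.
struct ToyCircularBuffer {
    std::size_t capacity;  // total tile slots (2 in the description above)
    std::size_t occupied;  // slots reserved but not yet popped

    explicit ToyCircularBuffer(std::size_t cap) : capacity(cap), occupied(0) {}

    // Non-blocking analogue of a reserve. A real blocking reserve would
    // spin forever in the situation where this returns false.
    bool try_reserve(std::size_t tiles) {
        if (occupied + tiles > capacity) return false;
        occupied += tiles;
        return true;
    }

    // Frees slots, as a consumer would after reading the tiles.
    void pop(std::size_t tiles) {
        assert(tiles <= occupied);
        occupied -= tiles;
    }
};

// Reserves `freq` tiles with no pops in between (assumed pre-fix pattern).
// Returns true only if every reservation fits in the buffer.
bool reserve_all_without_pop(std::size_t capacity, std::size_t freq) {
    ToyCircularBuffer cb(capacity);
    for (std::size_t i = 0; i < freq; ++i) {
        if (!cb.try_reserve(1)) return false;  // a blocking reserve hangs here
    }
    return true;
}
```

Under these assumptions, `reserve_all_without_pop(2, 2)` succeeds while `reserve_all_without_pop(2, 3)` fails on the third reservation, which matches the observation that the hang appears once freq exceeds 2.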

What's changed

  • Reduced the number of freq loops in the kernel so that each reserved tile is popped before the next tile is reserved
  • Removed the cb_num buffer, which is no longer needed after rearranging the computation
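
The reserve-then-pop ordering described in the list above can be sketched as a small host-side model. This is an illustration under stated assumptions, not the kernel code from this PR: `peak_occupancy_interleaved` is a hypothetical helper, and plain counters stand in for the real circular buffer calls. It tracks how many slots are held at once when every tile is popped before the next one is reserved.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical model: pass `num_tiles` tiles through a buffer with
// `capacity` slots, popping each tile before the next one is reserved.
// Returns the peak number of slots occupied at the same time.
std::size_t peak_occupancy_interleaved(std::size_t capacity, std::size_t num_tiles) {
    std::size_t occupied = 0;
    std::size_t peak = 0;
    for (std::size_t i = 0; i < num_tiles; ++i) {
        // Reserve one slot; it must fit, or a blocking reserve would hang.
        assert(occupied + 1 <= capacity);
        ++occupied;
        peak = std::max(peak, occupied);
        // ... the tile would be produced and pushed here (occupancy unchanged) ...
        // The consumer pops the tile before the next iteration reserves again.
        --occupied;
    }
    return peak;
}
```

In this model the peak occupancy stays at 1 regardless of the tile count, so a buffer of size 2 is never exhausted.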

Checklist

@VirdhatchaniKN VirdhatchaniKN force-pushed the virdhatchani/BN_hang branch 2 times, most recently from 9ff289e to 57c18ab Compare March 5, 2025 15:08
@VirdhatchaniKN VirdhatchaniKN force-pushed the virdhatchani/fpu_bn_hang branch from f08a501 to 4bc2169 Compare March 5, 2025 15:18
@VirdhatchaniKN VirdhatchaniKN force-pushed the virdhatchani/fpu_bn_hang branch from 4bc2169 to adfebf2 Compare March 5, 2025 15:45
  tile_regs_commit();

  tile_regs_wait();
- pack_tile_with_dt(dst0, cb_affine_or_out);
+ pack_tile_with_dt(0, cb_affine_or_out);
Contributor

Minor: Why was this change needed?

Contributor Author

dst0 is also 0, so the value is the same. I passed 0 to binary_dest_reuse_tiles, so I followed the same convention here.

Base automatically changed from virdhatchani/BN_hang to main March 6, 2025 08:22
@VirdhatchaniKN VirdhatchaniKN force-pushed the virdhatchani/fpu_bn_hang branch from adfebf2 to dbb139b Compare March 6, 2025 09:16
@VirdhatchaniKN VirdhatchaniKN force-pushed the virdhatchani/fpu_bn_hang branch from dbb139b to c4d5a9c Compare March 6, 2025 16:46
@VirdhatchaniKN VirdhatchaniKN force-pushed the virdhatchani/fpu_bn_hang branch from c4d5a9c to dc3aed0 Compare March 7, 2025 01:53
@VirdhatchaniKN
Contributor Author

Screenshot 2025-03-07 at 7 39 26 AM

@VirdhatchaniKN VirdhatchaniKN merged commit 99f1ff6 into main Mar 7, 2025
41 of 157 checks passed
@VirdhatchaniKN VirdhatchaniKN deleted the virdhatchani/fpu_bn_hang branch March 7, 2025 02:09
5 participants