#13385: ttnn.round direct kernel implementation #17673

Open
wants to merge 3 commits into
base: main
from

mcw-anasuya
Contributor

@mcw-anasuya mcw-anasuya commented Feb 6, 2025

Ticket

#13385

Problem description

ttnn.round is currently implemented as a composite of ttnn.floor, ttnn.add, and other ops. It can instead be implemented directly as a kernel op. #13851 laid the groundwork for this, and I have added a few modifications on top of it.
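For readers unfamiliar with the composite approach, here is a minimal scalar sketch of rounding built from a floor and an add. It is only an illustration: `round_composite` and its `decimals` parameter are hypothetical names, and the actual composite in ttnn may order or combine the ops differently.

```cpp
#include <cmath>

// Hypothetical scalar stand-in for a floor/add style composite:
// scale by 10^decimals, add 0.5, floor, then undo the scaling.
// Halves are rounded toward +infinity, which is an assumption of
// this sketch and not a statement about ttnn.round itself.
inline float round_composite(float x, int decimals = 0) {
    const float scale = std::pow(10.0f, static_cast<float>(decimals));
    return std::floor(x * scale + 0.5f) / scale;
}
```

Each step here would be a separate op in a composite implementation, which is the overhead a single kernel op avoids.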

What's changed

  • Write a kernel function for ttnn.round that directly calls casting instructions.
  • Replace the current implementation with the kernel function.

Checklist

@mcw-anasuya mcw-anasuya force-pushed the anasuya/round_op branch 9 times, most recently from 39efb99 to 8f140f3 Compare February 13, 2025 16:27
@mcw-anasuya mcw-anasuya force-pushed the anasuya/round_op branch 3 times, most recently from 3888957 to 18f18b4 Compare March 5, 2025 10:50
Comment on lines 44 to 52
const float TABLE[] = {
1e-45F, 1e-44F, 1e-43F, 1e-42F, 1e-41F, 1e-40F, 1e-39F, 1e-38F, 1e-37F, 1e-36F, 1e-35F, 1e-34F,
1e-33F, 1e-32F, 1e-31F, 1e-30F, 1e-29F, 1e-28F, 1e-27F, 1e-26F, 1e-25F, 1e-24F, 1e-23F, 1e-22F,
1e-21F, 1e-20F, 1e-19F, 1e-18F, 1e-17F, 1e-16F, 1e-15F, 1e-14F, 1e-13F, 1e-12F, 1e-11F, 1e-10F,
1e-9F, 1e-8F, 1e-7F, 1e-6F, 1e-5F, 1e-4F, 1e-3F, 1e-2F, 1e-1F, 1e0F, 1e1F, 1e2F,
1e3F, 1e4F, 1e5F, 1e6F, 1e7F, 1e8F, 1e9F, 1e10F, 1e11F, 1e12F, 1e13F, 1e14F,
1e15F, 1e16F, 1e17F, 1e18F, 1e19F, 1e20F, 1e21F, 1e22F, 1e23F, 1e24F, 1e25F, 1e26F,
1e27F, 1e28F, 1e29F, 1e30F, 1e31F, 1e32F, 1e33F, 1e34F, 1e35F, 1e36F, 1e37F, 1e38F,
};
Contributor

I would suggest moving this table out of calculate_round<APPROXIMATE, ITERATIONS> and declaring it inline constexpr instead of const. That ensures the table is statically initialized rather than initialized on the stack, and that it isn't materialized separately for each instantiation of calculate_round.
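A minimal sketch of what that could look like, assuming C++17 (needed for inline variables). The table is shortened, and `POW10_TABLE`, `POW10_MIN_EXP`, `scale_for_decimals`, and `table_address` are hypothetical names used only for illustration, not identifiers from this PR:

```cpp
#include <cstddef>

// Hypothetical, shortened table: powers of ten from 1e-3 to 1e3.
// At namespace scope, inline constexpr gives a single
// constant-initialized object shared across translation units.
inline constexpr float POW10_TABLE[] = {1e-3F, 1e-2F, 1e-1F, 1e0F, 1e1F, 1e2F, 1e3F};
inline constexpr int POW10_MIN_EXP = -3;
inline constexpr std::size_t POW10_SIZE = sizeof(POW10_TABLE) / sizeof(POW10_TABLE[0]);

// Stand-in for a templated kernel helper: every instantiation reads
// the same table instead of declaring its own local copy.
// The caller must keep decimals within [-3, 3] for this short table.
template <bool APPROXIMATE, int ITERATIONS>
constexpr float scale_for_decimals(int decimals) {
    return POW10_TABLE[decimals - POW10_MIN_EXP];
}

// Returns the table's address so the sharing can be checked.
template <bool APPROXIMATE, int ITERATIONS>
const float* table_address() {
    return POW10_TABLE;
}
```

Comparing `table_address<true, 8>()` with `table_address<false, 4>()` yields the same pointer, which shows that different instantiations share one table.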

Contributor Author

@patrickroberts Moved the table outside the loop as you suggested.

ALWI void round_tile_init() { MATH((llk_math_eltwise_unary_sfpu_round_init<APPROX>())); }

/**
* Performs round operation on each row of a tile.
Contributor

We can now keep the comment formatting nice by wrapping it in clang-format off/on markers; see other files in this directory for reference.
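As a rough illustration of the markers, assuming the doc comment contains a hand-aligned argument table: the function `is_valid_tile_index` and the table contents below are hypothetical and are not taken from this PR.

```cpp
// clang-format off
/**
 * Hypothetical helper, used only to show the markers.
 *
 * | Argument  | Description                      | Type     | Required |
 * |-----------|----------------------------------|----------|----------|
 * | idst      | Index of the tile to check       | unsigned | True     |
 * | num_tiles | Number of tiles that are usable  | unsigned | True     |
 */
// clang-format on
inline bool is_valid_tile_index(unsigned idst, unsigned num_tiles) {
    // Code after the "on" marker is formatted by clang-format as usual.
    return idst < num_tiles;
}
```

clang-format leaves everything between the two markers untouched, so the column alignment in the table survives reformatting.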

5 participants