#13385: ttnn.round direct kernel implementation #17673
base: main
tt_metal/hw/ckernels/blackhole/metal/llk_api/llk_sfpu/ckernel_sfpu_round.h
const float TABLE[] = {
    1e-45F, 1e-44F, 1e-43F, 1e-42F, 1e-41F, 1e-40F, 1e-39F, 1e-38F, 1e-37F, 1e-36F, 1e-35F, 1e-34F,
    1e-33F, 1e-32F, 1e-31F, 1e-30F, 1e-29F, 1e-28F, 1e-27F, 1e-26F, 1e-25F, 1e-24F, 1e-23F, 1e-22F,
    1e-21F, 1e-20F, 1e-19F, 1e-18F, 1e-17F, 1e-16F, 1e-15F, 1e-14F, 1e-13F, 1e-12F, 1e-11F, 1e-10F,
    1e-9F, 1e-8F, 1e-7F, 1e-6F, 1e-5F, 1e-4F, 1e-3F, 1e-2F, 1e-1F, 1e0F, 1e1F, 1e2F,
    1e3F, 1e4F, 1e5F, 1e6F, 1e7F, 1e8F, 1e9F, 1e10F, 1e11F, 1e12F, 1e13F, 1e14F,
    1e15F, 1e16F, 1e17F, 1e18F, 1e19F, 1e20F, 1e21F, 1e22F, 1e23F, 1e24F, 1e25F, 1e26F,
    1e27F, 1e28F, 1e29F, 1e30F, 1e31F, 1e32F, 1e33F, 1e34F, 1e35F, 1e36F, 1e37F, 1e38F,
};
Would suggest moving this table out of calculate_round<APPROXIMATE, ITERATIONS> and declaring it as inline constexpr instead of const to ensure that this table is statically initialized instead of stack initialized, and also so that it isn't materialized separately for each instantiation of calculate_round.
@patrickroberts Moved the table outside the loop as you suggested.
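The suggestion in this thread can be sketched roughly as follows. This is a minimal illustration and not the code from the PR: the namespace `sketch`, the shortened table `POW10_TABLE`, the bounds `MIN_EXP`/`MAX_EXP`, and the helpers `pow10_lookup` and `table_address` are all hypothetical names introduced for this example. It assumes C++17, where `inline constexpr` at namespace scope yields a single constant-initialized object shared across translation units and template instantiations.

```cpp
#include <cassert>

namespace sketch {

// Hypothetical, shortened stand-in for the powers-of-ten table in the diff.
// Declared at namespace scope as `inline constexpr` (C++17) so there is one
// statically initialized object, rather than a local array that may be
// rebuilt on the stack each time the enclosing function runs.
inline constexpr float POW10_TABLE[] = {1e-3F, 1e-2F, 1e-1F, 1e0F, 1e1F, 1e2F, 1e3F};
inline constexpr int MIN_EXP = -3;  // exponent of the first table entry
inline constexpr int MAX_EXP = 3;   // exponent of the last table entry

// Hypothetical helper: returns 10^exp for exp in [MIN_EXP, MAX_EXP].
inline float pow10_lookup(int exp) {
    assert(exp >= MIN_EXP && exp <= MAX_EXP);
    return POW10_TABLE[exp - MIN_EXP];
}

// Stand-in for a templated kernel function. Every instantiation refers to
// the same table object because the table lives outside the template.
template <bool APPROXIMATE, int ITERATIONS>
inline const float* table_address() {
    return POW10_TABLE;
}

}  // namespace sketch
```

In this sketch, `sketch::table_address<true, 8>()` and `sketch::table_address<false, 4>()` return the same pointer, which is the "not materialized separately for each instantiation" property the reviewer asks for.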
ALWI void round_tile_init() { MATH((llk_math_eltwise_unary_sfpu_round_init<APPROX>())); }

/**
 * Performs round operation on each row of a tile.
We can make the formatting of the comment nice by using clang-format on/off now; see other files in this directory for reference.
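A sketch of what this likely refers to, assuming "clang on/off" means the `// clang-format off` and `// clang-format on` markers, which stop clang-format from reflowing everything between them. The function `example_op` and the table contents are hypothetical placeholders, not the actual doc comment from the PR.

```cpp
#include <cstdint>

// clang-format off
/**
 * Hypothetical doc comment. The column alignment below is left untouched
 * because formatting is disabled between the two markers.
 *
 * | Argument | Description                     | Type     |
 * |----------|---------------------------------|----------|
 * | idst     | Index of the tile to operate on | uint32_t |
 */
// clang-format on
inline uint32_t example_op(uint32_t idst) { return idst; }
```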
Ticket
#13385
Problem description
ttnn.round is currently implemented as a combination of ttnn.floor, ttnn.add, etc. This op can instead be implemented directly as a kernel op. #13851 laid the groundwork for this issue, and I have added a few modifications on top of it.
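For intuition, a composite round of the kind described above can be modelled on scalars as scale, add 0.5, floor, unscale. This is only an illustrative sketch under assumptions: the function `round_via_floor` is hypothetical, it rounds halves upward, and it is not claimed to match the exact sequence of ops or the tie-breaking rule that ttnn.round uses.

```cpp
#include <cmath>

// Hypothetical scalar model of a composite round built from multiply, add,
// floor, and divide. Ties round toward positive infinity in this sketch,
// which may differ from the behaviour of the real op.
inline float round_via_floor(float x, int decimals = 0) {
    const float scale = std::pow(10.0f, static_cast<float>(decimals));
    return std::floor(x * scale + 0.5f) / scale;
}
```

Each step in this model corresponds to a separate elementwise op in a composite implementation, which is the overhead a single direct kernel avoids.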
What's changed
Checklist