Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for disabling eager (and fix 0 byte eager max size) #794

Merged
merged 3 commits into from
Feb 24, 2025

Conversation

bwbarrett
Copy link
Contributor

Two changes that ended up being the same code: fix 0 byte eager max size and add support for disabling eager messages completely.

0 byte eager support broke with #614, with a number of problems in the code. First, freelist code doesn't work with 0 byte entry size. #789 got us closer, fixing the assert, but some versions of the EFA provider (like that in 1.22 with zero copy) will return an EINVAL when trying to post a receive with len = 0. We split the eager size configuration into the eager_rx_buff_size and eager_send_size explicitly so that the buffer size can be larger than the max send size, and make sure that the eager_rx_buff_size is always larger than 0.

Previously, the EAGER_MAX_SIZE environment variable was an unsigned integer, meaning that the smallest legal value was 0. Since the size is the max size that will be eager sent, that means that there's no way to completely avoid eager messages. This patch series also changes the environment variable (and related code) to be a signed integer, so that we can specify -1 as a legal eager size and completely disable eager messages. When the eager size is less than 0, we don't post rx buffers on the eager endpoint or create the eager free list.

Finally, this patch series sets the default EAGER_MAX_SIZE to -1 to disable eager messages for performance analysis.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Parts of the freelist assume entry_size is larger than 0 bytes, but
the interface does not check for size and does not specify restrictions
either way.  Improve the situation by overriding a 0 byte entry size
to be 8 bytes (plus any additional size increases caused by the
alignment and redzone requirements).  This seems better than a bunch
of 0 byte checks through the rest of the code.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
@bwbarrett bwbarrett requested a review from a team as a code owner February 21, 2025 23:25
Previously, we could only set the eager limit to 0, and even then there
was logic to always allow 0 byte messages to be sent eager.  This patch
makes -1 a valid value for the eager size, which will disable eager
messages entirely (including posting buffers).

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
We don't think we need eager with the performance improvements on P5 and
P5en, so set the default to -1 (disabled).

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
@bwbarrett bwbarrett force-pushed the feature/really-no-more-eager branch from 1673899 to b072322 Compare February 22, 2025 02:25
@xwangxin xwangxin requested a review from yexiang-aws February 23, 2025 21:08
@xwangxin xwangxin merged commit a4832e0 into aws:master Feb 24, 2025
23 checks passed
@bwbarrett bwbarrett deleted the feature/really-no-more-eager branch February 24, 2025 20:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants