Separate endpoint for control messages #543
Conversation
Force-pushed 6898e99 to ca7884f
bot:aws:retest
The CI failure looks real; it looks like tracing broke.
The previous PR (#321 (comment)) had some unresolved perf issues. Have we resolved those at this point?
I tested up to 16 nodes and did not see any perf issues with the series. Was there any specific case where we saw issues earlier, @bwbarrett ?
Large message Allreduce, but we've cleaned up a bunch of Libfabric and EFA issues since then, so if we don't see changes, we don't see changes.
Force-pushed dcef637 to 0f0343c
Comments addressed and tracing breakage fixed. The previous round of AWS-CI runs had a functional test timeout; I'll watch this run and test locally to make sure that wasn't CI infra being CI infra.
Also, please squash out the merge commit before merging. 🙂
bot:aws:retest
Force-pushed c11ee00 to 1de991c
Rebased via the GUI to re-kick off PR CI.
bot:aws:retest
Force-pushed 1de991c to b16b8f3
Force-pushed b16b8f3 to cf659bd
CI cleared, this is ready to go.
The control messages are not going to be scheduled over multiple rails, so remove the unnecessary logic around them.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
Signed-off-by: Raghu Raja <raghunch@amazon.com>
For the long message RDMA protocol, we want to make sure that we never starve the sender for data to move, which means prioritizing control messages from the receiver to the sender. This patch moves both the communicator setup and recv control messages to a new endpoint, which is always on device rail 0. Future patches will optimize polling of the control message cq in the send path and setting priority bits on the control cq.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
Signed-off-by: Raghu Raja <raghunch@amazon.com>
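The scheduling change this commit describes can be sketched as follows: data messages keep rotating across device rails, while control traffic is pinned to the dedicated endpoint on rail 0. This is an illustrative sketch, not the plugin's actual C API; all names below are hypothetical.

```python
NUM_RAILS = 4
CONTROL_RAIL = 0  # control endpoint always lives on device rail 0


class RailScheduler:
    """Toy model of the rail assignment described in the commit message."""

    def __init__(self, num_rails=NUM_RAILS):
        self.num_rails = num_rails
        self.next_data_rail = 0

    def assign(self, is_control):
        if is_control:
            # Control messages are never striped; they go to rail 0.
            return CONTROL_RAIL
        # Data messages keep round-robining across all rails.
        rail = self.next_data_rail
        self.next_data_rail = (self.next_data_rail + 1) % self.num_rails
        return rail
```

Note that a control message does not advance the data round-robin counter, so pinning control traffic leaves the data striping pattern untouched.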
If a match isn't found for the current send, poll the control cq to see if the match can be found. While this extends the current send() call, it potentially lowers the time until data transfer starts.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
Signed-off-by: Raghu Raja <raghunch@amazon.com>
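The send-path optimization above can be sketched as: when send() finds no matching recv control message, drain the control CQ once before giving up, in case the match just completed. A minimal sketch, assuming hypothetical names (not the plugin's actual structures):

```python
from collections import deque


class ControlCq:
    """Stand-in for the control completion queue."""

    def __init__(self):
        self.completions = deque()

    def poll(self):
        # Return one completed control message, or None if the CQ is empty.
        return self.completions.popleft() if self.completions else None


class Sender:
    def __init__(self, cq):
        self.cq = cq
        self.matches = {}  # tag -> already-received control message

    def send(self, tag):
        msg = self.matches.pop(tag, None)
        if msg is None:
            # No match yet: poll the control CQ before returning, which
            # extends this send() call but can start data transfer sooner.
            while (c := self.cq.poll()) is not None:
                if c["tag"] == tag:
                    msg = c
                    break
                self.matches[c["tag"]] = c  # stash non-matching completions
        return msg  # None means the send stays queued until a match arrives
```

The trade-off mirrors the commit message: a slightly longer send() call in exchange for a shorter gap before the data transfer begins.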
Force-pushed cf659bd to ce1654a
PR aws#543 moved the control message to its own dedicated endpoint on a single rail. As a result, the control message is no longer sent across all rails in round-robin fashion. This has impacted performance in some cases, so this commit adds an environment variable to optionally enable the separate control message endpoint; the default remains the old behavior of round-robining the control message across rails.

Signed-off-by: Amedeo Sapio <asapio@amazon.com>
PR aws#543 moved the control message to its own dedicated endpoint on a single rail, with its own CQ. To avoid the cost of polling an additional CQ, this commit makes the control endpoint share the CQ with rail 0.

Signed-off-by: Amedeo Sapio <asapio@amazon.com>
PR aws#543 moved the control message to its own dedicated endpoint on a single rail. As a result, the control message is no longer sent across all rails in round-robin fashion. This has impacted performance in some cases, so we want the option to still distribute the control messages across all rails. This commit provides an environment variable to choose between a) using a single dedicated endpoint for control messages, sharing the CQ with the data endpoint on rail 0; or b) using as many additional control endpoints as there are rails, sharing the CQs with the data endpoints, and letting the scheduler round-robin the control messages across all rails.

Signed-off-by: Amedeo Sapio <asapio@amazon.com>
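The two modes described in this commit message can be sketched as a small selection function. Both the environment variable name and the function are illustrative assumptions, not the plugin's real configuration knob:

```python
import os


def control_endpoint_rails(num_rails, env=None):
    """Return the list of rails that carry control endpoints.

    Mode a) single dedicated control endpoint on rail 0 (shares rail 0's CQ).
    Mode b) one control endpoint per rail (each shares its rail's data CQ),
            so the scheduler can round-robin control messages across rails.

    OFI_NCCL_CONTROL_EP_PER_RAIL is a hypothetical variable name used
    here only to illustrate the selection logic.
    """
    if env is None:
        env = os.environ
    if env.get("OFI_NCCL_CONTROL_EP_PER_RAIL", "1") == "0":
        return [0]                      # mode a): dedicated endpoint, rail 0
    return list(range(num_rails))       # mode b): one endpoint per rail
```

In either mode the control endpoints share CQs with the data endpoints, so no extra CQ polling cost is introduced; the choice only affects whether control traffic is pinned or striped.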
Rebased and reworked #321 on master.
The commits are largely the same, with adaptations to recent changes around MR handling (with the addition of the cache) and a fix for a bug that caused an fi_av_insert() on the control QP's address to fail (the conn message did not carry the correct address).
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.