Commit
rdma: Skip eager protocol when writes outstanding
The LL128 protocol produces an interesting pattern: a long string of large-ish messages followed by a "leftover", most noticeable with the power-of-2 message sizes used in benchmarks. If that leftover is eager-sized, you end up with an eager message queued behind a bunch of writes. That gains nothing in hiding the latency of the receive request, but still costs the fixup on the remote side.

This patch implements an overly simplistic version of runting, which skips eager if there is already a write outstanding on the communicator. This is wrong for a couple of reasons. First, we should probably use a BDP (bandwidth-delay product) or similar to determine how much data should be in flight before we skip the eager protocol. Second, we should really decrement the counter at CQE processing time, but that is a locking mess. And finally, it should really be a domain counter instead of a communicator counter, but that runs into the same locking mess.

The NCCL communication patterns are generally simple enough that none of these really seem to matter, so we go with simplicity for now.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
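
For illustration only, the decision described above can be sketched as follows. This is a minimal sketch, not the code in this patch: the struct, the function names, and the idea of retiring a write when its request is completed are all assumptions made for the example, and the eager size limit is whatever the caller configures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-communicator send state; every name here is illustrative. */
typedef struct {
    size_t eager_max_size;        /* largest message eligible for eager */
    unsigned outstanding_writes;  /* writes posted but not yet retired */
} send_comm_t;

typedef enum { PROTO_EAGER, PROTO_WRITE } proto_t;

/* Use eager only for eager-sized messages with no write already in flight
 * on this communicator; otherwise fall back to the write path. */
proto_t choose_protocol(const send_comm_t *comm, size_t size)
{
    bool eager_sized = size <= comm->eager_max_size;
    if (eager_sized && comm->outstanding_writes == 0)
        return PROTO_EAGER;
    return PROTO_WRITE;
}

/* Pick a protocol for one send and count it if it goes out as a write. */
proto_t post_send(send_comm_t *comm, size_t size)
{
    proto_t proto = choose_protocol(comm, size);
    if (proto == PROTO_WRITE)
        comm->outstanding_writes++;
    return proto;
}

/* Assumed to run when the caller completes the send request, not at CQE
 * processing time, so the counter needs no extra locking in this sketch. */
void retire_write(send_comm_t *comm)
{
    assert(comm->outstanding_writes > 0);
    comm->outstanding_writes--;
}
```

With this shape, an eager-sized leftover that follows a run of writes is itself sent as a write until the earlier writes have been retired. A BDP-based variant would compare bytes in flight against a threshold instead of testing the counter against zero.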