PYTHON-5144 - Add async performance benchmarks #2188
base: master
Conversation
.evergreen/config.yml (Outdated)

    # Platform notes
I was going to ask if this was an intentional change, but then I noticed all these platform notes are long out of date. Let's remove them?
test/performance/async_perf_test.py (Outdated)

    async def do_task(self):
        command = self.client.perftest.command
        await asyncio.gather(*[command("hello", True) for _ in range(NUM_DOCS)])
For this and the other asyncio.gather calls, can we replace them with the concurrent() helper?
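For illustration, a self-contained sketch of what routing this call through the concurrent() helper (defined later in this PR) might look like. The lambda wrapper and the stand-in command coroutine are assumptions, not code from the PR:

```python
import asyncio

NUM_DOCS = 10  # small stand-in for the benchmark's real document count

calls = []


async def concurrent(n_tasks, func):
    # The helper from this PR: fan out n_tasks calls of func and await them all.
    await asyncio.gather(*[func() for _ in range(n_tasks)])


async def command(name, flag):
    # Stand-in for self.client.perftest.command; records each invocation.
    calls.append(name)
    return {name: flag}


async def do_task():
    # The original asyncio.gather call, rewritten to go through the helper.
    await concurrent(NUM_DOCS, lambda: command("hello", True))


asyncio.run(do_task())
```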
In what way? Currently each test calls concurrent(self.n_tasks, self.do_task), which results in self.n_tasks calls of self.do_task, each of which has a gather call for NUM_DOCS coroutines.
Oh, I see the real problem. We're running NUM_DOCS concurrent tasks here even though this benchmark is supposed to be serial. We need to change this to:

    async def do_task(self):
        command = self.client.perftest.command
        for _ in range(NUM_DOCS):
            await command("hello", True)

Only the concurrent tests should be using asyncio.gather.
Ah, so we need to have separate serial benchmarks and concurrent ones using the same tests? So only the benchmarks with n_tasks > 1 should use gather?
Yes.
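A minimal sketch of that split, where do_task is always serial and concurrency comes only from n_tasks. The class and helper names follow the snippets in this PR, but the constructor, run() harness, and the small NUM_DOCS value are assumptions for illustration:

```python
import asyncio

NUM_DOCS = 10  # small stand-in; the real benchmark uses a larger count


async def concurrent(n_tasks, func):
    # Fan out n_tasks copies of func and wait for all of them.
    await asyncio.gather(*[func() for _ in range(n_tasks)])


class TestRunCommand:
    n_tasks = 1  # serial baseline: a single task running a serial loop

    def __init__(self, client):
        self.client = client

    async def do_task(self):
        command = self.client.perftest.command
        # Serial inside each task: one command at a time, NUM_DOCS times.
        for _ in range(NUM_DOCS):
            await command("hello", True)

    async def run(self):
        # Concurrency comes only from n_tasks, never from do_task itself.
        await concurrent(self.n_tasks, self.do_task)


class TestRunCommand80Tasks(TestRunCommand):
    n_tasks = 80  # concurrent variant: 80 serial loops in flight at once
```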
test/performance/async_perf_test.py (Outdated)

    # class TestRunCommand800Tasks(TestRunCommand):
    #     n_tasks = 800
Any reason these are commented out?
Whoops, these were benchmarks to measure extremely concurrent workloads; they take too long to run regularly, imo.
    async def concurrent(n_tasks, func):
        tasks = [func() for _ in range(n_tasks)]
        await asyncio.gather(*tasks)
In the threaded version of the tests, we run X threads, each running a loop of operations, so we're comparing different things here. Perhaps we should change the sync version to use a ThreadPoolExecutor and farm out the operations one by one. I'll create a ticket for it.
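A sketch of what that sync change might look like: a pool of worker threads with each operation submitted individually, mirroring how the async version gathers one coroutine per operation. The do_one_op function and the small NUM_OPS value are hypothetical stand-ins:

```python
from concurrent.futures import ThreadPoolExecutor

NUM_OPS = 100  # small stand-in for the benchmark's operation count


def do_one_op(i):
    # Placeholder for a single driver operation, e.g. one command call.
    return i * 2


def run_concurrent(n_workers):
    # Farm out the operations one by one to the pool, like asyncio.gather
    # over per-operation coroutines, instead of n_workers threads each
    # running their own serial loop over all operations.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(do_one_op, range(NUM_OPS)))
```

pool.map preserves input order, so the results come back in submission order regardless of which worker ran each operation.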
The test failure is also present in master, tracked in jira.mongodb.org/browse/PYTHON-5198.
A run of the latest changes: https://spruce.mongodb.com/version/67d0840be58c830007b75115/tasks?sorts=STATUS%3AASC%3BBASE_STATUS%3ADESC.
First draft.