Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PYTHON-5144 - Add async performance benchmarks #2188

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

NoahStapp
Copy link
Contributor

First draft.

@NoahStapp NoahStapp requested a review from ShaneHarvey March 7, 2025 19:35

# Platform notes
# Platform notes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to ask if this was an intentional change but then I noticed all these platform notes are long out of date. Let's remove them?


async def do_task(self):
command = self.client.perftest.command
await asyncio.gather(*[command("hello", True) for _ in range(NUM_DOCS)])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this and other asyncio.gather calls can we replace with the concurrent() helper?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what way? Currently each test calls concurrent(self.n_tasks, self.do_task), which results in self.n_tasks calls of self.do_tasks, each of which has a gather call for NUM_DOCS coroutines.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see the real problem. We're running NUM_DOCS concurrent tasks here even though this benchmark is supposed to be serial. We need to change this to:

    async def do_task(self):
        command = self.client.perftest.command
        for _ in range(NUM_DOCS):
            await command("hello", True)

Only the concurrent tests should be using asyncio.gather.

Copy link
Contributor Author

@NoahStapp NoahStapp Mar 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, so we need to have separate serial benchmarks and concurrent ones using the same tests? So only the benchmarks with n_tasks > 1 should use gather?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.



# class TestRunCommand800Tasks(TestRunCommand):
# n_tasks = 800
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason these are commented out?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops--these were benchmarks to measure extremely concurrent workloads, they take too long to run regularly imo.


async def concurrent(n_tasks, func):
tasks = [func() for _ in range(n_tasks)]
await asyncio.gather(*tasks)
Copy link
Member

@ShaneHarvey ShaneHarvey Mar 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the threaded version of the tests we run X threads each running a loop of operations so we're comparing different things here. Perhaps we should change the sync version to use a ThreadPoolExecutor and farm out the operations one by one. I'll create a ticket for it.

@NoahStapp
Copy link
Contributor Author


async def do_task(self):
command = self.client.perftest.command
await asyncio.gather(*[command("hello", True) for _ in range(NUM_DOCS)])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see the real problem. We're running NUM_DOCS concurrent tasks here even though this benchmark is supposed to be serial. We need to change this to:

    async def do_task(self):
        command = self.client.perftest.command
        for _ in range(NUM_DOCS):
            await command("hello", True)

Only the concurrent tests should be using asyncio.gather.

@NoahStapp NoahStapp requested a review from ShaneHarvey March 11, 2025 18:30
@NoahStapp
Copy link
Contributor Author

The test failure is also present in master, tracked in jira.mongodb.org/browse/PYTHON-5198.

@NoahStapp
Copy link
Contributor Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants