Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose more creation functions for MeshDevice, change SubMesh ownership #18694

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

sminakov-tt
Copy link
Contributor

Ticket

Problem description

I'm integrating MeshDevice into TT-NN, trying to make all python code to always use MeshDevice 1x1 instead of single device APIs. To make MeshDevice APIs compatible, we need a way to create a single MeshDevice with specifying device id.
Moreover, we need an ability to create MeshDevice 1x1 for each of multiple device ids to fully mimic CreateDevices API exposed to python. Creating MeshDevices 1 by 1 for each device id is not equivalent to creating/closing them at the same time (causes fails in inividual calls to metal's CreateDevice CloseDevice). So I added a helper function to create a common mesh device and then create submeshes out of it. This presents a new ownership issue, because the parent mesh device must be kept alive longer than the children devices, however the current python code wasn't designed to handle that. So this PR inverts ownership relationship - each submesh keeps a shared_ptr to the parent mesh, ensuring that its being kept alive until the submesh is destroyed.

What's changed

Added new creation functions for MeshDevice
Changed the ownership model, parent MeshDevice doesn't keep submeshes meshes alive, but the submeshes keep the parent one alive. I think it makes some sense and simplifies MeshDevice::close(), but I'm open to discussion on possibly better ways to achieve the goal.
Removed the usages of get_submeshes method.

Checklist

size_t num_command_queues = 1,
const DispatchCoreConfig& dispatch_core_config = DispatchCoreConfig{},
tt::stl::Span<const std::uint32_t> l1_bank_remap = {});
static std::map<int, std::shared_ptr<MeshDevice>> create_single_devices(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if this needs to be map?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made it a map to mimic CreateDevices, it simplifies the migration process

Copy link
Contributor

@cfjchu cfjchu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

neat idea to invert the ownership

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants