-
Notifications
You must be signed in to change notification settings - Fork 116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expose more creation functions for MeshDevice, change SubMesh ownership #18694
base: main
Are you sure you want to change the base?
Conversation
size_t num_command_queues = 1, | ||
const DispatchCoreConfig& dispatch_core_config = DispatchCoreConfig{}, | ||
tt::stl::Span<const std::uint32_t> l1_bank_remap = {}); | ||
static std::map<int, std::shared_ptr<MeshDevice>> create_single_devices( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if this needs to be map?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made it a map to mimic CreateDevices, it simplifies the migration process
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
neat idea to invert the ownership
Ticket
Problem description
I'm integrating MeshDevice into TT-NN, trying to make all python code to always use MeshDevice 1x1 instead of single device APIs. To make MeshDevice APIs compatible, we need a way to create a single MeshDevice with specifying device id.
Moreover, we need an ability to create MeshDevice 1x1 for each of multiple device ids to fully mimic CreateDevices API exposed to python. Creating MeshDevices 1 by 1 for each device id is not equivalent to creating/closing them at the same time (causes fails in inividual calls to metal's CreateDevice CloseDevice). So I added a helper function to create a common mesh device and then create submeshes out of it. This presents a new ownership issue, because the parent mesh device must be kept alive longer than the children devices, however the current python code wasn't designed to handle that. So this PR inverts ownership relationship - each submesh keeps a shared_ptr to the parent mesh, ensuring that its being kept alive until the submesh is destroyed.
What's changed
Added new creation functions for MeshDevice
Changed the ownership model, parent MeshDevice doesn't keep submeshes meshes alive, but the submeshes keep the parent one alive. I think it makes some sense and simplifies
MeshDevice::close()
, but I'm open to discussion on possibly better ways to achieve the goal.Removed the usages of
get_submeshes
method.Checklist