(Single-card) Device perf regressions #5167

Manually triggered March 3, 2025 10:14
Status Failure
Total duration 4h 5m 19s
Artifacts 3

perf-device-models.yaml

on: workflow_dispatch
build-artifact-profiler / build-docker-image / check-docker-images: 9s
build-artifact-profiler / build-docker-image / 🐳️ Build image: 0s
build-artifact-profiler / 🛠️ Build Release ubuntu 20.04: 7m 0s
Matrix: device-perf / device-perf

Annotations

4 errors, 6 warnings, and 9 notices
device-perf / N300 WH B0 device perf
Process completed with exit code 2.
device-perf / N300 WH B0 device perf: models/demos/wormhole/stable_diffusion/tests/test_unet_2d_condition_model.py#L151
test_unet_2d_condition_model_512x512[2-4-64-64-device_params=l1_small_size_24576] RuntimeError: TT_FATAL @ /work/ttnn/cpp/ttnn/operations/eltwise/binary/device/binary_device_operation.cpp:132: input_tensor_a.shard_spec().value() == input_tensor_b->shard_spec().value()
info:
Error backtrace:
--- /usr/local/lib/python3.8/dist-packages/ttnn/_ttnn.cpython-38-x86_64-linux-gnu.so(+0xea1649) [0x7f5fccf6c649]
--- ttnn::operations::binary::BinaryDeviceOperation::validate_on_program_cache_miss(ttnn::operations::binary::BinaryDeviceOperation::operation_attributes_t const&, ttnn::operations::binary::BinaryDeviceOperation::tensor_args_t const&)
--- /usr/local/lib/python3.8/dist-packages/ttnn/_ttnn.cpython-38-x86_64-linux-gnu.so(_ZN4ttnn16device_operation6detail23launch_on_worker_threadINS_10operations6binary21BinaryDeviceOperationEN2tt3stl10StrongTypeIhNS_10QueueIdTagEEElNS5_22operation_attributes_tENS5_13tensor_args_tENS6_8tt_metal6TensorEPNSD_2v07IDeviceEEEvT0_T1_RKT2_RKT3_RT4_RT5_+0x33f) [0x7f5fcd3581bf]
--- ttnn::operations::binary::BinaryDeviceOperation::tensor_return_value_t ttnn::device_operation::detail::launch_on_single_device<ttnn::operations::binary::BinaryDeviceOperation>(tt::stl::StrongType<unsigned char, ttnn::QueueIdTag>, ttnn::operations::binary::BinaryDeviceOperation::operation_attributes_t const&, ttnn::operations::binary::BinaryDeviceOperation::tensor_args_t const&)
--- /usr/local/lib/python3.8/dist-packages/ttnn/_ttnn.cpython-38-x86_64-linux-gnu.so(+0x128cc2f) [0x7f5fcd357c2f]
--- ttnn::operations::binary::BinaryDeviceOperation::tensor_return_value_t ttnn::device_operation::detail::invoke<ttnn::operations::binary::BinaryDeviceOperation>(tt::stl::StrongType<unsigned char, ttnn::QueueIdTag>, ttnn::operations::binary::BinaryDeviceOperation::operation_attributes_t const&, ttnn::operations::binary::BinaryDeviceOperation::tensor_args_t const&)
--- /usr/local/lib/python3.8/dist-packages/ttnn/_ttnn.cpython-38-x86_64-linux-gnu.so(+0x128c42a) [0x7f5fcd35742a]
--- /usr/local/lib/python3.8/dist-packages/ttnn/_ttnn.cpython-38-x86_64-linux-gnu.so(+0x128c04b) [0x7f5fcd35704b]
--- /usr/local/lib/python3.8/dist-packages/ttnn/_ttnn.cpython-38-x86_64-linux-gnu.so(_ZN4ttnn10operations6binary15BinaryOperationILNS1_12BinaryOpTypeE0EE6invokeEN2tt3stl10StrongTypeIhNS_10QueueIdTagEEERKNS5_8tt_metal6TensorESD_RKNSt3__18optionalIKNSA_8DataTypeEEERKNSF_INSA_12MemoryConfigEEERKNSF_ISB_EERKNSF_INSE_6vectorINS0_5unary14UnaryWithParamENSE_9allocatorISU_EEEEEERKNSF_ISU_EE+0x7b) [0x7f5fcd32de1b]
--- /usr/local/lib/python3.8/dist-packages/ttnn/_ttnn.cpython-38-x86_64-linux-gnu.so(+0x2ca8b68) [0x7f5fced73b68]
--- /usr/local/lib/python3.8/dist-packages/ttnn/_ttnn.cpython-38-x86_64-linux-gnu.so(+0x2ca8214) [0x7f5fced73214]
--- void tt::tt_metal::operation::launch_op_func<std::__1::vector<tt::tt_metal::Tensor, std::__1::allocator<tt::tt_metal::Tensor>>>(std::__1::function<std::__1::vector<tt::tt_metal::Tensor, std::__1::allocator<tt::tt_metal::Tensor>> (std::__1::vector<tt::tt_metal::Tensor, std::__1::allocator<tt::tt_metal::Tensor>> const&, std::__1::vector<std::__1::optional<tt::tt_metal::Tensor const>, std::__1::allocator<std::__1::optional<tt::tt_metal::Tensor const>>> const&, std::__1::vector<std::__1::optional<tt::tt_metal::Tensor>, std::__1::allocator<std::__1::optional<tt::tt_metal::Tensor>>> const&)> const&, std::__1::vector<tt::tt_metal::Tensor, std::__1::allocator<tt::tt_metal::Tensor>>, std::__1::vector<tt::tt_metal::Tensor, std::__1::allocator<tt::tt_metal::Tensor>>&, std::__1::vector<std::__1::optional<tt::tt_metal::Tensor const>, std::__1::allocator<std::__1::optional<tt::tt_metal::Tensor const>>>, std::__1::vector<std::__1::optional<tt::tt_metal::Tensor>, std::__1::allocator<std::__1::optional<tt::tt_metal::Tensor>>>)
--- /usr/local/lib/python3.8/dist-packages/ttnn/_ttnn.cpython-38-x86_64-linux-gnu.so(+0x2ca7b0b) [0x7f5fced72b0b]
--- /usr/local/lib/python3.8/dist-packages/ttnn/_ttnn.cpython-38-x86_64-linux-gnu.so(+0x2ca7471) [0x7f5fced72471]
--- /usr/local/lib/python3.8/dist-packages/ttnn/_ttnn.cpython-38-x86_64-linux-gn
device-perf / N300 WH B0 device perf: models/demos/wormhole/stable_diffusion/tests/test_perf.py#L229
test_stable_diffusion_device_perf[9.5] IndexError: index 0 is out of bounds for axis 0 with size 0
device-perf / N300 WH B0 device perf
Process completed with exit code 1.
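The `IndexError: index 0 is out of bounds for axis 0 with size 0` reported for test_stable_diffusion_device_perf is the message NumPy raises when element 0 of an empty array is requested. One plausible reading, which this log does not confirm, is that the perf test found no rows in the profiler output because the profiled run had already failed. The sketch below shows a guard that would surface that condition with a clearer message. The function `first_perf_value` and the column name `samples_per_s` are hypothetical and are not taken from test_perf.py.

```python
from typing import Mapping, Sequence


def first_perf_value(rows: Sequence[Mapping[str, float]], column: str) -> float:
    """Return the first value of `column`, failing loudly when no rows exist.

    Hypothetical helper: it stands in for whatever lookup the perf test
    performs on the profiler results.
    """
    if len(rows) == 0:
        # An empty result set usually means the profiled run produced no
        # output, so report that instead of a bare IndexError.
        raise RuntimeError(
            f"no profiler rows found for column {column!r}; "
            "check whether the profiled run failed before producing output"
        )
    return rows[0][column]
```

With a guard of this kind, the second failure would point back at the first one rather than at an array index.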
hugepages-service-not-found-startup
Hugepages service not found. Using old rc.local method
device-perf / N300 WH B0 device perf: models/demos/wormhole/stable_diffusion/tests/test_unet_2d_condition_model.py#L63
record_property is incompatible with junit_family 'xunit2' (use 'legacy' or 'xunit1')
device-perf / N300 WH B0 device perf: usr/local/lib/python3.8/dist-packages/huggingface_hub/file_download.py#L1142
`resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
device-perf / N300 WH B0 device perf: work/models/demos/wormhole/stable_diffusion/custom_preprocessing.py#L32
The use of `x.T` on tensors of dimension other than 2 to reverse their shape is deprecated and it will throw an error in a future release. Consider `x.mT` to transpose batches of matrices or `x.permute(*torch.arange(x.ndim - 1, -1, -1))` to reverse the dimensions of a tensor. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3637.)
device-perf / N300 WH B0 device perf: models/demos/wormhole/stable_diffusion/tests/test_perf.py#L206
record_property is incompatible with junit_family 'xunit2' (use 'legacy' or 'xunit1')
device-perf / N300 WH B0 device perf: work/tt_metal/tools/profiler/process_model_log.py#L26
Columns (39,40,41,46,47,48,53,54,55,67,68,69,74,75,76) have mixed types. Specify dtype option on import or set low_memory=False.
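The mixed-types warning from process_model_log.py lists column indices but not the offending values. Before choosing a `dtype` or setting `low_memory=False` as the warning suggests, it can help to see which of those columns really mix numeric and non-numeric cells. The sketch below is a standard-library diagnostic under that assumption. It does not reproduce pandas' type inference, and `cell_kind` and `mixed_columns` are hypothetical names.

```python
import csv
import io
from typing import Dict, Iterable, Set


def cell_kind(value: str) -> str:
    """Classify one CSV cell as 'number' or 'text'."""
    try:
        float(value)
        return "number"
    except ValueError:
        return "text"


def mixed_columns(csv_text: str, indices: Iterable[int]) -> Dict[int, Set[str]]:
    """Return the columns (by zero-based index) holding more than one kind of cell."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    kinds: Dict[int, Set[str]] = {i: set() for i in indices}
    for row in reader:
        for i in kinds:
            # Ignore short rows and empty cells; only non-empty values count.
            if i < len(row) and row[i] != "":
                kinds[i].add(cell_kind(row[i]))
    return {i: k for i, k in kinds.items() if len(k) > 1}
```

Passing the indices from the warning, for example `[39, 40, 41]`, would narrow down which columns need an explicit `dtype`.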
disk-usage-after-startup
Disk usage is 54 %
printing-smi-info-startup
Touching and printing out SMI info
reset-successful-startup
tt-smi reset was successful
hugepages-setup-success-startup
Hugepages is now setup.
disk-usage-after-startup
Disk usage is 66 %
printing-smi-info-startup
Touching and printing out SMI info
reset-successful-startup
tt-smi reset was successful
hugepages-service-found-startup
Hugepages service found. Command returned with exit code 3. Restarting it so we can ensure hugepages are available
hugepages-setup-success-startup
Hugepages is now setup.

Artifacts

Produced during runtime
Name Size
TTMetal_build_any_profiler 192 MB
eager-dist-ubuntu-20.04-any-profiler 340 MB
packages-ubuntu-20.04-amd64-Release-x86_64-linux-clang-17-libcpp_profiler 89.5 MB