With N > 128, validation passes but the benchmark crashes with a failed device-side assertion in `dedge_ix` (`src/sycl/dual.cc:31`). Example with N = 200:
averyjam@nid005021:~/dualize/LockstepDualisation/build> ./validation/sycl/sycl_validation gpu 200 200
Validating SYCL implementation for gpu device: gfx90a:sramecc+:xnack-.
N = 200
Success!
averyjam@nid005021:~/dualize/LockstepDualisation/build> ./benchmarks/sycl/sycl_benchmark gpu 200
Dualising 1000000 triangulation graphs, each with 200 triangles, repeated 10 times and with 1 warmup runs.
Platform: Intel(R) FPGA Emulation Platform for OpenCL(TM)
NOT USING: Intel(R) FPGA Emulation Device has 4 compute-units.
Platform: Intel(R) OpenCL
NOT USING: AMD EPYC 7A53 64-Core Processor has 4 compute-units.
Platform: AMD HIP BACKEND
USING : gfx90a:sramecc+:xnack- has 110 compute-units.
Using 1 gpu-devices
/users/averyjam/dualize/LockstepDualisation/src/sycl/dual.cc:31: K DeviceDualGraph<6, unsigned short>::dedge_ix(const K, const K) const [MaxDegree = 6, K = unsigned short]: global id: [1377767,0,0], local id: [167,0,0] Assertion `false` failed.
/users/averyjam/dualize/LockstepDualisation/src/sycl/dual.cc:31: K DeviceDualGraph<6, unsigned short>::dedge_ix(const K, const K) const [MaxDegree = 6, K = unsigned short]: global id: [1377768,0,0], local id: [168,0,0] Assertion `false` failed.
/users/averyjam/dualize/LockstepDualisation/src/sycl/dual.cc:31: K DeviceDualGraph<6, unsigned short>::dedge_ix(const K, const K) const [MaxDegree = 6, K = unsigned short]: global id: [1377769,0,0], local id: [169,0,0] Assertion `false` failed.
/users/averyjam/dualize/LockstepDualisation/src/sycl/dual.cc:31: K DeviceDualGraph<6, unsigned short>::dedge_ix(const K, const K) const [MaxDegree = 6, K = unsigned short]: global id: [1377770,0,0], local id: [170,0,0] Assertion `false` failed.
/users/averyjam/dualize/LockstepDualisation/src/sycl/dual.cc:31: K DeviceDualGraph<6, unsigned short>::dedge_ix(const K, const K) const [MaxDegree = 6, K = unsigned short]: global id: [1377771,0,0], local id: [171,0,0] Assertion `false` failed.
/users/averyjam/dualize/LockstepDualisation/src/sycl/dual.cc:31: K DeviceDualGraph<6, unsigned short>::dedge_ix(const K, const K) const [MaxDegree = 6, K = unsigned short]: global id: [1377772,0,0], local id: [172,0,0] Assertion `false` failed.
/users/averyjam/dualize/LockstepDualisation/src/sycl/dual.cc:31: K DeviceDualGraph<6, unsigned short>::dedge_ix(const K, const K) const [MaxDegree = 6, K = unsigned short]: global id: [1377773,0,0], local id: [173,0,0] Assertion `false` failed.
/users/averyjam/dualize/LockstepDualisation/src/sycl/dual.cc:31: K DeviceDualGraph<6, unsigned short>::dedge_ix(const K, const K) const [MaxDegree = 6, K = unsigned short]: global id: [1377774,0,0], local id: [174,0,0] Assertion `false` failed.
/users/averyjam/dualize/LockstepDualisation/src/sycl/dual.cc:31: K DeviceDualGraph<6, unsigned short>::dedge_ix(const K, const K) const [MaxDegree = 6, K = unsigned short]: global id: [1377776,0,0], local id: [176,0,0] Assertion `false` failed.
/users/averyjam/dualize/LockstepDualisation/src/sycl/dual.cc:31: K DeviceDualGraph<6, unsigned short>::dedge_ix(const K, const K) const [MaxDegree = 6, K = unsigned short]: global id: [1377778,0,0], local id: [178,0,0] Assertion `false` failed.
:0:rocdevice.cpp :2652: 1910724722915 us: 1686 : [tid:0x14a7b1aef700] Device::callbackQueue aborting with error : HSA_STATUS_ERROR_EXCEPTION: An HSAIL operation resulted in a hardware exception. code: 0x1016
Aborted
With N <= 128, both validation and the benchmark run successfully. Why? Example with N = 128:
averyjam@nid005021:~/dualize/LockstepDualisation/build> ./validation/sycl/sycl_validation gpu 128 128
Validating SYCL implementation for gpu device: gfx90a:sramecc+:xnack-.
N = 128
Success!
averyjam@nid005021:~/dualize/LockstepDualisation/build> ./benchmarks/sycl/sycl_benchmark gpu 128
Dualising 1000000 triangulation graphs, each with 128 triangles, repeated 10 times and with 1 warmup runs.
Platform: Intel(R) FPGA Emulation Platform for OpenCL(TM)
NOT USING: Intel(R) FPGA Emulation Device has 4 compute-units.
Platform: Intel(R) OpenCL
NOT USING: AMD EPYC 7A53 64-Core Processor has 4 compute-units.
Platform: AMD HIP BACKEND
USING : gfx90a:sramecc+:xnack- has 110 compute-units.
Using 1 gpu-devices
Mean Time per Graph: 26.4305 +/- 7.02391 ns