Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPIR-V] Atomics can't be used with vk::RawBuffer commands #4983

Closed
natevm opened this issue Jan 31, 2023 · 11 comments
Closed

[SPIR-V] Atomics can't be used with vk::RawBuffer commands #4983

natevm opened this issue Jan 31, 2023 · 11 comments
Labels
spirv Work related to SPIR-V

Comments

@natevm
Copy link
Contributor

natevm commented Jan 31, 2023

Due to some limitations with HLSL involving device addressing, we're making heavy use of HLSL's vk::RawBufferLoad and vk::RawBufferStore commands for our bindless rendering architecture.

However, I'm running into an issue where this effectively makes all atomic operations impossible...

HLSL's current InterlockedCompareExchange function expects a resource value like a RWStructuredBuffer. And there is currently no way to construct a RWStructuredBuffer from a raw vulkan address.

So, I could potentially see two directions here...

I'm not sure if this is possible, but it would be ideal if we could construct a RWStructuredBuffer from a vulkan raw device address (and also potentially a size in bytes?) and then be able to use that official resource handle for atomic interlocked HLSL intrinsics...

Or alternatively, perhaps we could add a vk::RawBufferInterlockedCompareExchange function which would act on the raw vulkan device address itself...

@devshgraphicsprogramming

the whole vk:::RawBuffer design seems to be riddled with bugs, the OpAccessChains are not being constructed properly for loads/stores, see #4986

@natevm
Copy link
Contributor Author

natevm commented Feb 1, 2023

the whole vk:::RawBuffer design seems to be riddled with bugs, the OpAccessChains are not being constructed properly for loads/stores, see #4986

This seems mostly irrelevant. What does this have to do with atomics?

@devshgraphicsprogramming

the whole vk:::RawBuffer design seems to be riddled with bugs, the OpAccessChains are not being constructed properly for loads/stores, see #4986

This seems mostly irrelevant. What does this have to do with atomics?

In GLSL you can declare a bunch of nested structs and you can perform an atomicExchange(buffer_reference.memberBInA.memberCInB.myUint, 45) without a problem.

With the way vk::RawBuffer currently works, you'd have to do all the pointer arithmetic to find out the address from the base pointer to feed as an argument

@natevm
Copy link
Contributor Author

natevm commented Feb 1, 2023

The behavior of GLSL is also mostly irrelevant, as it pertains to how glslc compiles to SPIR-V.

The HLSL programming language is different from GLSL in how atomics work. In HLSL, we have InterlockedCompareExchange, which requires a "resource" type.

Hence why we either 1) construct such a resource handle from a raw device address, or 2) implement a vk specific intrinsic that works on the raw device address.

Re pointer arithmetic, again that is unrelated to this issue. I am assuming that users do their own pointer arithmetic (like we do in a simple templated function wrapping vk::RawBuffer commands)

@natevm
Copy link
Contributor Author

natevm commented Feb 1, 2023

So, if I follow the SPIR-V spec right, we'd potentially want to emit something roughly like the following:

// CHECK: [[addr:%\d+]] = OpLoad %ulong
// CHECK-NEXT: [[newValue:%\d+]] = OpLoad %int %variable
// CHECK-NEXT: [[comparatorValue:%\d+]] = OpLoad %int %comparator
// CHECK-NEXT: [[buf:%\d+]] = OpBitcast %_ptr_PhysicalStorageBuffer_int [[addr]]
// CHECK-NEXT: [[result:%\d+]] = OpAtomicCompareExchange [[buf]] Device AcquireRelease None [[newValue]] [[comparatorValue]]
// CHECK-NEXT: OpStore %originalValue [[result]]

For the Scope, since this is a raw device address, I'd assume "Device" is the right choice here. For shared memory atomics, those already work in SPIR-V though the built-in InterlockedCompareExchange call.

I think for the "Equal" memory semantics this should be AcquireRelease, since we exchange when the original value equals the comparator.

For the "Unequal" memory semantics, per spec this can't be Release or AcquireRelease. I think it should be None then? Since at this point we know the exchange isn't going to happen?

@devshgraphicsprogramming

The HLSL programming language is different from GLSL in how atomics work. In HLSL, we have InterlockedCompareExchange, which requires a "resource" type.

Don't care, you can make whatever API you want as long as it emits the correct DXIL or SPIR-V.

Re pointer arithmetic, again that is unrelated to this issue. I am assuming that users do their own pointer arithmetic (like we do in a simple templated function wrapping vk::RawBuffer commands)

Well thats rather painful compared to how transparently GLSL buffer device address extension works.

So, if I follow the SPIR-V spec right, we'd potentially want to emit something roughly like the following:

If you want to emit same SPIR-V as GLSL you'd declare an OpAccessChain from your pointer type to the uint member you want to do atomics on, and use that there

As for the memory semantics, I don't know but @nanokatze would be the person to ask.

@natevm
Copy link
Contributor Author

natevm commented Feb 1, 2023

Well thats rather painful compared to how transparently GLSL buffer device address extension works.

Ugly but I've also found it to be quite powerful. It's simple to write a template like this if you want:

template <typename T>
T load(uint64_t buffer, uint64_t index) {
  return vk::RawBufferLoad<T>(buffer + index * sizeof(T));
}

IMO though, pointer arithmetic is outside the scope of this issue. I wouldn't accept a PR to this issue if it proposed sweeping changes to how RawBufferLoad/Store handle addressing, just to get an atomic instruction working.

@devshgraphicsprogramming

I have a powerful but userspace fix here: https://godbolt.org/z/TG6Pz1vvz

However the only thing I'm missing is being able to form a correct OpAccessChain given a uint64_t address and builtin type.

@devshgraphicsprogramming

I'm trying and trying to construct an pointer to a %uint and failing miserably
https://godbolt.org/z/WcdE4obWh

If only we could have a RawBuffer<T> which gives us the OpBitcast, then we could shove them into atomic as defined by inline spir-v

@s-perron
Copy link
Collaborator

Is there anything requested here that will not be addressed by the BufferPointer proposal in HLSL spec? I believe you can use the address to get a vk::BufferPointer object and then use the Get() function to return a l-value that can be passed to the atomic operation functions. This essentially gives what was asked for in #5640.

I do not want to try to fix the RawBuffer* functions. There is no good fix because we exposed the loads and store explicitly, and do not have something that gives an l-value. There is not much we can do with it.

@devshgraphicsprogramming

we actually got it working with Proposal 0011 made our own frankenstein __ref and __ptr structs + some overloads of spirv-intrinsics for atomicAdd

https://github.com/Devsh-Graphics-Programming/Nabla/pull/690/files

works well enough for histogramming that we made a counting sort example to test it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
spirv Work related to SPIR-V
Projects
None yet
Development

No branches or pull requests

4 participants