josopait
Journeyman III

Getting UAVs to work on a 7970

Hi,

I'm having difficulties using UAVs on a 7970. I am writing my kernels directly in IL. Compiling simple test kernels works fine, and I have looked at the IL that the OpenCL sample programs generate. As far as I can tell everything should work: the ISA code looks fine, the linker succeeds, and the parameters are matched correctly. But when calling calCtxRunProgram I get the extremely unhelpful error message 'Operational Error'. Other programs that don't use UAVs work fine.

Are UAVs only valid in compute shaders or can they also be used in pixel shaders? How should the resource be allocated? Is it correct to allocate a 2D resource of single floats, marked as CAL_RESALLOC_GLOBAL_BUFFER?

Do you have any hints as to what could be the problem? Is it perhaps the Linux driver being buggy?

Thanks for any help!

1 Solution
realhet
Miniboss

Hi!

Try the Catalyst 12.01 driver for the 7970! I've also run into problems with the latest driver when using CAL on a 7970; 12.01 works fine.

On Evergreen I used:

- pixel_shader

- global buffer

- pinned 2D memory

- vWincoord (fastest thread indexing)

On the 7970 that setup just produced access violations, so I changed a few things:

- compute shader

- uav

- whatever (pinned/local/remote) 1D linear memory, allocated with the global_buffer flag

- vAbsTIdFlat (the fastest one, I guess, then you can split it with fast mul24)
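One way to read "split it with fast mul24" is to turn the flat thread ID into 2D (x, y) coordinates, using a 24-bit multiply to get the remainder. The Python sketch below only illustrates that arithmetic on the CPU. The names `mul24` and `split_flat_id` and the `width` parameter are made up for the example, and it assumes non-negative values that fit in 24 bits.

```python
def mul24(a, b):
    """Emulate a 24-bit multiply: only the low 24 bits of each operand are used."""
    return ((a & 0xFFFFFF) * (b & 0xFFFFFF)) & 0xFFFFFFFF


def split_flat_id(flat_id, width):
    """Split a flat thread ID into (x, y) for a row width of `width` threads."""
    y = flat_id // width              # row index
    x = flat_id - mul24(y, width)     # column index = remainder, via a 24-bit multiply
    return x, y


if __name__ == "__main__":
    print(split_flat_id(130, 64))
```

With a row width of 64, flat ID 130 lands in row 2, column 2.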

Fortunately the '7970 method' works perfectly on Evergreen too (with the latest Catalyst, 11.10 if I recall).

Here's a test kernel, hope it helps.

il_cs_2_0
dcl_num_thread_per_group 64,1,1
dcl_cb cb0[2]
dcl_raw_uav_id(0)
mov r0.x,vAbsTIdFlat
dcl_literal l1,1,2,0,0
ishl r0.y,r0.x,l1.y
imul24 r0.x,r0.x,cb0[1].x
iadd r0.x,r0.x,l1.x
uav_raw_store_id(0) mem.x, r0.y, r0.x
endmain
end
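To check what the kernel should leave in the UAV, here is a small CPU reference in Python. It is only a sketch of the arithmetic, not anything from CAL: `run_test_kernel` and `multiplier` (standing in for cb0[1].x) are names invented for the example, and it assumes non-negative values that fit in 24 bits.

```python
import struct


def run_test_kernel(num_threads, multiplier):
    """CPU reference for the IL test kernel: one 32-bit store per thread."""
    uav = bytearray(4 * num_threads)      # raw UAV, 4 bytes per thread
    for tid in range(num_threads):        # tid stands in for vAbsTIdFlat
        byte_offset = tid << 2            # ishl r0.y, r0.x, l1.y   (l1.y = 2)
        # imul24 r0.x, r0.x, cb0[1].x     (low 24 bits of each operand)
        product = (tid & 0xFFFFFF) * (multiplier & 0xFFFFFF)
        value = (product + 1) & 0xFFFFFFFF  # iadd r0.x, r0.x, l1.x  (l1.x = 1)
        # uav_raw_store_id(0) mem.x, r0.y, r0.x
        struct.pack_into("<I", uav, byte_offset, value)
    return list(struct.unpack("<%dI" % num_threads, uav))


if __name__ == "__main__":
    print(run_test_kernel(4, 3))
```

So with cb0[1].x = 3, the first four dwords of the UAV should read 1, 4, 7, 10 after the run.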

Resource allocation:

cb0: pinned, CAL_FORMAT_UNORM_INT32_4, CAL_RESALLOC_GLOBAL_BUFFER

uav0: pinned, CAL_FORMAT_UNORM_INT32_1, CAL_RESALLOC_GLOBAL_BUFFER

*Note that the UAV format must be 1-component. If you specify 4 components, Evergreen will allocate only 1/4 of the memory for it (either that's a bug or I'm misunderstanding something).

Finally, run the program with calCtxRunProgramGrid(), using a domain that is 64 (the wavefront size) wide!
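As a rough sketch of the sizing arithmetic (plain Python, not CAL calls): assuming the launch is described as a block size plus a count of blocks, the way CAL's CALprogramGrid has gridBlock and gridSize, a 1D launch over N elements needs N rounded up to whole groups of 64. The helper names are invented for the example. Since the test kernel has no bounds check, the sketch also computes how many elements the UAV needs so that threads in the last, partly used group still write inside the buffer.

```python
WAVEFRONT_SIZE = 64  # matches dcl_num_thread_per_group 64,1,1


def grid_for(num_elements, group_size=WAVEFRONT_SIZE):
    """Return (grid_block, grid_size) as 3-tuples for a 1D launch."""
    if num_elements <= 0:
        raise ValueError("num_elements must be positive")
    num_groups = (num_elements + group_size - 1) // group_size  # round up
    return (group_size, 1, 1), (num_groups, 1, 1)


def padded_elements(num_elements, group_size=WAVEFRONT_SIZE):
    """Element count rounded up to a whole number of thread groups."""
    _, (num_groups, _, _) = grid_for(num_elements, group_size)
    return num_groups * group_size


if __name__ == "__main__":
    print(grid_for(100), padded_elements(100))
```

For 100 elements this gives a 64-wide block, 2 groups, and a UAV padded to 128 elements.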

Hopefully no more black magic will be needed.

2 Replies

Hi realhet,

Thanks a lot for your help! I just got your test kernel working. The problem was that I was calling calCtxRunProgram(), which for some reason didn't work, instead of calCtxRunProgramGrid(). Things would be so much easier if the driver simply told me what's wrong. Now I have to tweak the domain parameters. It seems a number of changes have been made to the hardware since the 5870.

Cheers,

Ingo

