OpenCL

melirius
Adept II

Bug in OpenCL runtime

OpenCL runtimes for Windows x64, at least from the 15.7.1 drivers onward, return garbage when queried by clGetKernelWorkGroupInfo with CL_KERNEL_PRIVATE_MEM_SIZE. If there are some spilled registers, it returns their size in global memory, which is presumably the intended usage of the function (at least Intel OpenCL works this way). But if there are no spills, it can return 160, 144, 320 or some other number. As a result, it is useless for kernel optimization: you never know whether the number is real or fake.

6 Replies
dipak
Big Boss

Thank you for reporting this. It would be helpful if you could provide an example with some more information, such as kernel compilation parameters, expected size, actual size, etc.

Thanks.

I saved some OpenCL binaries built with the Adrenalin 20.5.1 drivers to demonstrate this behaviour; correct reports are in bold:

| Binary | Disassembler output | clGetKernelWorkGroupInfo with CL_KERNEL_PRIVATE_MEM_SIZE output |
|---|---|---|
| 3p_Angles_Int_2d_DCdir_0-Tahiti.bin | 22312 ISA, 0 scratch, 167/256 VGPR, 71/102 SGPR | 160 - expected 0 |
| 3p_Angles_Int_2d_DCinv_0-Tahiti.bin | 19464 ISA, 0 scratch, 171/256 VGPR, 71/102 SGPR | 144 - expected 0 |
| 3p_Angles_Int_2d_GGPdir_0-Tahiti.bin | 16008 ISA, 80 scratch, 113/256 VGPR, 67/102 SGPR | **320** |
| 3p_Angles_Int_2d_EEBRdir_0-Tahiti.bin | 15248 ISA, 64 scratch, 223/256 VGPR, 63/102 SGPR | **256** |
| 3p_Angles_Int_2d_EEBRinv_0-Tahiti.bin | 26696 ISA, 48 scratch, 222/256 VGPR, 63/102 SGPR | **192** |
| 3p_Angles_Int_2d_GEPinv_0-Tahiti.bin | 16208 ISA, 76 scratch, 224/256 VGPR, 63/102 SGPR | **304** |

The OpenCL files and all other stuff are inside the binaries.
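
A quick way to sanity-check the numbers above: in the four rows with non-zero scratch, the reported value is exactly four times the disassembler's scratch figure. That would fit the scratch count being in 4-byte dwords, but this is an assumption, since the disassembler output does not state its unit. A minimal sketch in Python, with the rows copied from the table and all names (`ROWS`, `check_rows`) hypothetical:

```python
# (binary, scratch from disassembler, reported CL_KERNEL_PRIVATE_MEM_SIZE)
ROWS = [
    ("3p_Angles_Int_2d_DCdir_0-Tahiti.bin", 0, 160),
    ("3p_Angles_Int_2d_DCinv_0-Tahiti.bin", 0, 144),
    ("3p_Angles_Int_2d_GGPdir_0-Tahiti.bin", 80, 320),
    ("3p_Angles_Int_2d_EEBRdir_0-Tahiti.bin", 64, 256),
    ("3p_Angles_Int_2d_EEBRinv_0-Tahiti.bin", 48, 192),
    ("3p_Angles_Int_2d_GEPinv_0-Tahiti.bin", 76, 304),
]

# Assumption: the disassembler counts scratch in 4-byte dwords.
BYTES_PER_DWORD = 4


def check_rows(rows):
    """Return (name, expected_bytes, reported_bytes, matches) for each row."""
    results = []
    for name, scratch, reported in rows:
        expected = scratch * BYTES_PER_DWORD
        results.append((name, expected, reported, expected == reported))
    return results


for name, expected, reported, matches in check_rows(ROWS):
    status = "ok" if matches else "MISMATCH"
    print(f"{name}: expected {expected}, reported {reported} -> {status}")
```

Under that assumption, only the two kernels with 0 scratch come out as mismatches.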

Thank you for providing the above information.

As per the spec, clGetKernelWorkGroupInfo with CL_KERNEL_PRIVATE_MEM_SIZE returns "the minimum amount of private memory, in bytes, used by each work-item in the kernel. This value may include any private memory needed by an implementation to execute the kernel, including that used by the language built-ins and variable declared inside the kernel with the __private qualifier."

I guess the reported values also include other private memory usage, not just the "spilled registers". Anyway, I'll check with the OpenCL team whether this is the expected behavior and let you know.

Thanks.

Thanks for the reply. If we take the point of view of this text, then all these values are wrong, as they are definitely lower than the size of the VGPRs (and private variables) used by every work-item of the kernels. Awaiting your answer about the intended behaviour.
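
To put rough numbers on that: on GCN each VGPR holds one 32-bit value per work-item, so the VGPR counts from the disassembler output can be converted to bytes per work-item and compared with the reported values. A small sketch, where the names (`KERNELS`, `vgpr_bytes`) are hypothetical and the conversion at 4 bytes per VGPR is my assumption about how to read the counts:

```python
# Assumption: one VGPR = one 32-bit (4-byte) value per work-item.
BYTES_PER_VGPR = 4

# kernel: (VGPRs used, reported CL_KERNEL_PRIVATE_MEM_SIZE), from the posted table
KERNELS = {
    "DCdir": (167, 160),
    "DCinv": (171, 144),
    "GGPdir": (113, 320),
    "EEBRdir": (223, 256),
    "EEBRinv": (222, 192),
    "GEPinv": (224, 304),
}


def vgpr_bytes(vgpr_count):
    """Bytes of VGPR storage per work-item for a given VGPR count."""
    return vgpr_count * BYTES_PER_VGPR


for name, (vgprs, reported) in KERNELS.items():
    print(f"{name}: {vgpr_bytes(vgprs)} bytes in VGPRs vs {reported} reported")
```

In every row the VGPR footprint alone exceeds the reported value, so none of the figures can be the total private memory per work-item in the sense of the quoted spec text.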

I think that the examples in bold are right because of this answer by Ben_A_Intel: "The value we return for CL_KERNEL_PRIVATE_MEM_SIZE is additional private memory that we need per work item, above and beyond what we can store in the register file." I was happy at first to find the same behaviour in the AMD runtime, but then I found that when there are no spilled registers, it returns something unexpected.

Thank you for sharing the above information.

From the OpenCL team's feedback, it seems that your understanding/expectation about the usage is correct. In this case, they suspect that it could be broken for SI (GCN1) cards like Tahiti, since we don’t do any development for SI, which uses the AMDIL path, and we dropped support for AMDIL long ago. Could you please check whether it works with a CI family (GCN2) or newer card?

Thanks.
