Archives Discussions

laobrasuca · ‎08-13-2012

hi all,

why is the work group size specified inside the shader code (local_size_x)?! Especially given that in OpenCL it is specified at application run time (when calling clEnqueueNDRangeKernel()). In OpenGL, only the number of work groups can be specified at run time (with glDispatchCompute()). We lose flexibility by fixing it at compute shader compile time. Is any change expected on this matter?

another question: is there any restriction on the values specified by local_size_x (layout qualifier) and num_groups_x (input of glDispatchCompute())? For example, in OpenCL global_work_size must be evenly divisible by local_work_size.

cheers,

lao

laobrasuca · ‎08-13-2012

or, is it legal to do the following:

uniform int local_size_x_from_app;

layout (local_size_x = local_size_x_from_app) in;

?

gsellers · ‎08-27-2012

Hi Lao,

The reason that the local workgroup size must be included in the shader source code is that there is a compile-time dependency on it. In OpenCL, the driver (or runtime) defers compilation of the kernel until the local workgroup size is known (i.e., the first time the kernel is dispatched), and it must recompile the kernel if it is launched multiple times with different local dimensions. Because one of the design goals of OpenGL compute shaders is to be simple and lightweight, we decided to make this a compile-time requirement rather than something that drivers have to track at run time.

Also, because the dimensions must be compile-time constants, it is not legal to set them using a uniform. You can, however, use a constant integral expression, which may include one or more preprocessor definitions. If you need to launch the same compute shader with multiple local workgroup sizes, you can update the preprocessor definition and compile another variant of the shader. This is what the driver would have to do anyway if we allowed this feature.
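
As a rough illustration of the preprocessor approach, here is a minimal C sketch that builds one source variant per local size. The macro name LOCAL_SIZE_X, the shader body, and the helpers build_compute_source() and group_count() are all made up for this example and are not part of any API. It also shows the dispatch side: since glDispatchCompute() takes a number of work groups rather than a global size, the total invocation count is always a multiple of the local size, so the usual pattern is to round the group count up and guard the tail inside the shader. Apart from that, the sizes are bounded by the implementation limits (GL_MAX_COMPUTE_WORK_GROUP_SIZE, GL_MAX_COMPUTE_WORK_GROUP_INVOCATIONS, GL_MAX_COMPUTE_WORK_GROUP_COUNT).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical shader body. LOCAL_SIZE_X is a macro name chosen for this
   sketch; the application defines it before compiling each variant. */
static const char *kComputeBody =
    "layout (local_size_x = LOCAL_SIZE_X) in;\n"
    "layout (std430, binding = 0) buffer Data { float values[]; };\n"
    "uniform uint element_count;\n"
    "void main() {\n"
    "    uint i = gl_GlobalInvocationID.x;\n"
    "    if (i >= element_count) return; /* skip the padded tail */\n"
    "    values[i] *= 2.0;\n"
    "}\n";

/* Writes "#version 430", a #define for the requested local size, and the
   shader body into out. Returns 0 on success, -1 if local_size_x is 0 or
   the buffer is too small. */
static int build_compute_source(char *out, size_t out_size,
                                unsigned local_size_x)
{
    if (local_size_x == 0)
        return -1;
    int n = snprintf(out, out_size,
                     "#version 430\n#define LOCAL_SIZE_X %u\n%s",
                     local_size_x, kComputeBody);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}

/* Number of work groups needed to cover element_count items, rounded up.
   The last group may be partially idle, hence the guard in the shader. */
static unsigned group_count(unsigned element_count, unsigned local_size_x)
{
    assert(local_size_x > 0);
    return element_count / local_size_x +
           (element_count % local_size_x != 0 ? 1u : 0u);
}
```

With a GL 4.3 context, the string produced by build_compute_source() would be handed to glShaderSource() and glCompileShader() as usual, once per local size you need, and the matching program would be dispatched with something like glDispatchCompute(group_count(n, 64), 1, 1).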

Cheers,

Graham

laobrasuca · ‎08-27-2012

hi Graham,

great pleasure to see you around here, and thank you very much for taking the time to answer me. I was like: "wow, is this THE Graham Sellers that I see in all the OpenGL extensions?"

thank you for all those clarifications. I was indeed wondering how much simplifying the process weighed on this decision, but I really didn't know about the runtime compilation of OpenCL kernels. It makes things really clear for me now.

thank you again,

cheers,

leo

[compute shader] why is the work group size specified inside the shader code (local_size_x)?!