Suggestions for OpenCL 2

Discussion created by bubu on Sep 4, 2010
Latest reply on Oct 4, 2010 by tak0xff

Please add the following:

 

1. Local and global atomics for float variables. Many sorting algorithms, order-independent transparency and similar techniques could use them. I have heard that Crytek has been asking for this as well.

 

2. Enable the "register" C keyword as a hint, so we can better control which variables may be spilled to memory and which should stay in registers. This would help the compiler reduce register pressure.

 

3. Add C++ support.

 

4. Add a "virtual memory" mechanism, with flags on OpenCL buffers to indicate that their contents may be swapped out to the hard disk, the way CPU virtual memory is. This is needed to manage large data assets that don't fit in the GPU's (usually small) video memory. DX9, for instance, had a "managed memory pool" mechanism.

 

5. Add a kernel execution priority parameter. With it, we could run kernels without disturbing the OS's window manager, and indicate which kernels are more important when several execute concurrently.

 

6. Add reduction macros or functions for +, -, *, /, min, max, etc., and a quicksort (or radix sort) intrinsic:

Example:

 

 

__kernel void MyKernel ( __global float *values )
{
    const float sumOfAllValues = CL_REDUCTION_SUM ( values, 0, 256 ); // ptr, offset, nElements
    const float minOfAllValues = CL_REDUCTION_MIN ( values, 0, 256 );
    const float maxOfAllValues = CL_REDUCTION_MAX ( values, 0, 256 );
    qsort ( values, 0, sizeof(float), 256 ); // offset, sizeof each element, nElements
    ...
}
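To make the reduction request in suggestion 6 concrete, here is a minimal sketch of the pairwise tree reduction that such a built-in would presumably replace. It is plain C, not OpenCL C, so it can be compiled and checked on the host. The names `tree_reduce`, `reduce_op`, `op_sum`, `op_min` and `op_max` are made up for illustration and are not part of any OpenCL API. Each pass of the outer loop corresponds to one barrier-separated step in a work-group, and the inner loop stands in for the work-items that would run in parallel.

```c
#include <stddef.h>

/* Hypothetical binary operators for the reduction. */
typedef float (*reduce_op)(float, float);

float op_sum(float a, float b) { return a + b; }
float op_min(float a, float b) { return a < b ? a : b; }
float op_max(float a, float b) { return a > b ? a : b; }

/* Hypothetical helper: reduces scratch[0..n-1] in place with `op`
 * and returns the result, which ends up in scratch[0].
 * Assumption: `op` is associative and commutative (true for min/max,
 * approximately true for float addition).
 * Convention chosen here: n == 0 returns 0.0f. */
float tree_reduce(float *scratch, size_t n, reduce_op op)
{
    if (n == 0) {
        return 0.0f;
    }
    /* The stride doubles each pass; on a GPU every pass would be
     * followed by a barrier. Works for any n, not only powers of two. */
    for (size_t stride = 1; stride < n; stride *= 2) {
        /* On a GPU, each iteration of this loop is one work-item. */
        for (size_t i = 0; i + stride < n; i += 2 * stride) {
            scratch[i] = op(scratch[i], scratch[i + stride]);
        }
    }
    return scratch[0];
}
```

Calling `tree_reduce(buf, 256, op_sum)` on a copy of the data plays the role of `CL_REDUCTION_SUM ( values, 0, 256 )` in the example above. Note that it overwrites the buffer, which is why a scratch copy is assumed.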

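As for suggestion 1, the usual workaround until float atomics exist is a compare-and-swap loop on the float's bit pattern. Below is a minimal sketch of that loop in plain C11 (again, not OpenCL C), so it can be tested on the host. The helper names `atomic_add_float`, `float_to_bits` and `bits_to_float` are hypothetical. In an OpenCL 1.1 kernel the same loop could be written with `atomic_cmpxchg` on the value viewed as an unsigned int, but that variant is not shown or tested here.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its 32-bit pattern and back.
 * Assumption: float is 32 bits wide on the target. */
static inline uint32_t float_to_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

static inline float bits_to_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Hypothetical helper: atomically adds `val` to the float whose bits
 * are stored in *addr, and returns the value that was there before. */
float atomic_add_float(_Atomic uint32_t *addr, float val)
{
    uint32_t old_bits = atomic_load(addr);
    uint32_t new_bits;
    do {
        new_bits = float_to_bits(bits_to_float(old_bits) + val);
        /* If another thread changed *addr in the meantime, the exchange
         * fails, old_bits is refreshed with the current contents, and
         * the sum is recomputed. */
    } while (!atomic_compare_exchange_weak(addr, &old_bits, new_bits));
    return bits_to_float(old_bits);
}
```

The same pattern gives an atomic float min or max by replacing the addition. It works, but under heavy contention the loop retries a lot, which is one reason a native float atomic would be welcome.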