I don't know for sure, but I think the environment variable GPU_FORCE_64BIT_PTR ("set GPU_FORCE_64BIT_PTR=1") could help you. CL_DEVICE_ADDRESS_BITS equals 64 when this variable is set.
AMD's OpenCL compiler and runtime implementations require a single linear address space for buffer allocations. So the kernel's binary (ISA) needs 64-bit memory access and 64-bit arithmetic for address calculation in order
to support more than 4GB of buffer allocations. Setting the environment variable with "set GPU_FORCE_64BIT_PTR=1" makes the compiler/runtime generate ISA that supports 64-bit address calculation and memory access. That in turn
allows the OpenCL runtime to report all 6GB of the board's local memory. The main performance cost comes from the 64-bit address calculation, not from the memory access itself.
The 48-bit address in the memory descriptor is mainly used for images. Images are accessed per texel with
coordinates and don't really require allocations in a single linear address space. When the OpenCL runtime
reports the supported memory size, it doesn't know whether the application is going to allocate 6GB of buffers or of images.
Two correct answers so fast! I see Jive was not designed for that.
Many thanks to both.
Hi German. Thank you for the explanation, but it's still not quite clear what exactly the variable does, especially when kernels are compiled offline. Does this variable affect compilation, runtime, or both? What happens if the variable was set at compile time but is not set at run time (or the other way around)? What happens if the host application is a 32-bit one (so sizeof(size_t) on the CPU is 4)?