
drallan
Challenger

Can AMD OpenCL support 6GB of device memory?

Does anyone know if AMD OpenCL can support 6GB of physical device memory? Specifically, on a single-GPU Tahiti 7970 card with 6GB of GDDR5 memory.

One concern is that OpenCL uses 32-bit pointers. There is a 48-bit base address in the 128-bit memory descriptors, but is it used? And how does it work with 32-bit pointers?

The other concern is that performance in the upper memory might be degraded by page swapping or some other system mechanism.

(Yes, I need more memory !)

Many thanks for any input.

german
Staff

The AMD OpenCL compiler and runtime require a single linear address space for buffer allocations. So to support more than 4GB of buffer allocations, the kernel binary (ISA) needs 64-bit memory access and 64-bit arithmetic for address calculation. Setting the environment variable GPU_FORCE_64BIT_PTR ("set GPU_FORCE_64BIT_PTR=1") forces the compiler and runtime to generate ISA that supports 64-bit address calculation and memory access, which in turn allows the OpenCL runtime to report all 6GB of the card's local (on-board) memory. The main performance cost comes from the 64-bit address calculation, not from the memory access itself.

The 48-bit address in the memory descriptor is mainly used for images. Images are accessed per texel by coordinates and don't really require allocation in a single linear address space. When the OpenCL runtime reports the supported memory size, it doesn't know whether the application is going to allocate 6GB of buffers or of images.

sh2
Adept II

I don't know for sure, but I think the environment variable "set GPU_FORCE_64BIT_PTR=1" could help you. CL_DEVICE_ADDRESS_BITS equals 64 if this variable is set.

Two correct answers so fast! I see Jive was not designed for that.

Many thanks to both.

Hi German. Thank you for the explanation, but it's still not quite clear what exactly the variable does, especially when kernels are compiled offline. Does this variable affect compilation, runtime, or both? What happens if the variable was set at compile time but not at run time (or the other way around)? And what happens if the host application is 32-bit (so sizeof(size_t) on the CPU is 4)?
