Originally posted by: MicahVillmow The 8192x8192 limitation is hardware and should be different with future hardware.
Could it be possible to emulate it quickly and automatically in the driver? I really need to see all of the graphics card's VRAM as a linear 1D pointer on the current hardware generation... seriously.
Once this is done, I would like to see a FireStream with 4 GB, like the NVIDIA Tesla C1060. The mesh data I need to manage can occupy more than 2 GB... that, or use a virtual VRAM system over AGP/PCI like the CPU does.... I think Vista can virtualise the VRAM.
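On the linear 1D view: a minimal sketch of the index arithmetic such an emulation would need, assuming a row-major layout and the 8192-element per-dimension limit quoted above. The names `kMaxDim`, `Coord2D`, `toCoord`, `toLinear`, and `fitsInOneResource` are hypothetical and are not part of Brook+ or CAL; this only illustrates the mapping, not how a driver would actually implement it.

```cpp
#include <cstdint>

// Hypothetical constant: the 8192-element per-dimension limit quoted above.
constexpr std::uint64_t kMaxDim = 8192;

// Largest number of elements a single 2D resource can hold under that limit.
constexpr std::uint64_t kMaxElements = kMaxDim * kMaxDim;

struct Coord2D {
    std::uint32_t x;  // column within the row
    std::uint32_t y;  // row
};

// Map a linear element index to a row-major (x, y) position.
inline Coord2D toCoord(std::uint64_t i) {
    return Coord2D{static_cast<std::uint32_t>(i % kMaxDim),
                   static_cast<std::uint32_t>(i / kMaxDim)};
}

// Inverse mapping: (x, y) back to the linear element index.
inline std::uint64_t toLinear(Coord2D c) {
    return static_cast<std::uint64_t>(c.y) * kMaxDim + c.x;
}

// True if the linear index still fits inside one 8192x8192 resource.
inline bool fitsInOneResource(std::uint64_t i) {
    return i < kMaxElements;
}
```

With 4-byte floats, 8192 x 8192 elements come to 256 MB, and with 16-byte float4 elements to 1 GB, so a data set above 2 GB would have to be split across several such resources under this scheme.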
Originally posted by: MicahVillmow bubu, it is possible to get access to all the VRAM inside a kernel, but this requires using CAL, IL, and the global buffer (this gives you access to 32 bits of address space).
But... can I use CAL to allocate the buffer and then pass the pointer to Brook+? I don't like the idea of programming kernels in a pseudo-assembly language. For example, this is what I want:
byte* dataIn = (byte*)calAllocateBuffer(2*1024*1024);
byte* dataOut = (byte*)calAllocateBuffer(1024);
void myBrookPlusKernel ( byte *ptr )
{
    const byte dIn = ptr[threadId];
    dataOut[threadId] = dIn+10;
}
It is possible to write a kernel in Brook+, convert it to IL using the Brook+ compiler (brcc), and use the generated IL in CAL. Or better yet, put the Brook+ kernel into ShaderAnalyzer and get the IL from there.
It is not as straightforward as what you asked, but it is a good middle ground where you get all CAL functionality and the ease of Brook+ kernel programming.
Originally posted by: udeepta@amd It is possible to write a kernel in Brook+, convert it to IL using the Brook+ compiler (brcc), and use the generated IL in CAL. Or better yet, put the Brook+ kernel into ShaderAnalyzer and get the IL from there.
Ok... so imagine I need to access 256 MB of VRAM on a 512 MB Radeon.
For example, let's fill that data with a value of 0.5f (I know, a dumb kernel).
I should do this in Brook+ (just pseudocode):
void kernel ( float *buff )
{
    buff[(blockIdx.x*blockDim.x)+threadId.x] = 0.5f;
}
Then I compile it using ShaderAnalyzer or whatever... so the AMD IL assembly is generated.
Then, in my C++ program I do:
//Allocate 256 MB using CAL
void *ptr = calMalloc(256*1024*1024);
//Load the compiled kernel and pass the VRAM pointer as IL input.
//Backslashes must be escaped in a C++ string literal.
handle = calLoadKernel("c:\\test\\myKernel.il");
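A hedged sketch of the host-side preparation for the steps above: the generated IL has to be read into memory as text before it can be handed to whatever loads it, and the buffer size has to be expressed as an element count. `readTextFile`, `floatCount`, and `kBufferBytes` are hypothetical helpers for illustration only; they are not CAL functions, and the actual CAL calls for compiling and binding the buffer are deliberately left out.

```cpp
#include <cstddef>
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: read a whole text file (e.g. generated IL) into a string.
inline std::string readTextFile(const std::string& path) {
    std::ifstream in(path, std::ios::in | std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }
    std::ostringstream contents;
    contents << in.rdbuf();  // copy the entire stream buffer
    return contents.str();
}

// Hypothetical helper: number of floats that fit in a buffer of the given size.
constexpr std::size_t floatCount(std::size_t bytes) {
    return bytes / sizeof(float);
}

// 256 MB, the buffer size used in the post.
constexpr std::size_t kBufferBytes = 256u * 1024u * 1024u;
```

Usage would look like `std::string il = readTextFile("c:\\test\\myKernel.il");`. With 4-byte floats, `floatCount(kBufferBytes)` comes to 67,108,864 elements, which is exactly 8192 x 8192, so this particular buffer sits right at the 2D size limit discussed earlier in the thread.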
Your point is a smart one.
Writing the kernel in Brook+ to get an assembly program, and then using CAL for overall memory management, seems to me the best compromise.
What is needed from AMD is a short application note (less than one page) giving guidance on how to proceed.
Could a support guy from AMD take this as an action?
I'm sure this would be a great (and easy) help in overcoming some of the current Brook+ limitations.