There is a maximum buffer allocation limit (reported as "Max memory allocation" by clinfo, i.e. CL_DEVICE_MAX_MEM_ALLOC_SIZE), and it differs from device to device. Can anyone explain why this constraint exists?
Moreover, if my maximum memory allocation limit is 200540160 bytes, I can allocate a 128 MB buffer of unsigned ints (33554432 * 4 bytes). Now, if I create a second buffer of the same size, do I have to wait until processing of the first buffer is complete before transferring it to GPU memory?
Is there a better alternative? My input array is terabytes in size, and breaking it into 128 MB chunks and processing them one after another is very slow.