I have been working with the latest SDK and see many examples like the following:
matrixGPU_double = static_cast<double*>(_aligned_malloc(SIZE, 4096));
My question is: what is the reason for using 4096? If I were using floats, would I halve that value? Also, does this (or should this) change based on the GPU and its optimal working set?
If it is not important, would it not be easier to just use malloc?
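For context, here is a minimal sketch of what the call in question does. The second argument to _aligned_malloc is an alignment in bytes (it must be a power of two), and it is separate from both the allocation size and the element type. The helper names alloc_aligned, free_aligned and is_aligned are hypothetical and not part of any SDK; the non-MSVC branch uses posix_memalign only as an assumed stand-in so the sketch also builds outside Visual C++.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#ifdef _MSC_VER
#include <malloc.h>  // _aligned_malloc, _aligned_free
#endif

// Hypothetical helper: allocate `bytes` bytes whose start address is a
// multiple of `alignment` (in bytes, must be a power of two).
inline void* alloc_aligned(std::size_t bytes, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
#ifdef _MSC_VER
    return _aligned_malloc(bytes, alignment);
#else
    // Assumed stand-in for non-MSVC toolchains.
    void* p = nullptr;
    return posix_memalign(&p, alignment, bytes) == 0 ? p : nullptr;
#endif
}

// Hypothetical helper: release memory obtained from alloc_aligned.
inline void free_aligned(void* p) {
#ifdef _MSC_VER
    _aligned_free(p);  // plain free() is not valid for _aligned_malloc memory
#else
    std::free(p);
#endif
}

// Hypothetical helper: true if the address is a multiple of `alignment`.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

With these helpers, alloc_aligned(n * sizeof(double), 4096) and alloc_aligned(n * sizeof(float), 4096) both return addresses that are multiples of 4096; only the size argument changes with the element type. Whether 4096 is the best choice for a particular GPU is the open part of the question and is not something this sketch settles.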