I have been working with the latest SDK and see many examples like the following:
matrixGPU_double = static_cast<double*>(_aligned_malloc(SIZE, 4096));
My question is: what is the reason for using 4096? If I were using floats, would I halve that value? Also, does this, or should this, change based on the GPU and its optimal working set?
If it is not important, would it not be easier to just use malloc?
Hi, sorry for the late reply. 4K alignment is required to minimize pinning costs. It avoids the runtime having to pin/unpin on every map/unmap transfer, but it does add to the total amount of pinned memory. Reading up on the pinned memory concept will help you understand this better. The alignment parameter can change according to your requirements. The link below gives a better understanding of _aligned_malloc and its implementation: http://jongampark.wordpress.com/2008/06/12/implementation-of-aligned-memory-alloc/
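
On the float question: the second argument of _aligned_malloc is an alignment in bytes, not in elements, and 4096 matches the usual 4 KiB page size, so it would normally stay the same whether the buffer holds floats or doubles. Only the size argument changes with the element type. Below is a minimal sketch of that idea. The helper names (alloc_page_aligned, free_page_aligned, is_page_aligned) and the constant kPageAlignment are made up for illustration and are not part of any SDK, and the 4096 value is an assumption about the page size on your platform.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdlib.h>   // posix_memalign / free on POSIX systems
#if defined(_WIN32)
#include <malloc.h>   // _aligned_malloc / _aligned_free on MSVC
#endif

// Assumed page size in bytes; hypothetical constant for this sketch.
constexpr std::size_t kPageAlignment = 4096;

// Allocate 'bytes' bytes whose start address is a multiple of kPageAlignment.
// Returns nullptr on failure.
void* alloc_page_aligned(std::size_t bytes) {
#if defined(_WIN32)
    return _aligned_malloc(bytes, kPageAlignment);
#else
    void* p = nullptr;
    if (posix_memalign(&p, kPageAlignment, bytes) != 0) {
        return nullptr;
    }
    return p;
#endif
}

// Release memory obtained from alloc_page_aligned.
void free_page_aligned(void* p) {
#if defined(_WIN32)
    _aligned_free(p);   // memory from _aligned_malloc must not go to free()
#else
    free(p);
#endif
}

// True if the address is a multiple of kPageAlignment.
bool is_page_aligned(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % kPageAlignment == 0;
}
```

With these helpers, a float buffer and a double buffer of N elements would be requested as alloc_page_aligned(N * sizeof(float)) and alloc_page_aligned(N * sizeof(double)): the byte count differs, the alignment does not. Plain malloc still works functionally, but it gives no page-alignment guarantee, so the pin/unpin saving described above may be lost.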