I'm currently running a Brook+ program on Vista with Catalyst installed.
The user guide says to use -r to disable address virtualization. Does that mean address virtualization is enabled without -r? And with the current Catalyst 9.5, are 1D arrays larger than 8192 elements supported?
If so, how does the performance of a large 1D array compare with storing the same data in a 2D array and managing the indexing manually?
With -r, address translation (AT) is disabled, i.e. you can't use 3D streams or 1D streams of size > 8192.
I think that with Catalyst 9.5, AT (only for 1D streams > 8192) is not working because of a regression. But you can go back to a previous version to make it work.
AT has some performance overhead in the form of extra ALU operations in the kernel (calculations to convert a 1D stream address to a 2D buffer address and vice versa), but these calculations usually should not be a real cost, as kernels are almost never ALU-limited.
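To give a rough idea of what that extra arithmetic looks like, here is a small Python sketch of the index math. It is not the code Brook+ actually generates; the row-major layout, the row width of 8192, and the names `to_2d` and `to_1d` are assumptions made for illustration.

```python
WIDTH = 8192  # assumed row width of the backing 2D buffer


def to_2d(index, width=WIDTH):
    """Map a 1D stream index to (row, col) in a row-major 2D buffer."""
    # One integer division and one modulo per access.
    return divmod(index, width)


def to_1d(row, col, width=WIDTH):
    """Map a (row, col) position in the 2D buffer back to the 1D index."""
    # One multiply and one add per access.
    return row * width + col
```

So the cost per access is a handful of integer operations, which is why it tends to disappear behind memory fetches.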
Of course, you can avoid any unnecessary calculations if you control the layout manually, and you can then compare the performance of Brook+ AT against your manual code.
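If you go the manual route, the first step is picking the 2D shape for your 1D data. A minimal sketch in Python, assuming the 8192 limit applies per dimension and that you pad the last row; `layout_2d` is a hypothetical helper, not part of Brook+:

```python
def layout_2d(n, max_width=8192):
    """Return (rows, width, padding) for storing n elements in a 2D stream."""
    if n <= 0:
        raise ValueError("n must be positive")
    width = min(n, max_width)
    rows = -(-n // width)        # ceiling division
    padding = rows * width - n   # unused elements at the end of the last row
    return rows, width, padding
```

For example, 20000 elements would need 3 rows of 8192, with the tail of the last row left unused. Your kernel then has to skip the padded elements itself.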