
constant memory issue

Question asked by roger on Apr 10, 2013
Latest reply on Apr 12, 2013 by himanshu.gautam

Hi everyone,


I already posted this on the Khronos forums, since I'm currently working on an NVIDIA card (I'll soon get a Radeon HD 7950), but I need to resolve this and NVIDIA removed their OpenCL forums.


Here is my problem:


My code works fine if an int buffer (2 elements) is in global memory (__global), but when I switch it to constant memory (__constant) the results become completely wrong.


The weird part is that I tried not using this buffer in the kernel at all and the results are still wrong with __constant, while with __global everything works fine.


I don't really get why this address space qualifier breaks my code even if I don't use the buffer.
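
As an illustration of the swap described above, here is a reduced sketch of the two variants. It is not the actual kernel: the names scale_global, scale_constant, params, in, out and fake_gid are hypothetical, and the shim at the top is only an assumption that lets the same file build as plain C for a quick host-side check.

```c
/* Host-side shim (assumption, not part of the real kernel): lets this file
   also compile as plain C. An OpenCL compiler defines __OPENCL_VERSION__
   and skips this part entirely. */
#ifndef __OPENCL_VERSION__
#include <stddef.h>
#define __kernel
#define __global
#define __constant const
static size_t fake_gid = 0;   /* hypothetical stand-in for the work-item id */
static size_t get_global_id(unsigned dim) { (void)dim; return fake_gid; }
#endif

/* Variant A: the 2-element int buffer lives in global memory. */
__kernel void scale_global(__global const int *params,
                           __global const int *in,
                           __global int *out)
{
    size_t i = get_global_id(0);
    out[i] = in[i] * params[0] + params[1];
}

/* Variant B: same body, only the address space of params changes. */
__kernel void scale_constant(__constant int *params,
                             __global const int *in,
                             __global int *out)
{
    size_t i = get_global_id(0);
    out[i] = in[i] * params[0] + params[1];
}
```

Only the qualifier on params differs between the two variants, so I would expect both to write the same values to out.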


Here is the kernel signature from the generated PTX code:


.entry func(
   .param .align 4 .b8 func_param_0[52],
   .param .u32 func_param_1,
   .param .u32 func_param_2,
   .param .u32 .ptr .global .align 4 func_param_3,
   .param .u32 .ptr .global .align 64 func_param_4,
   .param .u32 .ptr .global .align 32 func_param_5,
   .param .u32 .ptr .global .align 16 func_param_6,
   .param .u32 .ptr .global .align 4 func_param_7,
   .param .u32 .ptr .const .align 4 func_param_8,
   .param .u32 .ptr .global .align 16 func_param_9,
   .param .u32 .ptr .global .align 16 func_param_10,
   .param .u32 .ptr .global .align 16 func_param_11,
   .param .u32 .ptr .global .align 4 func_param_12,
   .param .u32 .ptr .global .align 4 func_param_13,
   .param .u32 .ptr .global .align 4 func_param_14,
   .param .u32 .ptr .global .align 1 func_param_15,
   .param .u32 .ptr .global .align 4 func_param_16,
   .param .u32 .ptr .global .align 4 func_param_17,
   .param .u32 .ptr .global .align 16 func_param_18,
   .param .u32 .ptr .global .align 4 func_param_19,
   .param .u32 .ptr .global .align 16 func_param_20,
   .param .u32 .ptr .global .align 4 func_param_21,
   .param .u32 .ptr .global .align 16 func_param_22,
   .param .u32 .ptr .global .align 16 func_param_23



func_param_8 is the .const parameter, and it completely breaks the code even when it is not used.


Does anyone have an idea? I am using a GeForce GTX 560 Ti.