

Constant arrays of structs are corrupted.

I'm writing a path tracer. I have the following data (see attached file for the context):

__constant Scene scene = {
        .planes = {
                {&mat_sky, {1000, 0, 0}, {-1, 0, 0}},
                {&mat_sky, {-1000, 0, 0}, {1, 0, 0}},
                {&mat_sky, {0, 1000, 0}, {0, -1, 0}},
                {&mat_sky, {0, -1000, 0}, {0, 1, 0}},
                {&mat_sky, {0, 0, 1000}, {0, 0, -1}},
                {&mat_sky, {0, 0, -1000}, {0, 0, 1}},
        },
        .spheres = {
                {&mat_sun, {2, 1, 3}, .5f},
                {&mat_plastic, {5, 0, -3}, 2},
                {&mat_brushed_metal, {5, 0, 0}, 1},
                {&mat_mirror, {5, 4, 0}, 3},
                {&mat_glass, {5, -4, 0}, 3},
        },
};


However, my code was segfaulting when trying to dereference some of the material fields. Setting a breakpoint in the kernel and inspecting the data showed that it was corrupted:

scene = {

        .planes = {

                {0x7fffed83f010 <mat_sky>, {1000, 0, 0}, {-1, 0, 0}},

                {0x0, {0, 0, 0}, {-5.10409426e+27, 4.59163468e-41, 0}},

                {0xc47a0000, {1, 0, 0}, {0, 0, 0}},

                {0x0, {-5.10409426e+27, 4.59163468e-41, 0}, {0, 1000, 0}},

                {0xbf80000000000000, {0, 0, 0}, {0, 0, 0}},

                {0x7fffed83f010 <mat_sky>, {0, -1000, 0}, {0, 1, 0}}


        .spheres = {

                {0x0, {0, 0, 0}, -5.10409426e+27},

                {0x0, {0, 0, -1}, 0},

                {0x0, {-5.10409426e+27, 4.59163468e-41, 0}, 0},

                {0x0, {0, 0, 0}, 0},

                {0x7fffed83f0b0 <mat_sun>, {2, 1, 3}, 0.5}



This corruption does not change from run to run. Using "-save-temps", I have verified that the assembler output contains the correct data. Interestingly, if I replace the kernel with a function that just prints out all the pointers in a loop, the output is correct, even though inspecting with GDB confirms that the data is still corrupt. I am on Funtoo Linux (amd64) with Catalyst 13.1 (clinfo version: OpenCL 1.2 AMD-APP (1084.4)).

EDIT: Note that this kernel must be compiled for CPU; compiling it for the GPU (at least on my 7970) fails with "Internal Error: Link failed".

Message was edited by: Kyle Blake

16 Replies

I am not too sure if you can initialize a float3 using {0, 0, 0}. Can you initialize them as "(float3)(0,0,0)" instead?


Using the (float3)(x, y, z) syntax made no difference.


Interestingly, if I replace the kernel with a function that just prints out all the pointers in a loop, the output is correct, even though inspecting with GDB confirms that the data is still corrupt.

This actually tells me that GDB is not resolving the symbols correctly. The data values are all there, intact.

Do you infer the same?


Is there a reason why you don't use CodeXL?


I didn't know it existed.

EDIT: It appears to be a replacement for gDEBugger. Both CodeXL and gDEBugger receive a segfault signal when running my kernel, and do not seem to provide useful information in that case.


I've tried adding print statements to the proper kernel now (it prints out whenever a trace hits an object), and it shows that the data is still corrupt; but not in the way GDB thinks:

plane[0] = {0x7f3d0d77a010, {1000, 0, 0}, {-1, 0, 0}}

plane[3] = {0x0, {0, 0, 0}, {0, 1000, 0}}

plane[5] = {0x7f3d0d77a010, {0, -1000, 0}, {0, 1, 0}}

sphere[3] = {0x0, {0, 0, 0}, 0}

(these are the only objects visible). plane[0] and plane[5] are correct.

Printing out the same values for all the planes in a loop before we start tracing (in addition to above) produces different (corrupted) results:

plane[0] = {0x7f3d0d77a010, {0, 0, 0}, {0, 0, 0}}

plane[1] = {0x7f3d0d77a010, {0, 0, 0}, {0, 0, 0}}

plane[2] = {0x7f3d0d77a010, {0, 0, 0}, {0, 0, 0}}

plane[3] = {0x7f3d0d77a010, {0, 0, 0}, {0, 0, 0}}

plane[4] = {0x7f3d0d77a010, {0, 0, 0}, {0, 0, 0}}

plane[5] = {0x7f3d0d77a010, {0, 0, 0}, {0, 0, 0}}

So apparently the constant data is different depending on where I reference it from even in the same run of the kernel.


One possible reason is that your code is corrupting the constant memory space. Constant memory is actually allocated in global memory. An OpenCL program cannot write to it directly, because that is caught as a compile-time error. However, if your program accidentally ends up with a pointer into this memory (through some corruption), it can actually write to it.

Try commenting out the writes in your program and then print the values. That might give you some clue.

I am not ruling out a driver bug here. I just want to make sure that we close all the loopholes before we look at the driver.


Also, you are not completely initializing some structures.

For example:

"bottom" field is not initialized in mat_sky object.

  __constant Material mat_sky = {
  .type = Single,
  .emmitance = {0, .5f, 1},
  .top = {
    .albedo = {0, .5f, 1},
    .roughness = 0,
    .isotropy = 1,
  },
  };



Commenting out writes didn't produce any change (thankfully there were not that many to check). The partial initialization of the materials is intentional -- the type field controls how much of the structure is actually used (e.g. mat_sky.bottom will never be referenced because mat_sky.type == Single). For the emittance field, I'm just relying on the implicit initialize-to-zero.



According to the OpenCL spec:


Variables in the program scope (or) the outermost scope of kernel functions can be declared in the __constant address space. These variables are required to be initialized and the values used to initialize these variables must be a compile time constant. Writing to such a variable results in a compile-time error.


I don't think any implicit initialization will be done for you.

Can you initialize them fully? I am not sure what the implementation semantics would be for a partially initialized __constant variable.

Also, if you think this is a problem, can you upload a minimal test case that I can share with the driver team? Thanks!


Looking at the generated code, it appears to zero-initialize only the fields before the last explicitly initialized one. That is not a problem for this particular struct, but it is probably not something I should be relying on. I have attached a minimal testcase.
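
For comparison, ISO C99 (which OpenCL C is based on) does guarantee that members left out of an initializer list are zero-initialized. The sketch below shows that rule in plain host C, not in OpenCL, so it says nothing about what the AMD compiler does for __constant data. The struct definitions are guesses at the attached file's types: Layer, MaterialType, the Layered enumerator and both helper functions are hypothetical names used only for illustration.

```c
/* Hypothetical reconstruction of the material types; the real
 * definitions live in the attachment and may differ. */
typedef enum { Single, Layered } MaterialType;

typedef struct {
    float albedo[3];
    float roughness;
    float isotropy;
} Layer;

typedef struct {
    MaterialType type;
    float emmitance[3];   /* spelled as in the thread's snippet */
    Layer top;
    Layer bottom;
} Material;

/* "bottom" is deliberately omitted. C99 requires omitted members of an
 * initialized aggregate to be set to zero. */
static const Material mat_sky = {
    .type = Single,
    .emmitance = {0, .5f, 1},
    .top = {
        .albedo = {0, .5f, 1},
        .roughness = 0,
        .isotropy = 1,
    },
};

/* Accessor so callers can inspect the constant. */
const Material *sky_material(void) { return &mat_sky; }

/* Returns 1 if every field of m->bottom compares equal to zero. */
int bottom_is_zeroed(const Material *m) {
    const Layer *b = &m->bottom;
    return b->albedo[0] == 0.0f && b->albedo[1] == 0.0f &&
           b->albedo[2] == 0.0f && b->roughness == 0.0f &&
           b->isotropy == 0.0f;
}
```

On a conforming host C compiler, bottom_is_zeroed(sky_material()) should return 1. Fully initializing every field, as suggested above, avoids depending on this rule on the device.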


Thanks for the testcase. The problem was reproducible with the internal driver here too. I will forward this to relevant team.


After studying the kernel some more, it looks like this is a printf issue rather than the __constant buffer getting corrupted.

Please find attached the code I tried. I am copying the constant buffer into a global buffer and then reading it back on the host side. I get the expected results in this case. Can you confirm that the attached code works for you too? I am forwarding the printf issue to the AMD engineers anyway.


This code does work for me. However, changing it to copy the structs field by field instead of all at once causes it to fail. I.e. replacing:

gPlanes = planes;

with the (theoretically equivalent) code:

gPlanes.material = planes.material;
gPlanes.point = planes.point;
gPlanes.normal = planes.normal;

causes it to fail, with the last two fields being printed as all zeros.
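
To spell out why the two forms should be equivalent, here is a minimal host-side sketch in plain C (not OpenCL). The Vec3 and Plane definitions and the three helper functions are hypothetical stand-ins, not the types from the attached testcase.

```c
/* Hypothetical stand-ins for the testcase's types. */
typedef struct { float x, y, z; } Vec3;

typedef struct {
    int material;
    Vec3 point;
    Vec3 normal;
} Plane;

/* Copy the whole struct with a single assignment. */
void copy_whole(Plane *dst, const Plane *src) {
    *dst = *src;
}

/* Copy the same struct one field at a time. */
void copy_fieldwise(Plane *dst, const Plane *src) {
    dst->material = src->material;
    dst->point = src->point;
    dst->normal = src->normal;
}

/* Compare field by field; this avoids reading padding bytes. */
int planes_equal(const Plane *a, const Plane *b) {
    return a->material == b->material &&
           a->point.x == b->point.x && a->point.y == b->point.y &&
           a->point.z == b->point.z &&
           a->normal.x == b->normal.x && a->normal.y == b->normal.y &&
           a->normal.z == b->normal.z;
}
```

In standard C both copies leave dst equal to src, so a difference between them on the device points at the compiler or runtime rather than at the source code.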


Yeah right. This was also reproducible. I am sending another testcase to the AMD engineering team. Thanks for your support.


The implementation assumes that the host and device have the same layout for this structure:

typedef struct {
  int material;
  cl_float3 point;
  cl_float3 normal;
} Plane;

But the implementation matches the structure layout of the device compiler only, not that of the host compiler. Since cl_float3 must be aligned to a 16-byte boundary on the device, the structure is 48 bytes on the device side but 36 bytes on the host side.
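
The 36- and 48-byte figures can be reproduced with a small host-side C11 sketch. It assumes 4-byte int and float, and it models the two layouts with hypothetical types rather than the real cl_float3: host_float3 is 16 bytes with only 4-byte alignment, and device_float3 is 16 bytes with 16-byte alignment. All type and function names here are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical model of a host vector type: four floats of storage,
 * but only the natural 4-byte alignment of float. */
typedef struct { float s[4]; } host_float3;

/* Hypothetical model of the device vector type: the same 16 bytes,
 * forced onto a 16-byte boundary. */
typedef struct { _Alignas(16) float s[4]; } device_float3;

typedef struct {
    int material;
    host_float3 point;
    host_float3 normal;
} PlaneHost;

typedef struct {
    int material;
    device_float3 point;    /* 12 padding bytes precede this member */
    device_float3 normal;
} PlaneDevice;

size_t host_plane_size(void)      { return sizeof(PlaneHost); }
size_t device_plane_size(void)    { return sizeof(PlaneDevice); }
size_t host_point_offset(void)    { return offsetof(PlaneHost, point); }
size_t device_point_offset(void)  { return offsetof(PlaneDevice, point); }
size_t host_normal_offset(void)   { return offsetof(PlaneHost, normal); }
size_t device_normal_offset(void) { return offsetof(PlaneDevice, normal); }
```

Under those assumptions the host layout puts point at offset 4 and normal at offset 20 (36 bytes total), while the device layout puts them at 16 and 32 (48 bytes total). One way to keep both sides in agreement is to make the padding explicit, or to order members so that the 16-byte-aligned fields come first.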