18 Replies Latest reply on Jan 18, 2011 1:46 PM by bubu

    const floats in kernel not initialized

    XmainframeX

      Hello everybody,

      I just discovered that one of my kernels which ran fine with SDK 2.2 doesn't have its global const floats initialized with SDK 2.3:

      I'm debugging the CPU version of my kernel and the debugger shows me that the value of my global "const float centerFactor = 0.75f * 0.75f;" variable declared inside the .cl-File evaluates to 0.0f. The results of the kernels calculations support this observation.

      When i'm running the same kernel with SDK 2.2 the debugger shows that centerFactor evaluates to 0.5625 correctly and the calculations of the kernel are correct.

       

      Has anybody observerd similar behaviour or can reproduce this?

        • const floats in kernel not initialized
          rick.weber

          If it's global and not scoped to a particular kernel, you should declare it as __constant float centerFactor.

            • const floats in kernel not initialized
              XmainframeX

              Tried that - the problem persists.

              Just to make it clear I attached some skeletons of what I had before and what I changed:

              // Before: const float centerFactor = 0.75f * 0.75f; __kernel void ScaleUp2(__global float* dst, __global float* src, int srcWidth, int srcHeight, int dstWidth) { ... // centerFactor evaluates to 0.0f here ... } // Now: __constant float centerFactor = 0.75f * 0.75f; __kernel void ScaleUp2(__global float* dst, __global float* src, int srcWidth, int srcHeight, int dstWidth) { ... // centerFactor evaluates to 0.0f here ... }

              • const floats in kernel not initialized
                XmainframeX

                Tried something else: In the following code the debugger shows the three float variables switching from some random numbers to 0.0f when the numbers should be assigned.

                In some other kernels constants work like they should - same for SDK 2.2 - weird...

                __kernel void ScaleUp2(__global float* dst, __global float* src, int srcWidth, int srcHeight, int dstWidth) { float centerFactor = 0.5625f; float manhattenFactor = 0.1875f; float diagonalFactor = 0.0625f; // All three floats evaluate to 0.0f here... }

                  • const floats in kernel not initialized
                    himanshu.gautam

                    xmainframex,

                    I tried to reproduce it on my end but failed.

                    My system config is :
                    SDK 2.3
                    Driver 10.12
                    CPU AMD Athlon II x4 630 processor 2.81GHz
                    GPU HD 5770 (Juniper)
                    Os Windows Vista Ultimate 64 bit

                    I suggest you to send a precise testcase a streamdeveloper@amd.com.

                    Please mention your System configuration also.

                      • const floats in kernel not initialized
                        XmainframeX

                        Unfortunately I'm not able to create a precise testcase: While trying to find the significant difference between the kernels with properly working constants and the ones which fail, I transformed the failing kernel completely into one that works: Exactly the same CL-Code and the same order of building the program, creating the kernel and setting the arguments.

                        However, nothing changed: One of the two exactly equal kernels runs properly, the other does not. The only difference is that one kernel is build and run before the other - the working kernel is the second one.

                         

                        There are some more things I'd like to mention:

                        • Evaluating to the wrong value also happens for "written" constants, i.e. if I have a "__global float* dst" and I do "*dst = 0.5f;", the target of dst evaluates to 0.0f as well.
                        • It seems like not all floats evaluate to 0.0f: They are always rounded down to the next lower integer: 1.5f evaluates to 1 for example.
                        • The problem appears equally on GPU and CPU.

                         

                        That's my system config:

                        SDK 2.3
                        Driver 10.11
                        CPU Intel Core Duo P8700 (2x2.53 GHz)
                        GPU Mobility HD 4670 (RV730)
                        OS Gentoo Linux

                         

                        Maybe one of you can reproduce this problem now or make more qualified guesses about how to solve it than I can...

                          • const floats in kernel not initialized
                            himanshu.gautam

                            Please post the kernel you mentioned above.(which does not work the first time but works the second time.)

                            Can you try it on some newer GPU boards? I  will try to reproduce on the same family GPU you are using currently. 

                              • const floats in kernel not initialized
                                XmainframeX

                                @himanshu.gautam: Below is the kernel you requested together with the code I used for building and running it. It's (as its name says) for upscaling float greyscale pictures, so I guess it is clear what the buffers should contain and how big they should be.

                                Due to the fact that the three const float factors evaluate to 0.0f, the dst image is black after running the kernel - no matter what the src image contained.

                                 

                                Unfortunately I don't have other AMD/ATI GPU boards available. I guess it is easier to narrow down the problem on CPU anyway, isn't it?

                                 

                                /// Kernel code: /////////////////////// #ifdef __DEBUG #pragma OPENCL EXTENSION cl_amd_printf : enable #endif const float centerFactor = 0.75f * 0.75f; // Debugger shows that this is 0.0f on CPU const float manhattenFactor = 0.75f * 0.25f; // Debugger shows that this is 0.0f on CPU const float diagonalFactor = 0.25f * 0.25f; // Debugger shows that this is 0.0f on CPU __kernel void ScaleUp2(__global float* dst, __global float* src, int srcWidth, int srcHeight, int dstWidth) { int srcPosX = get_global_id(0); int srcPosY = get_global_id(1); if(srcPosX >= srcWidth) return; if(srcPosY >= srcHeight) return; int srcOffsXMin = max(0, srcPosX - 1); int srcOffsXMax = min(srcWidth - 1, srcPosX + 1); int srcOffsYMin = max(0, (srcPosY-1) * srcWidth); int srcOffsYMed = srcPosY * srcWidth; int srcOffsYMax = min((srcHeight-1) * srcWidth, (srcPosY+1) * srcWidth); float v11 = src[srcOffsYMin + srcOffsXMin]; float v12 = src[srcOffsYMin + srcPosX]; float v13 = src[srcOffsYMin + srcOffsXMax]; float v21 = src[srcOffsYMed + srcOffsXMin]; float v22 = src[srcOffsYMed + srcPosX]; float v23 = src[srcOffsYMed + srcOffsXMax]; float v31 = src[srcOffsYMax + srcOffsXMin]; float v32 = src[srcOffsYMax + srcPosX]; float v33 = src[srcOffsYMax + srcOffsXMax]; dst += 2 * srcPosY * dstWidth + 2 * srcPosX; *dst = diagonalFactor * v11 + manhattenFactor * v12 + manhattenFactor * v21 + centerFactor * v22; dst++; *dst = manhattenFactor * v12 + diagonalFactor * v13 + centerFactor * v22 + manhattenFactor * v23; dst += dstWidth - 1; *dst = manhattenFactor * v21 + centerFactor * v22 + diagonalFactor * v31 + manhattenFactor * v32; dst++; *dst = centerFactor * v22 + manhattenFactor * v23 + manhattenFactor * v32 + diagonalFactor * v33; } /// Build/run host code: /////////////////////// // Create program and kernel std::ifstream file((KernelPath + "ImageOps/ScaleUp2.cl").c_str()); fail(file.is_open(), "OpenClSift: ImageOps/ScaleUp2.cl not found"); std::string prog((std::istreambuf_iterator<char>(file)), std::istreambuf_iterator<char>()); cl::Program::Sources source(1, std::make_pair(prog.c_str(), prog.length() + 1)); pScaleUp = new cl::Program(env->Context, source); buildProgram(pScaleUp, env->Devices); kScaleUp = new cl::Kernel(*pScaleUp, "ScaleUp2"); kScaleUp->setArg(0, *dst); kScaleUp->setArg(1, *src); kScaleUp->setArg(2, srcWidth); kScaleUp->setArg(3, srcHeight); kScaleUp->setArg(4, 2 * srcWidth); cl::Event done; env->Queue.enqueueNDRangeKernel(*kScaleUp, cl::NullRange, CL_SUFFICE2D(cl::NDRange(srcWidth, srcHeight)), WorkgroupSize, NULL, &done); done.wait(); // dst buffer is completely 0.0f if I download it here...

                          • const floats in kernel not initialized
                            rick.weber

                            What is this "debugger" you mentioned? I thought only Nvidia had a GPU debugger and then only for CUDA applications...

                              • const floats in kernel not initialized
                                XmainframeX

                                @rick.weber: As I mentioned, the problem appears both on CPU and GPU. I could only debug on CPU and used gdb for that - created breakpoints at __OpenCL_KernelName_kernel functions after the kernel was created.

                                Due to the fact that the results of the kernel are equal (and wrong) on CPU and GPU, I expect the GPU kernel to suffer form the same problem.

                          • const floats in kernel not initialized
                            MicahVillmow
                            const float centerFactor = 0.75f * 0.75f; // Debugger shows that this is 0.0f on CPU <-- this is illegal OpenCL code. You cannot have globally scoped variables in the private address space.
                              • const floats in kernel not initialized
                                XmainframeX

                                @MicahVillmow: It all seems to become more and more inconsistent to me. What would be the expected behaviour when I try to run this code? According to some of the replies I'd guess that there should be a compiler error or so, but I observed silent errors or everything running fine so far (post from 01/14/2011 07:26 AM).

                                Btw., as I mentioned above declaring the floats inside of ScaleUp2 doesn't change anything - so how do all those observations fit together...!?

                              • const floats in kernel not initialized
                                MicahVillmow
                                I would expect it to error in the compiler. Since it is illegal code and the compiler doesn't catch it, the results are undefined and cannot be relied upon.
                                • const floats in kernel not initialized
                                  MicahVillmow
                                  I talked with our frontend engineer and the compiler will change the address space to constant. const itself is not a synonym for constant, but this was a common enough mistake that instead of emitting a compiler error, we emit a compiler warning and then make the change.