
OCL compile error

Question asked by Bdot on Mar 16, 2013
Latest reply on Apr 8, 2013 by Bdot

Hi, I'm running Win7/64 with an HD5770 and Catalyst 13.1. When my program compiles its OpenCL kernels for the GPU, I get the output below (the same program runs fine on the CPU):

 

Select device - OpenCL Platform 1/1: Advanced Micro Devices, Inc., Version: OpenCL 1.2 AMD-APP (1084.4)

Get device info - Device 1/1: Juniper (Advanced Micro Devices, Inc.),

device version: OpenCL 1.2 AMD-APP (1084.4), driver version: 1084.4 (VM)

Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing

Global memory:1073741824, Global memory cache: 0, local memory: 32768, workgroup size: 256, Work dimensions: 3[256, 256, 256, 0, 0] , Max clock speed:960, compute units:10

Compiling kernels (build options: "-I. -DVECTOR_SIZE=2 -g -DMORE_CLASSES -DCL_GPU_SIEVE").LLVM ERROR: Cannot select: 0x8660700: i8 = setcc 0x8655250, 0x77c6140, 0x8659990 [ID=58] dbg:barrett.cl:5169:39

  0x8655250: i32 = AMDILISD::ADD 0x77c6140, 0x77b5530 [ID=55] dbg:barrett.cl:5169:39

    0x77c6140: i32,ch = llvm.AMDIL.mulhi.u32 0x5e14070, 0x864b1b0, 0x8660500, 0x77c4220 [ORD=221913] [ID=48]

      0x864b1b0: i32 = TargetConstant<2674> [ORD=221906] [ID=17]

      0x8660500: i32,ch = llvm.AMDIL.mad24.u32 0x5e14070, 0x865e6e0, 0x865e1e0, 0x8656460, 0x8664440 [ORD=221904] [ID=44]

        0x865e6e0: i32 = TargetConstant<2623> [ORD=221904] [ID=13]

        0x865e1e0: i32 = Constant<4620> [ORD=221904] [ID=14]

        0x8656460: i32,ch = llvm.AMDIL.mul24.u32 0x5e14070, 0x77be2c0, 0x77b4c20, 0x8665850 [ORD=221900] [ID=39]

          0x77be2c0: i32 = TargetConstant<2666> [ORD=221900] [ID=9]

          0x77b4c20: i32,ch = CopyFromReg 0x5e14070, 0x77c2d00 [ORD=221900] [ID=32]

            0x77c2d00: i32 = Register %vreg1776 [ORD=221900] [ID=10]

          0x8665850: i32 = AMDILISD::VEXTRACT 0x77c07e0, 0x865d4d0 [ORD=221899] [ID=36]

            0x77c07e0: v4i32,ch = llvm.AMDIL.get.group.id 0x5e14070, 0x865d9d0 [ORD=221898] [ID=31]

              0x865d9d0: i32 = TargetConstant<2564> [ORD=221898] [ID=8]

            0x865d4d0: i32 = TargetConstant<1> [ORD=221903] [ID=26]

        0x8664440: i32 = AMDILISD::VEXTRACT 0x8660e00, 0x865d4d0 [ORD=221903] [ID=41]

          0x8660e00: v2i32,ch = load 0x5e14070, 0x864ece0, 0x8661110<LD8[%arrayidx_v4397]> [ORD=221902] [ID=37]

            0x864ece0: i32,ch = CopyFromReg 0x5e14070, 0x865f1f0 [ORD=221901] [ID=33]

              0x865f1f0: i32 = Register %vreg1774 [ORD=221901] [ID=11]

            0x8661110: i32 = undef [ORD=221902] [ID=12]

          0x865d4d0: i32 = TargetConstant<1> [ORD=221903] [ID=26]

      0x77c4220: i32,ch = CopyFromReg 0x5e14070, 0x8664240 [ORD=221896] [ID=30] dbg:barrett.cl:5156:51

        0x8664240: i32 = Register %vreg1773 [ORD=221896] [ID=6]

    0x77b5530: i32 = and 0x8658a80, 0x77c4220 [ORD=221916] [ID=53] dbg:barrett.cl:5169:39

      0x8658a80: i32 = setcc 0x77c5a30, 0x77c5430, 0x62debd0 [ID=51] dbg:barrett.cl:5162:128

        0x77c5a30: i32 = AMDILISD::ADD 0x77bfad0, 0x8661e10 [ORD=221907] [ID=46] dbg:barrett.cl:5162:128

          0x77bfad0: i32,ch = llvm.AMDIL.mulhi.u32 0x5e14070, 0x864b1b0, 0x865e1e0, 0x8656460 [ORD=221906] [ID=43]

            0x864b1b0: i32 = TargetConstant<2674> [ORD=221906] [ID=17]

            0x865e1e0: i32 = Constant<4620> [ORD=221904] [ID=14]

            0x8656460: i32,ch = llvm.AMDIL.mul24.u32 0x5e14070, 0x77be2c0, 0x77b4c20, 0x8665850 [ORD=221900] [ID=39]

              0x77be2c0: i32 = TargetConstant<2666> [ORD=221900] [ID=9]

              0x77b4c20: i32,ch = CopyFromReg 0x5e14070, 0x77c2d00 [ORD=221900] [ID=32]

                0x77c2d00: i32 = Register %vreg1776 [ORD=221900] [ID=10]

              0x8665850: i32 = AMDILISD::VEXTRACT 0x77c07e0, 0x865d4d0 [ORD=221899] [ID=36]

                0x77c07e0: v4i32,ch = llvm.AMDIL.get.group.id 0x5e14070, 0x865d9d0 [ORD=221898] [ID=31]

                  0x865d9d0: i32 = TargetConstant<2564> [ORD=221898] [ID=8]

                0x865d4d0: i32 = TargetConstant<1> [ORD=221903] [ID=26]

          0x8661e10: i32 = AMDILISD::VEXTRACT 0x8660e00, 0x77c4a20 [ORD=221905] [ID=40]

            0x8660e00: v2i32,ch = load 0x5e14070, 0x864ece0, 0x8661110<LD8[%arrayidx_v4397]> [ORD=221902] [ID=37]

              0x864ece0: i32,ch = CopyFromReg 0x5e14070, 0x865f1f0 [ORD=221901] [ID=33]

                0x865f1f0: i32 = Register %vreg1774 [ORD=221901] [ID=11]

              0x8661110: i32 = undef [ORD=221902] [ID=12]

            0x77c4a20: i32 = TargetConstant<2> [ORD=221905] [ID=27]

        0x77c5430: i32 = setcc 0x8660500, 0x8664440, 0x8659990 [ID=47] dbg:barrett.cl:5162:128

          0x8660500: i32,ch = llvm.AMDIL.mad24.u32 0x5e14070, 0x865e6e0, 0x865e1e0, 0x8656460, 0x8664440 [ORD=221904] [ID=44]

            0x865e6e0: i32 = TargetConstant<2623> [ORD=221904] [ID=13]

            0x865e1e0: i32 = Constant<4620> [ORD=221904] [ID=14]

            0x8656460: i32,ch = llvm.AMDIL.mul24.u32 0x5e14070, 0x77be2c0, 0x77b4c20, 0x8665850 [ORD=221900] [ID=39]

              0x77be2c0: i32 = TargetConstant<2666> [ORD=221900] [ID=9]

              0x77b4c20: i32,ch = CopyFromReg 0x5e14070, 0x77c2d00 [ORD=221900] [ID=32]

                0x77c2d00: i32 = Register %vreg1776 [ORD=221900] [ID=10]

              0x8665850: i32 = AMDILISD::VEXTRACT 0x77c07e0, 0x865d4d0 [ORD=221899] [ID=36]

                0x77c07e0: v4i32,ch = llvm.AMDIL.get.group.id 0x5e14070, 0x865d9d0 [ORD=221898] [ID=31]

                  0x865d9d0: i32 = TargetConstant<2564> [ORD=221898] [ID=8]

                0x865d4d0: i32 = TargetConstant<1> [ORD=221903] [ID=26]

            0x8664440: i32 = AMDILISD::VEXTRACT 0x8660e00, 0x865d4d0 [ORD=221903] [ID=41]

              0x8660e00: v2i32,ch = load 0x5e14070, 0x864ece0, 0x8661110<LD8[%arrayidx_v4397]> [ORD=221902] [ID=37]

                0x864ece0: i32,ch = CopyFromReg 0x5e14070, 0x865f1f0 [ORD=221901] [ID=33]

                  0x865f1f0: i32 = Register %vreg1774 [ORD=221901] [ID=11]

                0x8661110: i32 = undef [ORD=221902] [ID=12]

              0x865d4d0: i32 = TargetConstant<1> [ORD=221903] [ID=26]

          0x8664440: i32 = AMDILISD::VEXTRACT 0x8660e00, 0x865d4d0 [ORD=221903] [ID=41]

            0x8660e00: v2i32,ch = load 0x5e14070, 0x864ece0, 0x8661110<LD8[%arrayidx_v4397]> [ORD=221902] [ID=37]

              0x864ece0: i32,ch = CopyFromReg 0x5e14070, 0x865f1f0 [ORD=221901] [ID=33]

                0x865f1f0: i32 = Register %vreg1774 [ORD=221901] [ID=11]

              0x8661110: i32 = undef [ORD=221902] [ID=12]

            0x865d4d0: i32 = TargetConstant<1> [ORD=221903] [ID=26]

      0x77c4220: i32,ch = CopyFromReg 0x5e14070, 0x8664240 [ORD=221896] [ID=30] dbg:barrett.cl:5156:51

        0x8664240: i32 = Register %vreg1773 [ORD=221896] [ID=6]

  0x77c6140: i32,ch = llvm.AMDIL.mulhi.u32 0x5e14070, 0x864b1b0, 0x8660500, 0x77c4220 [ORD=221913] [ID=48]

    0x864b1b0: i32 = TargetConstant<2674> [ORD=221906] [ID=17]

    0x8660500: i32,ch = llvm.AMDIL.mad24.u32 0x5e14070, 0x865e6e0, 0x865e1e0, 0x8656460, 0x8664440 [ORD=221904] [ID=44]

      0x865e6e0: i32 = TargetConstant<2623> [ORD=221904] [ID=13]

      0x865e1e0: i32 = Constant<4620> [ORD=221904] [ID=14]

      0x8656460: i32,ch = llvm.AMDIL.mul24.u32 0x5e14070, 0x77be2c0, 0x77b4c20, 0x8665850 [ORD=221900] [ID=39]

        0x77be2c0: i32 = TargetConstant<2666> [ORD=221900] [ID=9]

        0x77b4c20: i32,ch = CopyFromReg 0x5e14070, 0x77c2d00 [ORD=221900] [ID=32]

          0x77c2d00: i32 = Register %vreg1776 [ORD=221900] [ID=10]

        0x8665850: i32 = AMDILISD::VEXTRACT 0x77c07e0, 0x865d4d0 [ORD=221899] [ID=36]

          0x77c07e0: v4i32,ch = llvm.AMDIL.get.group.id 0x5e14070, 0x865d9d0 [ORD=221898] [ID=31]

            0x865d9d0: i32 = TargetConstant<2564> [ORD=221898] [ID=8]

          0x865d4d0: i32 = TargetConstant<1> [ORD=221903] [ID=26]

      0x8664440: i32 = AMDILISD::VEXTRACT 0x8660e00, 0x865d4d0 [ORD=221903] [ID=41]

        0x8660e00: v2i32,ch = load 0x5e14070, 0x864ece0, 0x8661110<LD8[%arrayidx_v4397]> [ORD=221902] [ID=37]

          0x864ece0: i32,ch = CopyFromReg 0x5e14070, 0x865f1f0 [ORD=221901] [ID=33]

            0x865f1f0: i32 = Register %vreg1774 [ORD=221901] [ID=11]

          0x8661110: i32 = undef [ORD=221902] [ID=12]

        0x865d4d0: i32 = TargetConstant<1> [ORD=221903] [ID=26]

    0x77c4220: i32,ch = CopyFromReg 0x5e14070, 0x8664240 [ORD=221896] [ID=30] dbg:barrett.cl:5156:51

      0x8664240: i32 = Register %vreg1773 [ORD=221896] [ID=6]

What does that mean, and how do I avoid it?

 

Hmm, when I just tried it again with the packed-up zip, it failed on the CPU as well (run 'mfakto -d c' to make it choose the CPU), but with a different error:

Select device - (CPU) - OpenCL Platform 1/1: Advanced Micro Devices, Inc., Version: OpenCL 1.2 AMD-APP (1084.4)

Get device info - Device 1/1: AMD Phenom(tm) II X4 955 Processor (AuthenticAMD),

device version: OpenCL 1.2 AMD-APP (1084.4), driver version: 1084.4 (sse2)

Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_device_fission cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing

Global memory:4293038080, Global memory cache: 65536, local memory: 32768, workgroup size: 1024, Work dimensions: 3[1024, 1024, 1024, 0, 0] , Max clock speed:3208, compute units:4

Compiling kernels (build options: "-I. -DVECTOR_SIZE=2 -g -DMORE_CLASSES -DCL_GPU_SIEVE").

        BUILD OUTPUT

".\barrett.cl", line 665: warning: statement is unreachable

    nn.d0  = n.d0 * qi;

    ^

 

".\barrett.cl", line 987: warning: statement is unreachable

    nn.d0  = n.d0 * qi;

    ^

 

".\barrett.cl", line 4975: warning: variable "exp96" was declared but never

          referenced

    __private int96_t  exp96, my_k_base, f_base;

                       ^

 

".\barrett.cl", line 4976: warning: variable "f" was declared but never

          referenced

    __private int96_v  a, u, f;

                             ^

 

"C:\Users\Bertram\AppData\Local\Temp\OCL6DE4.tmp.cl", line 536: warning:

          variable "as" was declared but never referenced

    int90_v a, as, b, r, m;

               ^

 

"C:\Users\Bertram\AppData\Local\Temp\OCL6DE4.tmp.cl", line 536: warning:

          variable "m" was declared but never referenced

    int90_v a, as, b, r, m;

                         ^

 

Internal Error:  ld failed

 

        END OF BUILD OUTPUT

Error -11: clBuildProgram

init_CL(3, -1) failed


I'm sure it worked on the CPU before, but I may have changed one of the kernels slightly since then. "Internal Error: ld failed" on its own is not a lot to work with.
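For reference, the -11 reported for clBuildProgram above is CL_BUILD_PROGRAM_FAILURE in CL/cl.h, so the return code only says that the build failed, not why. Below is a minimal sketch of turning a few such codes into readable names. The helper name cl_error_name is hypothetical (it is not part of OpenCL or of mfakto), and only a handful of codes are listed:

```c
/* Hypothetical helper: map a few OpenCL status codes to their names.
 * The numeric values are the ones defined in the Khronos CL/cl.h header;
 * the list is deliberately incomplete. */
static const char *cl_error_name(int err)
{
    switch (err) {
    case 0:   return "CL_SUCCESS";
    case -1:  return "CL_DEVICE_NOT_FOUND";
    case -2:  return "CL_DEVICE_NOT_AVAILABLE";
    case -3:  return "CL_COMPILER_NOT_AVAILABLE";
    case -5:  return "CL_OUT_OF_RESOURCES";
    case -6:  return "CL_OUT_OF_HOST_MEMORY";
    case -11: return "CL_BUILD_PROGRAM_FAILURE";
    case -43: return "CL_INVALID_BUILD_OPTIONS";
    case -44: return "CL_INVALID_PROGRAM";
    case -45: return "CL_INVALID_PROGRAM_EXECUTABLE";
    default:  return "unknown (see CL/cl.h)";
    }
}
```

With this, cl_error_name(-11) yields "CL_BUILD_PROGRAM_FAILURE", which matches the "Error -11: clBuildProgram" line in the log.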

Anyway, I'm attaching the zip.
