jaap_nhl_lwd
Journeyman III

OpenCL compiler crashes, example with 2 line kernel

I have problems compiling some of my kernels for AMD GPUs; there are no problems with NVIDIA OpenCL.

Platform used: OpenCL 1.2 AMD-APP (1084.4), Juniper HD5700, Win7-64.

Minimal kernel extracted from bigger one as example:

kernel void AMDBug (const uint xx, global uint *gg) {
   uint x = get_global_id(0) + xx;
   if (x < xx) *gg = 1;
}

The compiler crashes when compiling for the GPU; there is no problem when compiling for the CPU.

Last message before crash:

LLVM ERROR: Cannot select: 0xb1e670: i8 = setcc <lots of hex codes>

Please advise.

Best regards, Jaap.

0 Likes
20 Replies
himanshu_gautam
Grandmaster

Will try this next week.

Meanwhile - if it is not a hassle - can you post your little sample over here as a zip file or so.

0 Likes

See attached zip file for

- test_AMD.jls script with host code

- bugAMD.cl file with kernel

To reproduce the bug, you can download the demo of VisionLab (a development environment for image processing applications) from www.vdlmv.nl/download.

Select the demo version without camera. Please read the installation notes. It is a one-minute job.

Copy both zipped files to downloads.

Execute VisionLab.exe.

Open the file test_AMD.jls and click the Execute button.

0 Likes

Sure, Thanks for posting.

If I recall correctly, version 1084 belongs to an old driver. The latest driver version is 13.1.

If you are running an old version, I would suggest upgrading first.

0 Likes

I use Catalyst version 13.1.

0 Likes

Hi,

I don't see how that kernel could be responsible for a crash in the application. There could be many causes in the application itself.

Anyway, I tried to run your kernel using some simple host code, and I did not see any crash. The only unusual thing about the kernel is that many threads may write to the same location (depending on the value of xx). I would suggest that you try to reproduce your issue using a minimal test case if possible.

Appreciate your efforts here.

0 Likes

Thank you for your reaction.

Some remarks:

- The AMDBug kernel is just the smallest kernel that reproduces the crash of the OpenCL compiler. It is not intended to do anything useful.

- Many other kernels compile and run well on the Juniper HD5700 Win7-64 system.

- The AMDBug kernel and the full original kernel compile and run well on the CPU of the Juniper HD5700 Win7-64 system.

- Compiling and running the AMDBug kernel with exactly the same host code on an HD7730M Win8-64 laptop works without problems.

- I have reinstalled Catalyst version 13.1 on the Juniper HD5700 Win7-64 system. But the problems are persistent.

0 Likes

This is how I tried to run your kernel using simple host code. This code works as expected.

Maybe you can try to edit it, to create your testcase.

Himanshu, thank you for your example.

It ran successfully on the Juniper HD5700 Win7-64.

The only difference I can find is that I used an older version of OpenCL.lib.

Do you think this can cause the trouble?

A related question. I want to use my software on different OpenCL platforms like AMD, NVIDIA and Intel.

My host code can only link with one version of OpenCL.lib.

Do you expect problems linking with AMD's OpenCL.lib and executing NVIDIA or Intel OpenCL platform?

0 Likes

@Jaap....,

A related question. I want to use my software on different OpenCL platforms like AMD, NVIDIA and Intel.

My host code can only link with one version of OpenCL.lib.

Do you expect problems linking with AMD's OpenCL.lib and executing NVIDIA or Intel OpenCL platform?

This is where the ICD (Installable Client Driver) comes into play.

Just think about how "clGetPlatformIDs" would work if there were multiple platforms.

Although you link with AMD's OpenCL.lib, the ICD will make sure that it detects all the other installed platforms and will route your API calls correctly.

On Linux, I know for a fact that OpenCL drivers register themselves at /etc/OpenCL/vendors/

The ICD files present there are just plain text files that contain the name of the shared library that implements the platform's runtime. These files can be read at run time, and the platform DLLs can then be located, opened and queried.

On Windows, registry values at HKEY_LOCAL_MACHINE\SOFTWARE\Khronos\OpenCL\Vendors hold the DLL file names that implement the platforms.

Note that the OpenCL extension mechanism provides an API for obtaining function addresses from a platform (clGetExtensionFunctionAddress).

You can check OpenCL extensions PDF for more details.
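
As a rough illustration of the Linux lookup described above, the sketch below reads the library name out of a single vendor file. It is not the actual ICD loader: the function name read_icd_library_name is made up for this example, and a real loader would go on to open the named library and query it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of any OpenCL API: copies the first line of
 * an .icd file (expected to hold a shared-library name) into out.
 * Returns 0 on success, -1 if the file cannot be read or the line is empty. */
int read_icd_library_name(const char *icd_path, char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0);   /* caller must supply a buffer */

    FILE *f = fopen(icd_path, "r");
    if (f == NULL)
        return -1;

    char *line = fgets(out, (int)out_size, f);
    fclose(f);
    if (line == NULL)
        return -1;

    /* Cut the string at the first newline or carriage return, if any. */
    out[strcspn(out, "\r\n")] = '\0';
    return out[0] != '\0' ? 0 : -1;
}
```

Called on one of the files under /etc/OpenCL/vendors/, this would fill the buffer with whatever library name that file contains.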

Hope this helps,

0 Likes

@Himanshu, thanks for your answers and help.

0 Likes

@Himanshu, your example code uses the CPU instead of GPU. I have changed:

    context = clCreateContextFromType(cps,
//                                    CL_DEVICE_TYPE_CPU,
                                      CL_DEVICE_TYPE_GPU,
                                      NULL,
                                      NULL,
                                      &status);

With this change your example program crashes on my Juniper HD5700 Win7-64 system with the same error message as my program. On other systems with different GPUs I do not have problems.

My conclusion is that either I have a problem with the OpenCL installation on that system or there is a problem in the backend of the Juniper HD5700 compiler.

0 Likes

Your kernel can be completely optimized away by the frontend compiler, because it does nothing. As far as I remember, SC couldn't accept kernels without any output, so the backend compiler has to detect that case and generate a fixed fake write. Probably there is a case where the detection didn't work properly and crashed. A simple "else" statement should fix your problem:
if (x < xx) *gg = 1;
else *gg = 0;

Or just add some code that actually produces something.

0 Likes

Thank you for your suggestion.

The mini kernel with the bug was extracted from a larger kernel. The offending code was part of a loop in an else branch.

I extended the minimal kernel with your else branch, but the error persisted.

0 Likes

So if you link your app with the latest OpenCL.lib, then the crash is gone. Is that correct?

I doubt OpenCL.lib can fix a crash in the compiler.

0 Likes

Please see my last reaction in the Himanshu branch of this discussion.

0 Likes
nou
Exemplar

Copying and pasting this code into the AMD Kernel Analyzer causes this error:

LLVM ERROR: Cannot select: 0x7f0c601926c0: i8 = setcc 0x7f0c60248ae0, 0x7f0c601927c0, 0x7f0c601924c0 [ID=12]
  0x7f0c60248ae0: i32 = AMDILISD::ADD 0x7f0c601927c0, 0x7f0c602485e0 [ID=11]
    0x7f0c601927c0: i32 = AMDILISD::VEXTRACT 0x7f0c602489e0, 0x7f0c602486e0 [ORD=2] [ID=10]
      0x7f0c602489e0: v4i32,ch = llvm.AMDIL.get.global.id 0x7f0c60242160, 0x7f0c602488e0 [ORD=1] [ID=9]
        0x7f0c602488e0: i32 = TargetConstant<2561> [ORD=1] [ID=2]
      0x7f0c602486e0: i32 = TargetConstant<1> [ORD=2] [ID=7]
    0x7f0c602485e0: i32,ch = CopyFromReg 0x7f0c60242160, 0x7f0c602484e0 [ORD=3] [ID=8]
      0x7f0c602484e0: i32 = Register %vreg0 [ORD=3] [ID=1]
  0x7f0c601927c0: i32 = AMDILISD::VEXTRACT 0x7f0c602489e0, 0x7f0c602486e0 [ORD=2] [ID=10]
    0x7f0c602489e0: v4i32,ch = llvm.AMDIL.get.global.id 0x7f0c60242160, 0x7f0c602488e0 [ORD=1] [ID=9]
      0x7f0c602488e0: i32 = TargetConstant<2561> [ORD=1] [ID=2]
    0x7f0c602486e0: i32 = TargetConstant<1> [ORD=2] [ID=7]

0 Likes

The crash will be fixed in the new releases. When LLVM compiled the SADDO instruction, it didn't legalize setcc. AMDIL doesn't support the i8 type as a result (i8 = setcc), so LLVM has to convert it to the legal i32.
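
For context on why an overflow-style instruction can show up at all: on unsigned integers, computing x = id + xx and then testing x < xx is the customary way to detect that the addition wrapped around, which is presumably the pattern the compiler recognized in the kernel. The host-side C sketch below mirrors that arithmetic; the function names are made up for this illustration and it says nothing about how the AMD compiler is implemented.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the kernel's arithmetic on the host: returns 1 when id + xx wraps
 * around 32 bits, which is exactly when the kernel's (x < xx) test is true. */
int add_wraps(uint32_t id, uint32_t xx)
{
    uint32_t x = id + xx;   /* unsigned addition wraps modulo 2^32 */
    return x < xx;          /* true only if the sum wrapped */
}

/* Reference version that does the addition in 64 bits, where it cannot wrap. */
int add_wraps_reference(uint32_t id, uint32_t xx)
{
    return ((uint64_t)id + (uint64_t)xx) > UINT32_MAX;
}

/* Small self-check comparing both versions on a few edge cases. */
void check_add_wraps(void)
{
    assert(add_wraps(1u, 5u) == add_wraps_reference(1u, 5u));
    assert(add_wraps(1u, UINT32_MAX) == add_wraps_reference(1u, UINT32_MAX));
    assert(add_wraps(0u, UINT32_MAX) == add_wraps_reference(0u, UINT32_MAX));
}
```

In other words, the minimal kernel only writes to *gg for work-items whose global id plus xx overflows 32 bits.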

0 Likes
drallan
Challenger

I have problems compiling some of my kernels for AMD GPUs; there are no problems with NVIDIA OpenCL.

Used platform OpenCL 1.2 AMD-APP (1084.4) Juniper HD5700 Win7-64.

Minimal kernel extracted from bigger one as example:

kernel void AMDBug (const uint xx, global uint *gg) {
   uint x = get_global_id(0) + xx;
   if (x < xx) *gg = 1;
}

The compiler crashes when compiling for the GPU; there is no problem when compiling for the CPU.

Last message before crash:

LLVM ERROR: Cannot select: 0xb1e670: i8 = setcc <lots of hex codes>

This is very familiar. I reported almost the same problem here in November, starting around Catalyst 12.11.

http://devgurus.amd.com/message/1285016#1285016

In that case, it was a compiler error triggered by the expression c = hi + (x < c), which dumped the same kind of LLVM error message. The compiler erred on the conditional x < c, which is also in the current code. Most other conditionals, like x <= c, worked fine.

I would love to know what happens if you use  if (x <= xx) *gg = 1 instead.

Or try if(x <= xx-1) to be logically the same.
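
One caveat on the second rewrite, sketched in host-side C below: with unsigned arithmetic, x <= xx-1 matches x < xx for every xx except 0, where xx-1 wraps to the largest 32-bit value and the rewritten test is always true. The function names are invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* The original condition from the kernel. */
int cond_lt(uint32_t x, uint32_t xx)
{
    return x < xx;
}

/* The suggested rewrite; xx - 1 wraps to UINT32_MAX when xx is 0. */
int cond_le_minus_one(uint32_t x, uint32_t xx)
{
    return x <= xx - 1u;
}

/* Returns 1 if both conditions agree for every x in [0, limit), else 0. */
int conditions_agree(uint32_t xx, uint32_t limit)
{
    for (uint32_t x = 0; x < limit; ++x) {
        if (cond_lt(x, xx) != cond_le_minus_one(x, xx))
            return 0;
    }
    return 1;
}
```

A check such as assert(conditions_agree(5, 100)) passes, while conditions_agree(0, 100) returns 0, so the rewrite is only a safe workaround if xx is known to be non-zero.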

0 Likes

@drallan:

No crash:    if (x <= xx) *gg = 1;

No crash:    if (x < (xx-1)) *gg = 1;

Crash:   if (x < xx) *gg = 1;

0 Likes

Will look into this... Please wait. Thanks!

--EDIT--

Oh, I see german has already posted on this. So, it will be fixed in a future release.

0 Likes