jpsollie
Adept II

clBuildProgram causes BRIG validation error

Hi Everyone,

So, I'll first post my system:

hardware:

2x Opteron 6276, 128 GB RAM, combined with an R9 Nano

software:

-linux 4.10.17 x64

-LLVM 4.0.1 & 5.0.0 (git)

-amd 17.30 opencl framework.

problem:

(I narrowed the config down so that only the amdgpu-pro ICD file is loaded.)

When I build the program for the CPU device, it works fine.

When I build it for the GPU, it crashes with the following error:

---------------------------------------------------------------------

Error in hsa_operand section, at offset 121368:

Address is outside of memory allocated for variable

LLVM ERROR:

Brig container validation has failed in BRIGAsmPrinter.cpp

------------------------------------------------------------------------

what I tried:

- Mesa OpenCL (compiles, but does not show a correct result)

- pocl & llvm 5.0.0 (works perfectly)

- amdgpu-pro CPU driver (2348.3) (works perfectly)

-amdgpu-pro GPU driver (2442.7) same error as 2348, but does not show a CPU ...

*edit:

also tried oclGrind and CodeXL, no problem there

I suspected the error was somewhere in LLVM, but pointing PATH and LD_LIBRARY_PATH at LLVM 4 made no difference.

Where does this error come from, and how do I fix it?

thanks

dipak
Big Boss

Hi,

Please provide the repro code for our investigation. Also, please share the clinfo output and OS information.

I assume this is the driver with which you observed the error: AMDGPU-PRO Driver for Linux Release Notes

Regards,

0 Likes

no problem, here you go

what do you want to know about my OS?

I know Gentoo Linux is not supported, and neither is kernel 4.10.17. I am not expecting you to hand me a solution, just maybe ... maybe ... you guys know more about BRIG/HSAIL compilation than I do.

0 Likes

Hi Dipak,

I got the code to compile (though I do not know why it works), but I saw the following while debugging at runtime:

atom_inc(system) does not atomically increase the value of local uint system[0], whereas atom_xchg(system, system[0] + 1) does.  Do I need to open a new thread for this?

*edit:

I also saw this behaviour on clover running with LLVM 5.0

pocl 0.14 (which I use on the opteron CPUs) shows no difference, it runs on LLVM 4.0.1

Does this look like an LLVM error, or is it related to the compiler?

*edit2:

this piece of code:

            if(!output[14]) output[14] = system[0] + 1;
            atom_inc(system);
            if(!output[15]) output[15] = system[0];

outputs in gdb:

Breakpoint 1, worker (device_obj=0x609490) at ./engine.c:397

397                 if(answer[3] == 255) {

(gdb) print answer

$1 = {0, 0, 0, 0, 255, 276, 340, 804850955, 40962, 0, 0, 0, 0, 0, 1, 64}

(gdb) print answer[14]

$2 = 1

(gdb) print answer[15]

$3 = 64

The fact that clover also has this issue suggests an LLVM error, no? Or am I mistaken?
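
One possible reading of the gdb output above, which the thread itself does not confirm: atom_inc returns the value the counter held before the increment, and a later plain read of system[0] also sees the increments made by every other work-item. The C sketch below simulates that, assuming 64 work-items that run the three statements in lockstep (the work-group size is an assumption, the thread does not state it). The names simulate_lockstep and struct probe are hypothetical, and the shared output array is simplified to two fields.

```c
#include <stdint.h>

#define WORK_ITEMS 64 /* assumed work-group size; not stated in the thread */

/* Mirrors the two probed slots plus the value one atom_inc call returned. */
struct probe {
    uint32_t out14;        /* output[14]: system[0] + 1, read before the increment */
    uint32_t out15;        /* output[15]: system[0], re-read after the increment   */
    uint32_t old_of_item0; /* value returned by work-item 0's own increment        */
};

/* Simulate WORK_ITEMS work-items in lockstep: every work-item finishes
   statement N before any work-item starts statement N+1. */
static struct probe simulate_lockstep(void)
{
    uint32_t counter = 0; /* stands in for local uint system[0] */
    struct probe p = {0, 0, 0};
    int wi;

    /* Statement 1: if(!output[14]) output[14] = system[0] + 1; */
    for (wi = 0; wi < WORK_ITEMS; ++wi)
        if (!p.out14)
            p.out14 = counter + 1;

    /* Statement 2: atom_inc(system); each call returns the old value. */
    for (wi = 0; wi < WORK_ITEMS; ++wi) {
        uint32_t old = counter++;
        if (wi == 0)
            p.old_of_item0 = old;
    }

    /* Statement 3: if(!output[15]) output[15] = system[0]; */
    for (wi = 0; wi < WORK_ITEMS; ++wi)
        if (!p.out15)
            p.out15 = counter;

    return p;
}
```

Under these assumptions the simulation yields 1 and 64, the same pair gdb shows in answer[14] and answer[15]. If each work-item needs its own unique index, the usual approach is to use the return value of atom_inc rather than re-reading system[0].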

0 Likes

First of all, thanks for sharing the repro code. After a quick test, it looks like a compiler optimization issue. The kernel seems to build fine if optimization is disabled, i.e. set to "-O0". I'll check further on another setup and report to the compiler team if required. Meanwhile, could you please try the same and share your observations?
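
A minimal C sketch of how the "-O0" workaround might be handed to clBuildProgram. The helper make_build_options is hypothetical and not from the thread. "-O0" is the flag named above; "-cl-opt-disable" is the option the OpenCL specification defines for turning optimizations off, offered here as an alternative to try, not as something tested in this thread. The clBuildProgram call itself appears only in a comment because it needs an OpenCL SDK and a device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes a build-options string into buf.
   portable == 0 selects "-O0" (the flag named in the thread),
   portable != 0 selects "-cl-opt-disable" (OpenCL specification).
   Returns the string length, or -1 if buf is too small. */
static int make_build_options(char *buf, size_t size, int portable)
{
    assert(buf != NULL && size > 0);
    int n = snprintf(buf, size, "%s", portable ? "-cl-opt-disable" : "-O0");
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Intended use (program and device are assumed to exist already):
 *
 *   char options[32];
 *   make_build_options(options, sizeof options, 0);
 *   cl_int err = clBuildProgram(program, 1, &device, options, NULL, NULL);
 */
```

If the build succeeds with optimizations off and fails with them on, that narrows the problem to code the optimizer transforms, which matches what was later found in this thread.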

Regarding the atomic query, I would suggest opening a new thread, as it seems to be unrelated. It would also help us track these two issues separately. Please share the repro code and other setup details in that thread.

Regards,

0 Likes

Update:

A ticket has been opened for this issue. Once I have an update, I'll share it with you.

Regards,

0 Likes

Hi,

Please find below the comments from the compiler team, which indicate that the error is in the source file itself. Changing the size of the "finalcount" array from 2 to 4 seems to let the kernel build successfully.

---------------------------------------------------------------------

The error is in the program source. It defines the finalcount array with a size of 2 bytes and then reads 4 bytes from it:

void SHA1Final(private unsigned char digest[20], ctxarray* ctx, private uchar* ctxbuffer)
{
unsigned char finalcount[2];
...
SHA1Update(finalcount, 2, ctx, ctxbuffer);  /* Should cause a SHA1Transform() */

...

void SHA1Update(private const unsigned char* data, private const uchar len, ctxarray* ctx, private uchar* ctxbuffer)
{
uchar i, j;
j = (ctx->l1 >> 3) & 63;
atom_add(&(ctx->l1), len << 3);
if (((j + len) & 64) != 0) {
os_memcpy(&ctxbuffer, data, (i = 64-j));

...

void os_memcpy(private uchar* dest, private const uchar* src, const uchar amount) {
uchar j = 0, intamount = amount >> 2;
int* destination = (int*) dest;
const int* source = (const int*) src;
while(j < intamount) {
destination = source;

The HSAIL specification explicitly contains a range check and we cannot omit it.

It also violates the C standard (ISO/IEC 9899) in two ways:

  1. J.2 Undefined behavior

  Conversion between two pointer types produces a result that is incorrectly aligned (6.3.2.3).

  2. J.2 Undefined behavior

  Addition or subtraction of a pointer into, or just beyond, an array object and an integer type produces a result that does not point into, or just beyond, the same array object (6.5.6).

---------------------------------------------------------------------
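
The failure mode described by the compiler team can be avoided by a copy routine that never reads past the length the caller passes in. The C sketch below is an assumption about what such a routine could look like, not the fix used in the thread (which was enlarging finalcount to 4 bytes). The name copy_words_then_tail is hypothetical. It moves whole 32-bit words through memcpy, which avoids dereferencing a misaligned int pointer, and copies the 0 to 3 leftover bytes one at a time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical replacement for an int-pointer copy loop.
   Reads exactly `amount` bytes from src and writes exactly
   `amount` bytes to dest, regardless of alignment. */
static void copy_words_then_tail(unsigned char *dest,
                                 const unsigned char *src,
                                 size_t amount)
{
    size_t words = amount / 4; /* number of complete 32-bit words */
    size_t w, i;

    assert(dest != NULL && src != NULL);

    /* Whole words: memcpy through a temporary instead of casting
       the byte pointers to int pointers. */
    for (w = 0; w < words; ++w) {
        uint32_t tmp;
        memcpy(&tmp, src + 4 * w, sizeof tmp);
        memcpy(dest + 4 * w, &tmp, sizeof tmp);
    }

    /* Tail: the remaining 0..3 bytes, copied one by one. */
    for (i = 4 * words; i < amount; ++i)
        dest[i] = src[i];
}
```

With a 2-byte source such as finalcount and a length of 2, the word loop runs zero times and only the two tail bytes are read. The caller must still pass a length that matches the source, so a call site that requests more bytes than the array holds remains a bug.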

Regards,

Thanks Dipak!

0 Likes

I can see the declaration as below:

local uint system[0]

atom_add uses a 64-bit value and requires the cl_khr_int64_base_atomics extension to be enabled. Please try atomic_add instead for unsigned int.

0 Likes

tried, failed.

This is confusing: I tried to write everything against OpenCL 1.0, and this extension:

cl_khr_local_int32_base_atomics

says to use atom_add :s

0 Likes

Okay. I thought it was for OpenCL 1.2.

I hope you are setting the "-cl-std=" flag that corresponds to the targeted OpenCL version. If no flag is set, OpenCL 1.2 is assumed by default when building the kernel.

Please share a test case so I can check it on my end.

Regards,

0 Likes

I made a new topic for it: OpenCL atomic_add and atomic_inc not working correctly. I suggest we continue in that one.

To answer your question: no, I didn't. I added -cl-std=CL1.0, but the error is still there.

0 Likes