mulisak
Journeyman III

Please help with clBuildProgram crash / segmentation fault

Hi, this probably belongs more in the 'driver support' forum, but I would like to post it here as well.

I have a problem with an OpenCL program: clBuildProgram segfaults / crashes with newer drivers.

Please help me improve the long-standing bad driver reputation AMD has always had. The program worked fine with older drivers and older hardware (5850).

I believe all 7xxx cards are affected by all 13.xx drivers.

Graphics Card

MSI R7850-1GD5/OC

and

GV-R795WF3-3GD

With or without overclocking, the result is the same.

AMD Catalyst Driver Version, and Driver History

Tested 12.2 to 12.10: works with tweaking.

13.4 and 13.6beta: crash.

Operating System

Linux Fedora Core 16 x64, 18 x64, 19 x64

Ubuntu 10.04 LTS and 12.04 LTS

Issue Details

An OpenCL program using loop unrolling, sha256, ripemd160, and elliptic curves crashes / segfaults when calling clBuildProgram.

It can be tweaked so that it compiles with the 13.xx ATI drivers, but then it produces incorrect results.

Motherboard or System Make & Model

ASUSTeK Computer INC. P8Z68-V LX

and

Gigabyte Technology Co., Ltd. Z77-D3H

Power Supply

ST-P0720PBA Seventeam Cilense 720w

and

Seasonic SS-500ET Active PFC T3

Display Device(s) and Connection(s) Used

HP ZR2740w

Applications and Games

opencl program - sha256, ripemd160, elliptic curves

CPU Details

Intel i5

and Intel i3

no overclocking

Motherboard BIOS Version

stock versions

System Memory Type & Amount

Kingston HyperX 4GB DDR3, 2x and 1x

Additional Hardware

nothing special

Additional Details

No viruses on Linux; the hardware works well and passes memtest and Prime95.

---------------

Please help. There is another issue with the new drivers: they even produce incorrect results when the offending (crashing) part is removed from the code.

I can assist with narrowing down the issue.

Thanks a lot.

best regards

.m.

0 Likes
13 Replies
himanshu_gautam
Grandmaster

Re: Please help with clBuildProgram crash / segmentation fault

You are posting on the right forum. Please provide us your kernel (attach it here in this thread). We will check it out.

If required, we will raise a bug report internally.

0 Likes
mulisak
Journeyman III

Re: Please help with clBuildProgram crash / segmentation fault

Hi, please find enclosed the description and the crashing kernel.

Please do not hesitate to ask for clarification / amendments.

m.

zipped here :

http://www.filedropper.com/oclvanitygenproblem

0 Likes
himanshu_gautam
Grandmaster

Re: Please help with clBuildProgram crash / segmentation fault

Sorry for checking this late, but the link provided is not working for us. Please use the forum itself to attach the kernel instead of third-party file hosting services.

0 Likes
mulisak
Journeyman III

Re: Please help with clBuildProgram crash / segmentation fault

Sorry, I do not know how to insert an attachment. Please find the offending file here:

https://github.com/samr7/vanitygen/blob/master/calc_addrs.cl

With the latest drivers the problem is now only misbehaviour (not a crash): it calculates an incorrect GPU sum compared to the CPU:

https://github.com/samr7/vanitygen/issues/19#issuecomment-22161310

It may be solved by downgrading the AMD APP SDK to 2.7, as somebody mentioned. I will try ...

0 Likes
mulisak
Journeyman III

Re: Please help with clBuildProgram crash / segmentation fault

It seems to be working well now with the 13.8beta drivers and kernel 3.10.5-201.fc19.x86_64, but I had to:

* install AMD APP SDK 2.7

* manually overwrite all libs (x86, x64) that have *ocl* in their name with those from the APP SDK

* run ldconfig

* reboot

* run the app as root

* disable 'quirks' (with '-S'; probably just AMD_BFI_INT does not work with 7xxx cards)

Tried with:

./oclvanitygen -d0 -v -v -S -i 1Lostxx ### wait about a minute
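
For anyone checking the quirk on the CPU side, here is a minimal C sketch of what a bit-field-insert substitution is generally understood to compute. It assumes that AMD_BFI_INT replaces the SHA-256 "choose" and "majority" functions with a single bit-select operation; that is an assumption about the kernel, not something confirmed in this thread. The names bfi_ref, sha2_ch_plain, sha2_maj_plain, sha2_ch_bfi, sha2_maj_bfi and check_bfi_identities are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bit-field insert written as plain C: for every bit position, take the
   bit from b where mask is 1 and from c where mask is 0.
   ASSUMPTION: this is the operation the AMD_BFI_INT quirk relies on. */
uint32_t bfi_ref(uint32_t mask, uint32_t b, uint32_t c)
{
    return (mask & b) | (~mask & c);
}

/* SHA-256 "choose" in its textbook form. */
uint32_t sha2_ch_plain(uint32_t x, uint32_t y, uint32_t z)
{
    return (x & y) ^ (~x & z);
}

/* SHA-256 "majority" in its textbook form. */
uint32_t sha2_maj_plain(uint32_t x, uint32_t y, uint32_t z)
{
    return (x & y) ^ (x & z) ^ (y & z);
}

/* "choose" is a bit-select with x as the mask. */
uint32_t sha2_ch_bfi(uint32_t x, uint32_t y, uint32_t z)
{
    return bfi_ref(x, y, z);
}

/* "majority": where x and y differ the result is z, otherwise it is y. */
uint32_t sha2_maj_bfi(uint32_t x, uint32_t y, uint32_t z)
{
    return bfi_ref(x ^ y, z, y);
}

/* Spot check that both forms agree on a few fixed inputs. */
void check_bfi_identities(void)
{
    const uint32_t v[4] = { 0x00000000u, 0xFFFFFFFFu, 0x12345678u, 0xF0F0A5A5u };
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++)
            for (int k = 0; k < 4; k++) {
                assert(sha2_ch_bfi(v[i], v[j], v[k]) == sha2_ch_plain(v[i], v[j], v[k]));
                assert(sha2_maj_bfi(v[i], v[j], v[k]) == sha2_maj_plain(v[i], v[j], v[k]));
            }
}
```

If the GPU output with quirks enabled differs from these plain forms for the same inputs, that would point at the substitution rather than at the rest of the kernel.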

0 Likes
himanshu_gautam
Grandmaster

Re: Please help with clBuildProgram crash / segmentation fault


mulisak wrote:

* manually overwrite all libs (x86,x64) with *ocl* name with those from app sdk

* ldconfig

* reboot

* run the app as a root

* disable 'quirks' (with '-S', probably just AMD_BFI_INT does not work with 7xxx cards)

tried with::

./oclvanitygen -d0 -v -v -S -i 1Lostxx ### wait about a minute

What did you overwrite all libs with?

Please give more information about what happens when you have the latest APP SDK and Catalyst 13.8beta. Also try using Kernel Analyzer to check kernel compilation for different GPUs.

That way you can figure out whether the compilation issue is specific to one GPU or common to all generations. Also try modifying your kernel until it compiles; this can help you pinpoint the code that may be the culprit.

0 Likes
mulisak
Journeyman III

Re: Please help with clBuildProgram crash / segmentation fault

Hi, I tested again with amd-driver-13.8-beta2 and AMD APP SDK 2.8.1.

When run as a normal user -> core dump:

[sq@private vanitygen.ori]$ ./oclvanitygen -S -i -v -v -d0 1Lostxx

Prefix difficulty:            941151070 1Lostxx

Difficulty: 941151070

Setting of real/effective user Id to 0/0 failed

Setting of real/effective user Id to 0/0 failed

Segmentation fault (core dumped)

When run as root -> incorrect GPU result compared to CPU.

I am sorry for my English, I do not know how to describe it better. Regarding my previous post: I unpacked AMD APP SDK 2.7, took all libs (x86 and x64) that had *ocl* in their name, and overwrote those in the system directories.

I am sorry, I cannot spend more time debugging your drivers; I have to do paid work to pay my invoices.

It worked before with the 12.10 drivers but does not work properly with the 13.xx drivers, and is reported as such.

So please forward this to the developers.

Thanks a lot.

Best regards.

m.

0 Likes
zhaobaoabc
Journeyman III

Re: Please help with clBuildProgram crash / segmentation fault

How can I get it?

0 Likes
himanshu_gautam
Grandmaster

Re: Please help with clBuildProgram crash / segmentation fault

I checked the discussion on GitHub.

The patch was to disable AMD_BFI_INT.

The "cl" code you mentioned uses this define to utilize the "cl_amd_media_ops" function "amd_bytealign".

I wrote 2 kernels: one that does bytealign() using amd_bytealign, and another that does bytealign manually in software.

They both return the correct result. This is on a Pitcairn device (78xx).

Here are the 2 kernels that I tested. They both give the same results.

I checked it for 100000 iterations with various random numbers as "src0", "src1" and "src2".

I tested this on "uint" and not on vectors, because I saw your code uses only scalars.


Can you run this on your failing setup?

/*
    Testing amd_bytealign
    Built-in function
      uintn  amd_bytealign (uintn src0, uintn src1, uintn src2)
    Description
      dst.s0 =  (uint) (((((long)src0.s0) << 32) | (long)src1.s0) >> ((src2.s0 & 3)*8))
      similar operation applied to other components of the vectors
*/
#pragma OPENCL EXTENSION cl_amd_media_ops : enable

/* Hardware path: calls the amd_bytealign built-in from cl_amd_media_ops. */
__kernel void balignHW(uint src0, uint src1, uint src2, __global uint *result)
{
    result[0] = amd_bytealign(src0, src1, src2);
}

/* Software path: the same computation written out with a 64-bit shift. */
__kernel void balignSW(uint src0, uint src1, uint src2, __global uint *result)
{
    result[0] =  (uint) (((((long)src0) << 32) | (long)src1) >> ((src2 & 3)*8));
}
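
To compare the output of these kernels with a host-side value, a CPU reference in plain C could look like the sketch below. It follows the formula in the comment above; the function names (bytealign_ref, bytealign_32, next_random, count_mismatches, check_known_values) and the small random generator are hypothetical and only for illustration. It does not touch the OpenCL runtime at all.

```c
#include <assert.h>
#include <stdint.h>

/* Reference form from the kernel comment: concatenate src0:src1 into
   64 bits, shift right by (src2 & 3) bytes, keep the low 32 bits.
   Unsigned types are used so no sign extension is involved. */
uint32_t bytealign_ref(uint32_t src0, uint32_t src1, uint32_t src2)
{
    uint64_t wide = ((uint64_t)src0 << 32) | (uint64_t)src1;
    return (uint32_t)(wide >> ((src2 & 3u) * 8u));
}

/* The same result using only 32-bit operations. */
uint32_t bytealign_32(uint32_t src0, uint32_t src1, uint32_t src2)
{
    uint32_t bits = (src2 & 3u) * 8u;
    if (bits == 0u)
        return src1;                 /* avoids an undefined shift by 32 */
    return (src1 >> bits) | (src0 << (32u - bits));
}

/* Small linear congruential generator so every run is reproducible. */
uint32_t next_random(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state;
}

/* Run both forms on pseudo-random inputs and count disagreements. */
unsigned count_mismatches(unsigned iterations, uint32_t seed)
{
    unsigned mismatches = 0;
    uint32_t state = seed;
    for (unsigned i = 0; i < iterations; i++) {
        uint32_t a = next_random(&state);
        uint32_t b = next_random(&state);
        uint32_t c = next_random(&state);
        if (bytealign_ref(a, b, c) != bytealign_32(a, b, c))
            mismatches++;
    }
    return mismatches;
}

/* Hand-checked values: src0 = 0x11223344, src1 = 0x55667788. */
void check_known_values(void)
{
    assert(bytealign_ref(0x11223344u, 0x55667788u, 0u) == 0x55667788u);
    assert(bytealign_ref(0x11223344u, 0x55667788u, 1u) == 0x44556677u);
    assert(bytealign_ref(0x11223344u, 0x55667788u, 2u) == 0x33445566u);
    assert(bytealign_ref(0x11223344u, 0x55667788u, 3u) == 0x22334455u);
}
```

Feeding the same src0, src1 and src2 to balignHW, balignSW and bytealign_ref and comparing the three results would show whether a mismatch comes from the built-in or from somewhere else.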

0 Likes