opello
Adept II

clinfo on GX-424CC, works first time then segfaults

Hello.  I'm trying to add OpenCL GPU support to an AMD GX-424CC based embedded system similar to the DB-FT3b-LC.

My target environment is x86_64 Linux 3.18.20 built with gcc 4.9.2 and glibc 2.19.

My initial testing environment is x86_64 Ubuntu 12.04.5 (precise) running Linux 3.13.0-46-generic with fglrx installed from *14.501-0ubuntu1_amd64.deb files generated using amd-driver-installer-14.501.1003-x86.x86_64.run.

The libraries in my target environment are currently those from the fglrx-core deb that I installed in Ubuntu.  This is because when using the files extracted from amd-driver-installer-14.501.1003-x86.x86_64.run directly (using --extract) I only ever saw clinfo detect the CPU.  I think this has to do with files I missed copying because they're in the usr/X11R6 directory after extracting (libatiadlxx.so, libatiuki.so.1).  I noticed another oddity between the generated Ubuntu debs and the files extracted from the source .run package:  the ones from the debs were stripped.  I don't think this should affect the functionality but I haven't yet ruled it out.  I should also say that I'm using 14.12 instead of 15.9 because I ran into more issues with 15.9 and my test environment was constructed when 14.12 was the latest.  I plan to investigate moving forward later on unless I need to sooner as a component of resolving these issues.

Each time I run clinfo from the target environment I get a soft lockup from clinfo for about 23 seconds with more than a few calls from the fglrx module in the stack trace.  I can share some example stack traces if desired.

When I run clinfo in my target environment it works only the very first time after a reboot, despite the following error:

<6>[fglrx] No ADL handler for Escape code 0x00110020

On subsequent runs, all I get for output is "Segmentation fault" and the kernel log shows:

clinfo[105]: segfault at f73 ip 00007fe32145085b sp 00007ffff2ccdc50 error 4 in libamdocl64.so[7fe320dd0000+3840000]

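The IP and SP vary but the library file name is consistently libamdocl64.so and the number after the "+" is consistently 3840000 (in this kernel message that field is the size of the mapping, in hex, rather than an offset).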
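
A note on reading the kernel line above: the bracketed part is the start address and size of the mapping that contains the instruction pointer, so the position of the faulting instruction inside that mapping is `ip - start`. The error code is a bit mask (bit 0: page was present, bit 1: write access, bit 2: user mode), which makes `error 4` a user-mode read of an unmapped address, and the low fault address `f73` looks like a near-NULL pointer dereference. The following is a minimal sketch for decoding such lines; the function name `decode_segfault` is made up for illustration, and the computed offset only matches a file offset in libamdocl64.so if the mapping starts at the beginning of the file, which is an assumption.

```python
import re

# Layout of the x86 "segfault at ..." kernel log line; all numbers are hex.
SEGV_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<lib>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Decode one kernel segfault line; return None if it does not match."""
    m = SEGV_RE.search(line)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    err = int(m["err"], 16)
    return {
        "library": m["lib"],
        "fault_address": int(m["addr"], 16),
        # Position of the faulting instruction relative to the mapping start.
        "offset_in_mapping": ip - base,
        "mapping_size": int(m["size"], 16),
        # Page-fault error code bits.
        "page_present": bool(err & 1),
        "write_access": bool(err & 2),
        "user_mode": bool(err & 4),
    }


line = ("clinfo[105]: segfault at f73 ip 00007fe32145085b sp 00007ffff2ccdc50 "
        "error 4 in libamdocl64.so[7fe320dd0000+3840000]")
info = decode_segfault(line)
print(info["library"], hex(info["offset_in_mapping"]))
```

If the offset stays the same across runs while the IP changes, the crash is at the same instruction each time and only the load address moves.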

When clinfo does work the only variation I see is the Platform ID.  It's worth mentioning that I can unload and reload the fglrx module and it will work the first time after the module is loaded.  Reloading the module slightly changes the soft lockup behavior too.  There is a similar length delay (e.g. `time clinfo` reports 35s real time for the first run, and after reloading the driver reports 17s real time) and similar kernel log messages but no soft lockup after reloading the driver.

When running clinfo in my testing environment it behaves similarly with respect to soft lockups, stack traces, and run time reported by `time` but the ADL handler message and the segfault never happen.  The Platform ID also changes with each run of clinfo.

I plan to work up a minimal test case for my application because I think it fails in one of clGetPlatformIDs, clGetPlatformInfo, or clGetDeviceIDs.  And this failure is irrespective of whether it is the first OpenCL application run or not.  But my first-level benchmark thus far has been whether clinfo behaves consistently.  My goal with this post is to determine the most minimal reasonable environment in which to run my OpenCL application.
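
As a rough idea of what the first step of such a test case could look like, here is a sketch that only calls clGetPlatformIDs through the ICD loader. It uses Python's ctypes purely for illustration (on a busybox-style target a small C program would be the more realistic choice), and the helper name `count_platforms` is hypothetical. The only OpenCL call used is `clGetPlatformIDs(num_entries, platforms, num_platforms)`, asked for the platform count alone.

```python
import ctypes

CL_SUCCESS = 0


def count_platforms(lib_name="libOpenCL.so.1"):
    """Return (status, platform_count), or None if the loader cannot be used."""
    try:
        cl = ctypes.CDLL(lib_name)
        fn = cl.clGetPlatformIDs
    except (OSError, AttributeError):
        # Library not found, or it does not export clGetPlatformIDs.
        return None
    # cl_int clGetPlatformIDs(cl_uint, cl_platform_id *, cl_uint *)
    fn.restype = ctypes.c_int32
    fn.argtypes = [ctypes.c_uint32, ctypes.c_void_p,
                   ctypes.POINTER(ctypes.c_uint32)]
    count = ctypes.c_uint32(0)
    # num_entries=0 and platforms=NULL asks only for the number of platforms.
    status = fn(0, None, ctypes.byref(count))
    return status, count.value


if __name__ == "__main__":
    result = count_platforms()
    if result is None:
        print("could not load the OpenCL ICD loader")
    elif result[0] != CL_SUCCESS:
        print("clGetPlatformIDs failed with status", result[0])
    else:
        print("platforms found:", result[1])
```

A crash inside the vendor library would still take the whole process down, so running this twice in a row after a reboot would be one way to see whether the first-run/second-run difference shows up below clinfo.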

Thanks for your time.

0 Likes
1 Solution
opello
Adept II

It turns out that I needed to have /etc/ati/amdpcsdb.default which is in the fglrx_*.deb but not the fglrx-core_*.deb.

0 Likes
16 Replies

Hi Dan, I've approved your message and added you to the white list of developers. You can now post in any of the developer forums. I've also moved this question into the OpenCL forum.

Thanks!

0 Likes
nibal
Challenger

Try reinstalling your catalyst and sdk.

0 Likes

My installation is quite manual and I haven't been installing the SDK at all.  Is the SDK required?

My installation is basically unpacking the deb and copying files into a directory that becomes an initramfs.  Is there something specific I should look at?  I'm installing fglrx.ko, /etc/OpenCL/vendors/amdocl64.icd, /usr/bin/clinfo, /usr/lib/lib{amdhsasc64.so,amdocl64.so,atiadlxx.so,aticalcl.so,aticaldd.so,aticalrt.so,atiuki.so.1,atiuki.so.1.0,OpenCL.so.1} which seemed like the bare minimum.  I didn't think I needed the ACPI helpers from /etc/acpi, the empty configuration files (/usr/lib/fglrx-core/{,unblacklist_}ld.so.conf), the other scripts and applications I'm not using directly (/usr/sbin/atigetsysteminfo.sh, /usr/bin/{amd-console-helper,atiodcli}), or anything from /usr/share.  I've also not been copying /lib/fglrx/core-modprobe.conf because I don't think busybox modprobe will make use of it, and there are no options to pass to the fglrx driver anyway, so I'm not sure how it would affect the system.

0 Likes

Basically you did it the hard way, and messed up your installation. Clinfo is part of the SDK, and needs the SDK libraries to run.

0 Likes

The clinfo I'm using is distributed as part of the "AMD Catalyst™ 14.12 Proprietary Linux x86 Display Driver" package available here.  Specifically:

amd-catalyst-omega-14.12-linux-run-installers.zip/fglrx-14.501.1003/amd-driver-installer-14.501.1003-x86.x86_64.run/arch/x86_64/usr/bin/clinfo

Exactly the one I'm running is available via the "AMD Catalyst™ 14.12 Proprietary Ubuntu 14.04 x86_64 Minimal Video Driver for Graphics Accelerators (Non-X Support)" deb package available here.  Specifically:

fglrx-core_14.501-0ubuntu1_amd64_UB_14.01.deb/usr/bin/clinfo

It seems to me that clinfo wouldn't be distributed with the driver package if the package didn't contain all of the necessary dependencies.  Could you speculate as to the type of mess I have made which would allow it to run one time and not another?  I see that clinfo is also available in the AMD APP SDK, but then I'm left to wonder which version was distributed with which versions of the drivers, and whether a newer or older version would give more desirable functionality.

0 Likes

Sorry, I don't want to go into the details of your manual installation, only to say that your Catalyst is quite outdated and needs updating. On Linux we are using 15.9. You definitely need the runtime OCL libraries from the SDK, since clinfo uses OCL calls for its reports. Someone from AMD could comment on the merits of including clinfo in Catalyst, but please update to the latest Catalyst and SDK 3.0 and see if that helps. I'm not even sure that old Catalyst version is supported any more...

After all, you have just installed the display driver. You can't expect to develop ocl with just the display driver. You need the SDK as well. clinfo is ocl, not video, and a useful test that your ocl installation is good.

HTH,

Nikos

0 Likes

I can't get clinfo to work with my kernel in 15.9: it doesn't display anything and continually soft locks the kernel.  Which kernel version and distribution are you using?

What exactly are the OpenCL libraries from the SDK?  libOpenCL.so.1 is in the Catalyst driver distribution as I mentioned previously and /etc/OpenCL/vendors/amdocl64.icd references libamdocl64.so.  My understanding is that the *.icd files reference OpenCL Implementations available on the system.  Since I have the corresponding shared object I think I should be good to go.  This is why I'm asking what libraries you're referring to specifically.
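
To make the .icd mechanism concrete: on Linux the ICD loader (libOpenCL.so.1) reads each `*.icd` file in /etc/OpenCL/vendors and treats its contents as the file name or path of a vendor library to load. The sketch below lists those entries and reports any whose library cannot be found. The helper names and the `lib_dirs` search list are assumptions for illustration; the real lookup is done by the dynamic linker and also depends on things like the ld.so cache and LD_LIBRARY_PATH.

```python
import os


def read_icd_entries(vendors_dir="/etc/OpenCL/vendors"):
    """Map each .icd file name to the library name or path it contains."""
    entries = {}
    if not os.path.isdir(vendors_dir):
        return entries
    for name in sorted(os.listdir(vendors_dir)):
        if not name.endswith(".icd"):
            continue
        with open(os.path.join(vendors_dir, name)) as f:
            entries[name] = f.read().strip()
    return entries


def missing_libraries(entries, lib_dirs=("/usr/lib", "/usr/lib64", "/lib")):
    """Return (icd_file, library) pairs whose library is not found.

    lib_dirs is a simplified stand-in for the dynamic linker's search path.
    """
    missing = []
    for icd, lib in entries.items():
        if os.path.isabs(lib):
            found = os.path.exists(lib)
        else:
            found = any(os.path.exists(os.path.join(d, lib)) for d in lib_dirs)
        if not found:
            missing.append((icd, lib))
    return missing


if __name__ == "__main__":
    entries = read_icd_entries()
    for icd, lib in entries.items():
        print(icd, "->", lib)
    for icd, lib in missing_libraries(entries):
        print("not found:", lib, "(referenced by", icd + ")")
```

This only checks that the referenced library file is present; it says nothing about other files the library may open at run time.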

The Installer Notes linked from the 15.9 Catalyst Drivers download page show that for headless use all you need is the fglrx-core-XXXXXX.deb (p.33).  The first paragraph on that page says:

Among the four downloaded deb packages from the AMD Website as described in section 2.1, install only the package that has both the Kernel driver module and the OpenCL™ component modules. These are sufficient for computational purposes.

Which is why I think I should only need the pieces from the fglrx-core deb and not anything from the AMD-APP-SDK.

I am not setting up a development environment but a minimal execution environment in case that wasn't clear from my first message.

0 Likes

opello wrote:

I am not setting up a development environment but a minimal execution environment in case that wasn't clear from my first message.

You are right. Didn't realize that, just about everyone here is a developer.

0 Likes

nibal wrote:
You are right. Didn't realize that, just about everyone here is a developer.

Creating a minimal execution environment doesn't mean that I'm not a developer.  It just means I'm not trying to solve the development use case at the moment.

0 Likes
dipak
Big Boss

Hi,

From your discussion, it seems that it's a Catalyst installation issue. I just want to ask about and share a few points in this regard.

Do you see the same problem if you install the driver normally i.e. with X support?

I guess, you downloaded the catalyst 14.12 from here: http://support.amd.com/en-us/download/desktop/previous/detail?os=Linux%20x86_64&rev=14.12

You know, 14.12 is quite old. You may try the latest driver from here: Linux Download Center

You can also download the necessary dev packages from here Desktop.

BTW, as you said, you're using a G series SoC, I'm not sure whether it has anything to do with that or not. As I checked, I got another link for embedded SoC here: http://support.amd.com/en-us/download/embedded?os=Linux%20x86_64

In addition, I would like to share a few points regarding clinfo and the SDK.

Actually, clinfo comes with both the APP SDK and the Catalyst driver. As the driver gets updated more frequently than the SDK, the Catalyst packages usually contain a more recent clinfo than the SDK does.

In order to run an OpenCL program on a GPU device, you need to install the appropriate Catalyst driver. You don't require the APP SDK for that. The APP SDK is needed for OpenCL development only. Basically the APP SDK provides a CPU runtime and all the necessary header files and libraries required to build an OpenCL program.

Hope I made it clear.

[Edited]

FYI: For Catalyst driver related issues, a support forum is available here: AMD Catalyst Drivers & Software

Regards,

Thanks for the reply.

I never saw the segfault problem in my development environments (Ubuntu 12.04.5 and Ubuntu 14.04.1) which have X available but boot by default to a console environment.

Today I was able to test my custom environment with the contents of all 4 debs (fglrx_*.deb, fglrx-amdcccle_*.deb, fglrx-core_*.deb, fglrx-dev_*.deb) installed manually.  I still get soft lockups but I can run clinfo multiple times without getting a segfault.  I suppose there is some difference between the first run (when /dev/ati/* and whatever else is set up) and the nth run where the segfault occurs without the additional files installed.
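
One way to narrow down which of the additional files makes the difference is to compare the file lists of the two installs. The sketch below assumes the fglrx-core deb and the full set of debs have each been unpacked (for example with `dpkg-deb -x`) into two separate directories; the directory names in the usage comment and the helper names are hypothetical.

```python
import os


def list_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found


def only_in_full(core_root, full_root):
    """Sorted list of files present in the full install but not in core."""
    return sorted(list_files(full_root) - list_files(core_root))


# Example usage with hypothetical extraction directories:
#   for path in only_in_full("extract/core", "extract/all"):
#       print(path)
```

Adding the resulting files to the minimal environment a few at a time, and rerunning clinfo twice after each change, would then bisect the list.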

After I didn't see soft lockups using the exact same environment on a different platform (Sapphire BP-FT3GS) that has the same SoC, I'm concluding that it's something about my hardware platform and not the drivers.

Do you think I should move my inquiry to the other forum you mentioned?  This discussion was moved here for me since I didn't have posting privileges prior to creating the thread.

0 Likes

opello wrote:

After I didn't see soft lockups using the exact same environment on a different platform (Sapphire BP-FT3GS) that has the same SoC, I'm concluding that it's something about my hardware platform and not the drivers.

Do you think I should move my inquiry to the other forum you mentioned?  This discussion was moved here for me since I didn't have posting privileges prior to creating the thread.

I think experts from the related support forums may be more helpful with this kind of general hardware or software issue, whereas this forum, as the name suggests, is mainly focused on OpenCL-specific issues. As you've been white-listed, you can now post to any devguru forum you feel is suitable. So I would suggest checking the relevant forum(s).

Here is a list of a few support forums:

Support Forums

You may even post your query here General Discussions if you don't find any specific one.

Regards,

0 Likes

opello wrote:

I never saw the segfault problem in my development environments (Ubuntu 12.04.5 and Ubuntu 14.04.1) which have X available but boot by default to a console environment.

The same is true for me too. I have been seeing the same issue with `clinfo` on an AMD A10-7870K APU on Ubuntu 14.04.03 LTS (here is a link to the original thread, AMD A10-7870K w/ Catalyst 15.9: `clinfo` throws the `Segmentation fault (core dumped)` after second ...).

Currently, I'm running the `fglrx_15.201-0ubuntu1_amd64_UB_14.01.deb` driver and boot by default to an SSH environment.

0 Likes
opello
Adept II

It turns out that I needed to have /etc/ati/amdpcsdb.default which is in the fglrx_*.deb but not the fglrx-core_*.deb.
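
For anyone building a similar minimal environment, a manifest check along these lines could catch a missing file before boot. The list below combines the files named earlier in this thread with /etc/ati/amdpcsdb.default; it is a sketch based only on what is reported here, not an official list, and fglrx.ko is left out because its install path is not given in the thread. `missing_files` and the staging directory in the usage comment are hypothetical names.

```python
import os

# Paths relative to the root of the staged initramfs tree.
REQUIRED = [
    "etc/OpenCL/vendors/amdocl64.icd",
    "etc/ati/amdpcsdb.default",
    "usr/bin/clinfo",
    "usr/lib/libamdhsasc64.so",
    "usr/lib/libamdocl64.so",
    "usr/lib/libatiadlxx.so",
    "usr/lib/libaticalcl.so",
    "usr/lib/libaticaldd.so",
    "usr/lib/libaticalrt.so",
    "usr/lib/libatiuki.so.1",
    "usr/lib/libatiuki.so.1.0",
    "usr/lib/libOpenCL.so.1",
]


def missing_files(root, required=REQUIRED):
    """Return the required paths that are absent under root.

    lexists is used so that symlinks count as present even when their
    target only resolves after the tree becomes the real root filesystem.
    """
    return [p for p in required if not os.path.lexists(os.path.join(root, p))]


# Example usage with a hypothetical staging directory:
#   for path in missing_files("initramfs-root"):
#       print("missing:", path)
```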

0 Likes

Sigh, this is still an issue even a year later? https://community.amd.com/message/1307390#1307390

0 Likes