Archives Discussions

dmeiser · ‎01-25-2011

Installing AMD OpenCL platform breaks previously working NVIDIA platform.

Hi,

I've been using an NVIDIA OpenCL platform successfully for a while. Then I installed the AMD Stream SDK as well in order to test some of my OpenCL code on the CPU. Now I get weird segmentation faults whenever I query the platforms on the system.

The output from CLInfo is attached.

Any suggestions or comments are appreciated.

Cheers,

Dominic

Number of platforms: 2
  Platform Profile: FULL_PROFILE
  Platform Version: OpenCL 1.1 ATI-Stream-v2.3 (451)
  Platform Name: ATI Stream
  Platform Vendor: Advanced Micro Devices, Inc.
  Platform Extensions: cl_khr_icd cl_amd_event_callback cl_amd_offline_devices

  Platform Profile: FULL_PROFILE
  Platform Version: OpenCL 1.0 CUDA 3.2.1
  Platform Name: NVIDIA CUDA
  Platform Vendor: NVIDIA Corporation
  Platform Extensions: cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll

  Platform Name: ATI Stream
Number of devices: 1
  Device Type: CL_DEVICE_TYPE_CPU
  Device ID: 4098
  Max compute units: 4
  Max work items dimensions: 3
    Max work items[0]: 1024
    Max work items[1]: 1024
    Max work items[2]: 1024
  Max work group size: 1024
  Preferred vector width char: 16
  Preferred vector width short: 8
  Preferred vector width int: 4
  Preferred vector width long: 2
  Preferred vector width float: 4
  Preferred vector width double: 0
  Native vector width char: 16
  Native vector width short: 8
  Native vector width int: 4
  Native vector width long: 2
  Native vector width float: 4
  Native vector width double: 0
  Max clock frequency: 2667Mhz
  Address bits: 64
  Max memory allocation: 1073741824
  Image support: No
  Max size of kernel argument: 4096
  Alignment (bits) of base address: 1024
  Minimum alignment (bytes) for any datatype: 128
  Single precision floating point capability
    Denorms: Yes
    Quiet NaNs: Yes
    Round to nearest even: Yes
    Round to zero: Yes
    Round to +ve and infinity: Yes
    IEEE754-2008 fused multiply-add: No
  Cache type: Read/Write
  Cache line size: 64
  Cache size: 32768
  Global memory size: 3221225472
  Constant buffer size: 65536
  Max number of constant args: 8
  Local memory type: Global
  Local memory size: 32768
  Kernel Preferred work group size multiple: 1
  Error correction support: 0
  Unified memory for Host and Device: 1
  Profiling timer resolution: 1
  Device endianess: Little
  Available: Yes
  Compiler available: Yes
  Execution capabilities:
    Execute OpenCL kernels: Yes
    Execute native function: Yes
  Queue properties:
    Out-of-Order: No
    Profiling : Yes
  Platform ID: 0x7f1002f86880
  Name: Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz
  Vendor: GenuineIntel
  Driver version: 2.0
  Profile: FULL_PROFILE
  Version: OpenCL 1.1 ATI-Stream-v2.3 (451)
  Extensions: cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_device_fission cl_amd_device_attribute_query cl_amd_media_ops cl_amd_popcnt cl_amd_printf

  Platform Name: NVIDIA CUDA
Number of devices: 1
  Device Type: CL_DEVICE_TYPE_GPU
  Device ID: 4318
  Max compute units: 12
  Max work items dimensions: 3
    Max work items[0]: 512
    Max work items[1]: 512
    Max work items[2]: 64
  Max work group size: 512
  Preferred vector width char: 1
  Preferred vector width short: 1
  Preferred vector width int: 1
  Preferred vector width long: 1
  Preferred vector width float: 1
  Preferred vector width double: 0

dmeiser · ‎01-25-2011

This very simple program that just queries the names of the two installed platforms also crashes.

#include "CL/cl.h"
#include <stdio.h>

#define MAX_PLATFORM_NAME 120

int main()
{
    cl_uint numPlatForms;
    cl_int err;
    cl_platform_id *platforms = 0;
    size_t size;
    char name[MAX_PLATFORM_NAME];

    err = clGetPlatformIDs(0, 0, &numPlatForms);
    if (numPlatForms > 0){
        platforms = (cl_platform_id *)malloc(numPlatForms * sizeof(cl_platform_id));
        err = clGetPlatformIDs(2, platforms, 0);
        err = clGetPlatformInfo(platforms[1], CL_PLATFORM_NAME, MAX_PLATFORM_NAME * sizeof(char*), (void*)name, 0);
        printf("Platform #1: %s\n", name);
        err = clGetPlatformInfo(platforms[0], CL_PLATFORM_NAME, MAX_PLATFORM_NAME * sizeof(char*), (void*)name, 0);
        printf("Platform #2: %s\n", name);
        free(platforms);
        platforms = 0;
    }
    return 0;
}

himanshu_gautam · ‎01-25-2011

dmeiser,

Does your clInfo crash at the point where the clInfo output you posted ends?

I am able to run samples on both ATI GPU and NV GPU on my system.

My system config:

SDK 2.3, Catalyst 10.12, NV Driver latest(can't remember the version),

Cypress (ATI GPU) and Tesla (NV GPU)

What is your system configuration?

dmeiser · ‎01-25-2011

Dear Himanshu,

Thank you for your reply.

Yes, clGetPlatformInfo crashes with error code -30 (invalid value). My setup is APP SDK 2.3, NV driver is 260.19.26 (latest), all for 64-bit Linux. I don't have any AMD drivers installed since I don't have an AMD graphics card. Should I install these nonetheless? Even if I plan to run all OpenCL calculations with the AMD platform just on the CPU?

Thanks in advance,

Dominic

laughingrice · ‎07-02-2011

Originally posted by: dmeiser This very simple program that just queries the names of the two installed platforms also crashes.

This is already an old topic which I ran into by mistake so I don't know if I'm not just raising the dead, but if there is still interest in this, I have three points.

1. Personally I have had no problem running NVIDIA, Intel and AMD OpenCL SDKs on a system that has an Intel CPU and NVIDIA GPU without any hint of AMD products (and the same code ran fine on a system with only an AMD CPU and GPU as well, no Intel or NVIDIA, neither product nor SDK). It never mattered what OpenCL.lib I linked against.

2. I've seen the error code that you are reporting (-30, CL_INVALID_VALUE). What I've seen is that not all platforms support all query values, especially since some are extended values and not the standard ones. Even with the standard ones, some of the more exotic query values are not supported. You should change the code to just move on to the next query value when you see that error code instead of returning with an error. In fact, it seems that all platforms are recognized, so it seems like everything should run. Try one of the SDK samples rather than oclDeviceQuery.

3. Your test program has at least one bug that I spotted.
You allocate your platform name buffer as
char name[MAX_PLATFORM_NAME];
but then you pass its size as
MAX_PLATFORM_NAME * sizeof(char*)
when you should pass
MAX_PLATFORM_NAME * sizeof(char)
Notice that you should use char rather than char * in the sizeof. sizeof(char) == 1 while sizeof(char *) == 4 or 8 (depending on whether your code is 32-bit or 64-bit). So you are probably getting a buffer overrun and a resulting segmentation fault (which is weird since the platform name seems to be short enough to work, but still, this is probably the problem).
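
The size mismatch in point 3 can be reproduced without any OpenCL installation. The following is only a minimal sketch in plain C: fill_name is a hypothetical stand-in for a call such as clGetPlatformInfo that is told how many bytes it may write, and it is not part of any SDK. The two helper functions compare the space the array really reserves with the space the original program claimed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PLATFORM_NAME 120

/* Hypothetical stand-in for an API that writes a string into a
 * caller-supplied buffer. Returns 0 on success and -1 when the
 * string plus its terminator does not fit in capacity bytes. */
int fill_name(char *buf, size_t capacity, const char *src)
{
    assert(buf != NULL && src != NULL);
    size_t needed = strlen(src) + 1;
    if (needed > capacity)
        return -1;
    memcpy(buf, src, needed);
    return 0;
}

/* Bytes actually reserved by `char name[MAX_PLATFORM_NAME]`. */
size_t real_capacity(void)
{
    char name[MAX_PLATFORM_NAME];
    return sizeof name;
}

/* Bytes the original program told the callee it could write:
 * 4 or 8 times more than the array really holds. */
size_t claimed_capacity(void)
{
    return MAX_PLATFORM_NAME * sizeof(char *);
}
```

Passing sizeof name (or MAX_PLATFORM_NAME * sizeof(char)) keeps the claimed size and the real size identical, so a callee that respects the size argument can refuse to write past the end of the array.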

MicahVillmow · ‎01-25-2011

dmeiser,
We only support systems that have AMD products in them. Please install NVidia's or Intel's OpenCL SDK software.

nou · ‎01-25-2011

From which SDK do you have libOpenCL.so? Try changing it from AMD to NVIDIA and vice versa. It's just an idea.

dmeiser · ‎01-25-2011

Dear nou,

I linked against both versions of libOpenCL.so, no difference.

Cheers,

Dominic

dmeiser · ‎01-25-2011

Dear Micah,

My system has an AMD CPU. Isn't the ability to run OpenCL code on a system with just a CPU supposed to be one of the nice features of the AMD Stream SDK? To me this is one of the few areas where AMD has a leg up on the competition (NVIDIA's SDK cannot compile OpenCL for the CPU and Intel's SDK is an alpha version that runs only under Windows). Seems a bit odd that AMD would give up that advantage.

Cheers,

Dominic

MicahVillmow · ‎01-25-2011

dmeiser,
Your compute info shows that you have an Intel CPU, not an AMD CPU.
Name: Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz

Since you have neither an AMD CPU nor an AMD GPU, we do not provide any guarantees or support with our SDK.

dmeiser · ‎01-25-2011

Dear Micah,

I'm not expecting any guarantees nor official support from AMD. Just posting in a public forum to see if anybody has any suggestions/solutions. Doesn't have to be a solution that's officially sanctioned by AMD nor would I expect anybody at AMD to invest any time researching my problem.

Out of curiosity, would you really expect the behavior to be any different if my CPU was made by AMD?

Cheers,

Dominic

MicahVillmow · ‎01-25-2011

We verify that our SDK works on AMD products; that is why I expect the behavior to be different. Most likely it is the NV SDK that is conflicting with our SDK. Try running only our SDK and removing their SDK from the system.

dmeiser · ‎01-25-2011

Dear Micah,

Okay will do. Thanks.

Dominic

bubu · ‎01-25-2011

I really think we have a problem here, Houston.

Each IHV is saying "We only support our products, bleh bleh bleh".

The problem is that developers MUST compile the program using a specific SDK (ATI's, NVIDIA's or Intel's), so it will never be 100% compatible and tested to work properly on 3rd-party devices.

I simply propose that Khronos solve the problem: the SDK must be vendor-independent (and it should be open source if possible). It's not very complicated: for Windows, just create a .lib that finds the OpenCL ICD DLL in the registry, loads the DLL with LoadLibrary and gets the function pointers, etc... Khronos should guarantee 100% compatibility with all the implementations.

As programmers, we NEED a platform-independent SDK which is guaranteed to run properly on all the implementations.

jross · ‎01-25-2011

This thread seemed to be missing the point as to why it's failing...

Look at the CLInfo source code. What is happening is that the AMD SDK supports OpenCL 1.1 and Nvidia only supports up to 1.0 with your configuration. There is Nvidia OpenCL 1.1 beta support if you look.

To make it work with Nvidia, compile with the OpenCL 1.0 headers from Nvidia. Or manually go into the CLInfo code and strip out any 1.1 stuff.
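
An alternative to stripping the 1.1 queries by hand is to decide at run time, per platform, from the version string it reports. The sketch below is an illustration under assumptions, not code from CLInfo: parse_cl_version and supports_cl_1_1 are hypothetical helper names, and the input is the string returned for CL_PLATFORM_VERSION, which the OpenCL specification formats as "OpenCL <major>.<minor> <platform-specific information>".

```c
#include <assert.h>
#include <stdio.h>

/* Parse "OpenCL <major>.<minor> ..." into two integers.
 * Returns 1 on success, 0 if the string does not match. */
int parse_cl_version(const char *version, int *major, int *minor)
{
    assert(version != NULL && major != NULL && minor != NULL);
    return sscanf(version, "OpenCL %d.%d", major, minor) == 2;
}

/* Returns 1 when the reported version is at least 1.1, else 0. */
int supports_cl_1_1(const char *version)
{
    int major = 0, minor = 0;
    if (!parse_cl_version(version, &major, &minor))
        return 0;   /* unknown format: be conservative, skip 1.1 queries */
    return major > 1 || (major == 1 && minor >= 1);
}
```

With the two version strings from the CLInfo output posted in this thread, the ATI Stream platform would pass the check and the NVIDIA CUDA platform would not, so a query tool could skip the 1.1-only queries for the latter.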

MicahVillmow · ‎01-25-2011

bubu,
I don't see a problem here at all. It is not that we support only our products, it is that we only support systems that at least have one of our products in them. Which means we support AMD CPU + NV GPU, Intel CPU + AMD GPU, AMD GPU + AMD CPU and any other system that has at least an AMD supported product in it. We don't write software for other companies' products, so we cannot support them. If he wants to develop OpenCL on an Intel CPU and an NV GPU, he needs to install the Intel SDK and the NV SDK.

The API is platform independent, that is all the programmer requires for cross-platform compatibility. If the developer targets a specific version of the API with a valid program, then all vendors that support that version of the API should run the application on their hardware. If you can find a valid application that works on another vendor's OpenCL implementation and has problems with our implementation on our hardware, then let us know and we will fix it.

dmeiser · ‎01-25-2011

Dear Micah,

I see your point. I understand perfectly well that it is not AMD's responsibility to support somebody else's hardware.

That's why I'm not asking AMD support for an official solution. I'm simply asking fellow programmers whether somebody has encountered the same problem and whether they have found a solution.

I believe that it's fairly natural to try to exploit the LLVM-based backend employed in the AMD SDK to compile OpenCL for the CPU, any CPU supporting a sufficient level of SSE.

Thanks again for your input.

Cheers,

Dominic

nou · ‎01-25-2011

You have an error in your code: not sizeof(char*) but only sizeof(char).

And after the preferred vector width it queries the native vector width, which is an OpenCL 1.1 feature, so that is an error too.

Add #undef CL_VERSION_1_1 after the include of cl.hpp so it will not query 1.1 stuff, and recompile it.
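
The effect of that #undef can be seen in isolation. The sketch below assumes, as the suggestion implies, that the 1.1-only queries are wrapped in #ifdef CL_VERSION_1_1; the macro is defined locally here only as a stand-in for what a 1.1 header would provide, and device_queries, device_query_count and device_query_name are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the macro a 1.1 header defines; in a real build it
 * would come from the OpenCL header rather than being defined here. */
#define CL_VERSION_1_1 1

/* The suggested fix: remove the macro right after the include. */
#undef CL_VERSION_1_1

/* Names of the queries this build will issue, as plain strings. */
const char *const device_queries[] = {
    "CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR",
#ifdef CL_VERSION_1_1
    "CL_DEVICE_NATIVE_VECTOR_WIDTH_CHAR",   /* OpenCL 1.1 only */
#endif
};

size_t device_query_count(void)
{
    return sizeof device_queries / sizeof device_queries[0];
}

const char *device_query_name(size_t index)
{
    assert(index < device_query_count());
    return device_queries[index];
}
```

Because the macro is gone before the table is compiled, only the preferred-width query remains. The trade-off is that the 1.1 queries are then dropped for every platform, including those that do support them.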

dmeiser · ‎01-25-2011

Dear nou,

Thanks. I fixed those bugs. It doesn't crash anymore. CLInfo runs without problems.

Cheers,

Dominic

bubu · ‎01-25-2011

Originally posted by: MicahVillmow The API is platform independent, that is all the programmer requires for cross-platform compatibility.

That's not enough because, currently, a program compiled with ATI's SDK won't work with an NVIDIA card (it gives an exception when you try to execute the kernel) ... and that's a big problem because I simply cannot use ATI's SDK to write my multivendor OpenCL app.

I got similar problems trying to run a program compiled with the NV SDK in an ATI ... or if I compile it with the Intel SDK but I run the app with a NV card. So, at this point, it's nearly impossible to develop a platform-independent CL app... and, for me, that's a big problem.

The OpenCL.lib should just find the ICD in the registry, load the DLLs and get the function pointers... so I really cannot understand what you're doing inside that makes an app compiled with your SDK throw that exception on an NV card...

So at this point we have two solutions:

1. Plug a NVIDIA card ( + later Intel, etc...) on your AMD computer and fix the problem in your SDK.... but that's problematic because each CL implementor should test their SDK compatibility against all the 3rd-party implementations.

2. Create a group at Khronos to make a vendor-independent OpenCL.lib SDK which should be certified to run well with all the OpenCL implementations...

All the CL SDKs should be tested to run with all the implementations. Because of that, option #2 is the only feasible one.

MicahVillmow · ‎01-25-2011

bubu,
If you compile for an ARM machine using C, it won't run on an x86 machine, even
though standard C is cross-platform. It is the same concept. You cannot compile the
kernel for an ATI chip and then expect to run it on an NV chip or even on a CPU as
the ISA is different. This is one of the reasons why OpenCL uses online compilation
instead of offline compilation. You only have to compile for the platform/device that
is on the machine at the time of execution and not at the time of development. This
works fine for OpenGL and DirectX and I see no problem with it occurring in OpenCL.

Now your second option seems to be the ICD, which is the model Khronos already uses.
This extension is specified here:
http://www.khronos.org/registr.../cl_khr_icd.txt

This source code is provided by Khronos and we ship it in our SDK, as this is what
OpenCL.dll is. There were some issues with NV using the wrong calling convention in
their original SDK but that has been fixed in their 3.0 SDK according to a post here.

http://www.khronos.org/message...php?f=28&t=2562

This is not an SDK, but just a trampoline that loads up all valid OpenCL implementations installed on the system.
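
The trampoline idea can be pictured with a toy model in plain C. This is only a sketch under stated assumptions: the types toy_dispatch and toy_platform and the two fake vendors are hypothetical, and no real driver is loaded. What it borrows from cl_khr_icd is the central idea that every object handed out by a vendor begins with a pointer to that vendor's table of entry points, so the loader can forward a call without knowing which vendor owns the object.

```c
#include <assert.h>
#include <stddef.h>

struct toy_platform;

/* Table of entry points that each (fake) vendor library exports. */
struct toy_dispatch {
    const char *(*get_platform_name)(const struct toy_platform *);
};

/* Every vendor object starts with a pointer to its dispatch table. */
struct toy_platform {
    const struct toy_dispatch *dispatch;
};

static const char *amd_name(const struct toy_platform *p) { (void)p; return "ATI Stream"; }
static const char *nv_name(const struct toy_platform *p)  { (void)p; return "NVIDIA CUDA"; }

static const struct toy_dispatch amd_table = { amd_name };
static const struct toy_dispatch nv_table  = { nv_name };

/* Stand-ins for the platforms a real loader would discover by
 * enumerating the registered vendor libraries. */
static const struct toy_platform toy_platforms[] = {
    { &amd_table },
    { &nv_table },
};

size_t toy_platform_count(void)
{
    return sizeof toy_platforms / sizeof toy_platforms[0];
}

const struct toy_platform *toy_get_platform(size_t index)
{
    assert(index < toy_platform_count());
    return &toy_platforms[index];
}

/* The "trampoline": the loader's own entry point only forwards
 * the call through the table stored in the object. */
const char *toy_get_platform_name(const struct toy_platform *platform)
{
    assert(platform != NULL && platform->dispatch != NULL);
    return platform->dispatch->get_platform_name(platform);
}
```

In this model the application links only against the forwarding function, and which vendor code runs is decided by the object passed in, not by whose loader library was used at link time.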

bubu · ‎01-25-2011

This source code is provided by Khronos and we ship it in our SDK, as this is what OpenCL.dll is.

And where's the OpenCL.dll source code at Khronos, pls?

MicahVillmow · ‎01-25-2011

From the ICD spec.
" The official source for the ICD. This is currently available
only to Khronos members from our internal Subversion
repository under /repos/OpenCL/trunk/icd/"

bubu · ‎01-25-2011

Originally posted by: MicahVillmow From the ICD spec. " The official source for the ICD. This is currently available only to Khronos members from our internal Subversion repository under /repos/OpenCL/trunk/icd/"

And may I ask why that is not available to everybody? There should be just a few registry calls, LoadLibrary and GetProcAddress() there ... nothing that can affect the IP, or am I wrong?

And why is it compiled into a DLL and not inlined in the .h defs?

MicahVillmow · ‎01-25-2011

You would have to ask Khronos as they decide these issues.

bubu · ‎01-25-2011

Ok, thanks Micah. All clarified.
