4 Replies Latest reply on Jul 22, 2011 4:59 AM by lupescu_grigore

    clCreateCommandQueue segfaults

    kawasakis900

      hello,

      i'm new to OpenCL. i want to run a kernel on an AMD CPU (Phenom 9500 quad-core). i have ATI Stream SDK 2.2 installed on 64-bit Linux, and i have set the shell environment variables as described in the SDK documentation.

       

      the CPU is recognized and i get a valid context. also, the CLInfo sample passes on the "OpenCL 1.1 ATI-Stream-v2.2 (302)" platform.

       

      but when i try to create a command queue, i get a segfault. while debugging, i saw that the crash happens exactly at the clCreateCommandQueue call.

       

      thanks in advance for any help.

        • clCreateCommandQueue segfaults
          himanshu.gautam

          Does the code simply crash at clCreateCommandQueue, or does it return an error code?

          Please post the code snippet.

            • clCreateCommandQueue segfaults
              kawasakis900
              first of all, thanks for the reply.
              Originally posted by: himanshu.gautam Does the code simply crash at clCreateCommandQueue, or does it return an error code?

              as i said, it segfaults: the program never returns from clCreateCommandQueue, so i cannot check the return value. i tried to see what happens, and i was surprised. if i launch the executable directly, it segfaults. if i run it under strace, gdb or valgrind, it works (no segfault). this is very strange. important: this happens ONLY when i create the command queue on the CPU. if i use the GPU, the program doesn't crash. i have an AMD CPU and an NVIDIA GPU, and gcc version 4.4.5.
              Originally posted by: himanshu.gautam Please post the code snippet.

              i have attached the code, but it doesn't make a difference, since the same thing happens when i run a sample from the SDK. at first i thought that i had done something wrong with the driver installation, but this doesn't seem to be the case, since under gdb/strace/valgrind the programs run and give correct results. then i thought that maybe something is wrong with my code (e.g. improper initializations that happen not to crash under a debugger). but as i said, this also happens with the SDK samples, and if i use the GPU device the program simply doesn't crash.

              $ ./HelloCL
              HelloCL!
              Getting Platform Information
              Creating a context AMD platform
              Getting device info
              Loading and compiling CL source
              Segmentation fault

              $ gdb ./HelloCL
              GNU gdb (GDB) 7.0.1-debian
              Copyright (C) 2009 Free Software Foundation, Inc.
              License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
              This is free software: you are free to change and redistribute it.
              There is NO WARRANTY, to the extent permitted by law. Type "show copying"
              and "show warranty" for details.
              This GDB was configured as "x86_64-linux-gnu".
              For bug reporting instructions, please see:
              <http://www.gnu.org/software/gdb/bugs/>...
              Reading symbols from /home/xxx/progz/opencl/ati-stream-sdk-v2.2-lnx64/samples/opencl/cpp_cl/app/HelloCL/build/debug/x86_64/HelloCL...done.
              (gdb) run
              Starting program: /home/xxx/progz/opencl/ati-stream-sdk-v2.2-lnx64/samples/opencl/cpp_cl/app/HelloCL/build/debug/x86_64/HelloCL
              [Thread debugging using libthread_db enabled]
              HelloCL!
              Getting Platform Information
              Creating a context AMD platform
              Getting device info
              Loading and compiling CL source
              [New Thread 0x7ffff7f88710 (LWP 7259)]
              [New Thread 0x7ffff1d76710 (LWP 7260)]
              [New Thread 0x7ffff1564710 (LWP 7261)]
              [New Thread 0x7ffff0d52710 (LWP 7262)]
              [New Thread 0x7ffff0540710 (LWP 7263)]
              Running CL program
              Done
              Passed!
              [Thread 0x7ffff0d52710 (LWP 7262) exited]
              [Thread 0x7ffff1d76710 (LWP 7260) exited]
              [Thread 0x7ffff0540710 (LWP 7263) exited]
              [Thread 0x7ffff1564710 (LWP 7261) exited]
              [Thread 0x7ffff7f88710 (LWP 7259) exited]

              Program exited normally.
              (gdb) q

              /* cltst.c */
              /*
               * gcc -g -Wall -Wextra -Iati-stream-sdk-v2.2-lnx64/include \
               * -o cltst cltst.c ati-stream-sdk-v2.2-lnx64/lib/x86_64/libOpenCL.so
               */

              #include <stdio.h>
              #include <stdlib.h>
              #include <string.h>
              #include <CL/opencl.h>

              #define STR_ERR "FATAL ERROR!"

              #define BAILOUT_MEM(f) \
                  do { \
                      fprintf(stderr, "%s %s: %d: " \
                          "%s: memory allocation failure.\n", \
                          STR_ERR, __FILE__, __LINE__, f \
                      ); \
                      exit(1); \
                  } while(0)

              #define ASSERT_CL_SUCC(errcode, fun1, fun2) \
                  assert_cl_succ(errcode, __FILE__, __LINE__, fun1, fun2)

              #define BAILOUT_CL_NO_PLAT(f) \
                  do { \
                      fprintf(stderr, "%s %s: %d: " \
                          "%s: no valid cl platform found.\n", \
                          STR_ERR, __FILE__, __LINE__, f \
                      ); \
                      exit(1); \
                  } while(0)

              #define BAILOUT_CL_NO_CTX(f) \
                  do { \
                      fprintf(stderr, "%s %s: %d: " \
                          "%s: could not create cl context.\n", \
                          STR_ERR, __FILE__, __LINE__, f \
                      ); \
                      exit(1); \
                  } while(0)

              #define BAILOUT_CL_NO_CMD_QUEUE(f) \
                  do { \
                      fprintf(stderr, "%s %s: %d: " \
                          "%s: could not create cl command queue.\n", \
                          STR_ERR, __FILE__, __LINE__, f \
                      ); \
                      exit(1); \
                  } while(0)

              struct cl_arch_t {
                  cl_platform_id *platforms;
                  cl_uint num_platforms;
              };

              struct cl_ctx_t {
                  cl_device_id *devices;
                  cl_device_type type;
                  cl_platform_id platform;
                  cl_context context;
                  cl_uint num_devices;
              };

              struct cl_cmdqueue_t {
                  cl_context context;
                  cl_device_id device;
                  cl_command_queue queue;
              };

              void assert_cl_succ(cl_int errcode, const char *errfile,
                  const int errline, const char *fun1, const char *fun2)
              {
                  if(CL_SUCCESS!=errcode) {
                      fprintf(stderr, "%s %s: %d: "
                          "CL (%d): %s::%s.\n",
                          STR_ERR, errfile, errline, (int) errcode, fun1, fun2
                      );
                      exit(1);
                  }
              }

              int init_cl_arch(struct cl_arch_t *clarch)
              {
                  const char thisfun[]="init_cl_arch";
                  cl_int errcode;
                  cl_uint num_platforms;

                  memset(clarch, 0x00, sizeof(struct cl_arch_t));
                  errcode=clGetPlatformIDs(0, NULL, &clarch->num_platforms);
                  ASSERT_CL_SUCC(errcode, thisfun, "clGetPlatformIDs()");
                  num_platforms=clarch->num_platforms;
                  if(!num_platforms)
                      return 0;
                  printf("LOG: %u cl platforms found\n", (unsigned int) num_platforms);
                  fflush(stdout);
                  clarch->platforms=(cl_platform_id *) malloc(
                      num_platforms*sizeof(cl_platform_id)
                  );
                  if(!clarch->platforms)
                      BAILOUT_MEM(thisfun);
                  errcode=clGetPlatformIDs(num_platforms, clarch->platforms, NULL);
                  ASSERT_CL_SUCC(errcode, thisfun, "clGetPlatformIDs()");
                  return 1;
              }

              void deinit_cl_arch(struct cl_arch_t *clarch)
              {
                  if(clarch->platforms) {
                      free(clarch->platforms);
                      clarch->platforms=NULL;
                  }
              }

              void * get_cl_contextinfo_alloc(
                  const cl_context_info param_name, const cl_context context)
              {
                  const char thisfun[]="get_cl_contextinfo_alloc";
                  void *res;
                  size_t databytes;
                  cl_int errcode;

                  errcode=clGetContextInfo(context, param_name, 0, NULL, &databytes);
                  res=malloc(databytes);
                  if(!res)
                      BAILOUT_MEM(thisfun);
                  errcode|=clGetContextInfo(context, param_name, databytes, res, NULL);
                  ASSERT_CL_SUCC(errcode, thisfun, "clGetContextInfo()");
                  return res;
              }

              int init_cl_ctx(struct cl_ctx_t *clctx, const cl_device_type device_type,
                  const cl_platform_id platform)
              {
                  const char thisfun[]="init_cl_ctx";
                  cl_int errcode;
                  cl_context_properties cps[3];

                  memset(clctx, 0x00, sizeof(struct cl_ctx_t));
                  clctx->type=device_type;
                  cps[0]=CL_CONTEXT_PLATFORM;
                  cps[1]=(cl_context_properties) platform;
                  cps[2]=0;
                  clctx->context=clCreateContextFromType(
                      cps, device_type, NULL, NULL, &errcode
                  );
                  if(CL_SUCCESS!=errcode)
                      return 0;
                  errcode=clGetDeviceIDs(platform, device_type, 0, NULL,
                      &clctx->num_devices
                  );
                  ASSERT_CL_SUCC(errcode, thisfun, "clGetDeviceIDs()");
                  clctx->platform=platform;
                  printf("LOG: %u devices found on platform\n",
                      (unsigned int) clctx->num_devices
                  );
                  fflush(stdout);
                  clctx->devices=(cl_device_id *) get_cl_contextinfo_alloc(
                      CL_CONTEXT_DEVICES, clctx->context
                  );
                  return 1;
              }

              void deinit_cl_ctx(struct cl_ctx_t *clctx)
              {
                  const char thisfun[]="deinit_cl_ctx";
                  cl_int errcode;

                  if(clctx->context) {
                      errcode=clReleaseContext(clctx->context);
                      ASSERT_CL_SUCC(errcode, thisfun, "clReleaseContext()");
                  }
                  if(clctx->devices) {
                      free(clctx->devices);
                      clctx->devices=NULL;
                  }
              }

              int init_cl_cmdqueue(struct cl_cmdqueue_t *clqueue,
                  const cl_command_queue_properties properties,
                  const cl_context context, const cl_device_id device)
              {
                  const char thisfun[]="init_cl_cmdqueue";
                  cl_int errcode;

                  memset(clqueue, 0x00, sizeof(struct cl_cmdqueue_t));
                  clqueue->queue=clCreateCommandQueue(
                      context, device, properties, &errcode
                  );
                  ASSERT_CL_SUCC(errcode, thisfun, "clCreateCommandQueue()");
                  clqueue->context=context;
                  clqueue->device=device;
                  return 1;
              }

              void deinit_cl_cmdqueue(struct cl_cmdqueue_t *clqueue)
              {
                  const char thisfun[]="deinit_cl_cmdqueue";
                  cl_int errcode;

                  errcode=clReleaseCommandQueue(clqueue->queue);
                  ASSERT_CL_SUCC(errcode, thisfun, "clReleaseCommandQueue()");
              }

              int main(void)
              {
                  const char thisfun[]="main";
                  unsigned int plat_idx, dev_idx;
                  cl_device_type device_type;
                  struct cl_arch_t clarch;
                  struct cl_ctx_t clctx;
                  struct cl_cmdqueue_t clcmdq;

                  plat_idx=0;
                  dev_idx=0;
                  device_type=CL_DEVICE_TYPE_ALL;
                  if(!init_cl_arch(&clarch))
                      BAILOUT_CL_NO_PLAT(thisfun);
                  if(!init_cl_ctx(&clctx, device_type, clarch.platforms[plat_idx]))
                      BAILOUT_CL_NO_CTX(thisfun);
                  if(
                      !init_cl_cmdqueue(&clcmdq, 0, clctx.context,
                          clctx.devices[dev_idx]
                      )
                  )
                      BAILOUT_CL_NO_CMD_QUEUE(thisfun);
                  printf("Init Ok\n");
                  deinit_cl_cmdqueue(&clcmdq);
                  deinit_cl_ctx(&clctx);
                  deinit_cl_arch(&clarch);
                  printf("Deinit Ok\n");
                  return 0;
              }

              /*
               it compiles without warnings/errors.
               as you can see, i use platform 0 (cpu is platform 0, gpu is platform 1).
              */

              $ ./cltst
              LOG: 2 cl platforms found
              LOG: 1 devices found on platform
              Segmentation fault

              $ gdb ./cltst
              GNU gdb (GDB) 7.0.1-debian
              Copyright (C) 2009 Free Software Foundation, Inc.
              License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
              This is free software: you are free to change and redistribute it.
              There is NO WARRANTY, to the extent permitted by law. Type "show copying"
              and "show warranty" for details.
              This GDB was configured as "x86_64-linux-gnu".
              For bug reporting instructions, please see:
              <http://www.gnu.org/software/gdb/bugs/>...
              Reading symbols from /home/xxx/cltst...done.
              (gdb) run
              Starting program: /home/xxx/cltst
              [Thread debugging using libthread_db enabled]
              LOG: 2 cl platforms found
              LOG: 1 devices found on platform
              [New Thread 0x7ffff7f88710 (LWP 7113)]
              [New Thread 0x7ffff213f710 (LWP 7114)]
              [New Thread 0x7ffff192d710 (LWP 7115)]
              [New Thread 0x7ffff111b710 (LWP 7116)]
              [New Thread 0x7ffff0909710 (LWP 7117)]
              Init Ok
              Deinit Ok
              [Thread 0x7ffff192d710 (LWP 7115) exited]
              [Thread 0x7ffff7f88710 (LWP 7113) exited]
              [Thread 0x7ffff213f710 (LWP 7114) exited]
              [Thread 0x7ffff111b710 (LWP 7116) exited]
              [Thread 0x7ffff0909710 (LWP 7117) exited]

              Program exited normally.
              (gdb) q
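
Since the crash disappears under gdb, strace and valgrind, one way to still see where it happens is to print a backtrace from inside the process itself. The following is only a minimal sketch, assuming glibc's backtrace() and backtrace_symbols_fd() from <execinfo.h>. The names dump_backtrace, on_segv and install_segv_backtrace are hypothetical helpers, not part of the SDK or of the attached cltst.c.

```c
#include <execinfo.h>
#include <signal.h>
#include <unistd.h>

#define MAX_FRAMES 64

/* Write the calling thread's stack frames to fd; returns the frame count. */
int dump_backtrace(int fd)
{
    void *frames[MAX_FRAMES];
    int n = backtrace(frames, MAX_FRAMES);

    /* backtrace_symbols_fd() writes directly to fd and does not malloc. */
    backtrace_symbols_fd(frames, n, fd);
    return n;
}

/* SIGSEGV handler: runs in the thread that faulted, prints, then exits. */
static void on_segv(int sig)
{
    static const char msg[] = "caught SIGSEGV, backtrace:\n";
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);

    (void) sig;
    (void) ignored;
    dump_backtrace(STDERR_FILENO);
    _exit(139); /* 128 + 11, the status a shell reports for SIGSEGV */
}

/* Install the handler; returns 0 on success, -1 on failure. */
int install_segv_backtrace(void)
{
    void *warm[1];

    /* First call may load libgcc; do that now, not inside the handler. */
    backtrace(warm, 1);
    return signal(SIGSEGV, on_segv) == SIG_ERR ? -1 : 0;
}
```

Calling install_segv_backtrace() at the top of main(), before init_cl_arch(), and linking with -rdynamic should give symbol names for the program's own frames. Frames inside a stripped library would presumably still appear only as addresses, as in a core-dump backtrace.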

                • clCreateCommandQueue segfaults
                  kawasakis900

                  more info: i also have the nm output and a backtrace of the coredump.

                   

                  two things to take into consideration: 1) in the nm output, all cl functions have versioned names like clCreateCommandQueue@@OPENCL_1.0 (notice the OPENCL_1.0). 2) in the backtrace, the crash comes from libGL.so. i don't have ATI's display driver installed (i have an NVIDIA card); i installed the Stream SDK only for the AMD CPU i own.

                   

                  if someone could help, i would be grateful.

                   

                  $ nm cltst
                  0000000000601798 d _DYNAMIC
                  0000000000601950 d _GLOBAL_OFFSET_TABLE_
                  00000000004013a8 R _IO_stdin_used
                                   w _Jv_RegisterClasses
                  0000000000601778 d __CTOR_END__
                  0000000000601770 d __CTOR_LIST__
                  0000000000601788 D __DTOR_END__
                  0000000000601780 d __DTOR_LIST__
                  0000000000401768 r __FRAME_END__
                  0000000000601790 d __JCR_END__
                  0000000000601790 d __JCR_LIST__
                  0000000000601a00 A __bss_start
                  00000000006019f0 D __data_start
                  0000000000401360 t __do_global_ctors_aux
                  0000000000400a40 t __do_global_dtors_aux
                  00000000006019f8 D __dso_handle
                                   w __gmon_start__
                  000000000060176c d __init_array_end
                  000000000060176c d __init_array_start
                  00000000004012c0 T __libc_csu_fini
                  00000000004012d0 T __libc_csu_init
                                   U __libc_start_main@@GLIBC_2.2.5
                  0000000000601a00 A _edata
                  0000000000601a28 A _end
                  0000000000401398 T _fini
                  00000000004008b8 T _init
                  00000000004009f0 T _start
                  0000000000400ad4 T assert_cl_succ
                  0000000000400a1c t call_gmon_start
                                   U clCreateCommandQueue@@OPENCL_1.0
                                   U clCreateContextFromType@@OPENCL_1.0
                                   U clGetContextInfo@@OPENCL_1.0
                                   U clGetDeviceIDs@@OPENCL_1.0
                                   U clGetPlatformIDs@@OPENCL_1.0
                                   U clReleaseCommandQueue@@OPENCL_1.0
                                   U clReleaseContext@@OPENCL_1.0
                  0000000000601a18 b completed.6341
                  00000000006019f0 W data_start
                  0000000000400cb0 T deinit_cl_arch
                  000000000040107d T deinit_cl_cmdqueue
                  0000000000400f37 T deinit_cl_ctx
                  0000000000601a20 b dtor_idx.6343
                                   U exit@@GLIBC_2.2.5
                                   U fflush@@GLIBC_2.2.5
                                   U fprintf@@GLIBC_2.2.5
                  0000000000400ab0 t frame_dummy
                                   U free@@GLIBC_2.2.5
                  0000000000400ce4 T get_cl_contextinfo_alloc
                                   U getchar@@GLIBC_2.2.5
                  0000000000400b4b T init_cl_arch
                  0000000000400fc7 T init_cl_cmdqueue
                  0000000000400de9 T init_cl_ctx
                  00000000004010e5 T main
                                   U malloc@@GLIBC_2.2.5
                                   U memset@@GLIBC_2.2.5
                                   U printf@@GLIBC_2.2.5
                                   U puts@@GLIBC_2.2.5
                  0000000000601a00 B stderr@@GLIBC_2.2.5
                  0000000000601a10 B stdout@@GLIBC_2.2.5

                  #backtrace from core dump using gdb
                  Core was generated by `./cltst'.
                  Program terminated with signal 11, Segmentation fault.
                  #0 0x00007fb88eb4e83f in ?? () from /usr/lib/libGL.so.1
                  (gdb) bt
                  #0 0x00007fb88eb4e83f in ?? () from /usr/lib/libGL.so.1
                  #1 0x00007fb88eb4ec39 in ?? () from /usr/lib/libGL.so.1
                  #2 0x00007fb88eb5339b in ?? () from /usr/lib/libGL.so.1
                  #3 0x00007fb88eb5394c in ?? () from /usr/lib/libGL.so.1
                  #4 0x00007fb88ff138ba in start_thread (arg=<value optimized out>) at pthread_create.c:300
                  #5 0x00007fb890db002d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
                  #6 0x0000000000000000 in ?? ()
                  (gdb) q
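
The backtrace points into libGL.so.1 although the nm output shows no GL symbols in cltst itself, so it may help to confirm which libGL file the process actually maps once the OpenCL runtime has been loaded. A minimal sketch, assuming Linux's /proc/self/maps is available; count_mapped_lines is a hypothetical helper name, not an SDK function.

```c
#include <stdio.h>
#include <string.h>

/*
 * Count the lines of /proc/self/maps that contain `needle`, echoing each
 * matching line to stderr. An empty needle matches every mapping.
 * Returns -1 if the file cannot be opened (for example, not on Linux).
 */
int count_mapped_lines(const char *needle)
{
    char line[4096];
    int matches = 0;
    FILE *maps = fopen("/proc/self/maps", "r");

    if (!maps)
        return -1;
    while (fgets(line, sizeof line, maps)) {
        if (strstr(line, needle)) {
            fputs(line, stderr); /* shows the full path of the mapped file */
            matches++;
        }
    }
    fclose(maps);
    return matches;
}
```

In cltst.c one could call count_mapped_lines("libGL") after the context has been created. The printed path would then show which package the mapped libGL.so.1 comes from, which might narrow down whether the NVIDIA driver's library is the one being pulled in on the CPU path.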

                    • clCreateCommandQueue segfaults
                      lupescu_grigore

                      I am having exactly the same problem with AMD APP SDK 2.4 (Ubuntu 10.10 x64). I have an Intel CPU (Core 2 E7600) and an Nvidia GPU (GF8600GT). I query each platform and build a list of devices.

                      I get SIGSEGV in clCreateCommandQueue. On any other combination (only on Intel, only on Nvidia GPU, only on AMD CPU, only on AMD GPU) it works.

                      Note also that with an AMD CPU (X2 3600+) and an Nvidia GPU (GF210) it works on both devices. The only trouble I have is when I combine an Intel CPU and an Nvidia GPU and run on the Intel CPU. If I run under GDB, it no longer crashes on the Intel CPU.

                      I tried creating the context both with clCreateContext() and, as in the AMD samples, with clCreateContextFromType(), with no luck.

                       

                      Thank you in advance