9 Replies Latest reply on Dec 30, 2011 6:21 AM by nou

    Parallel kernel building

    d.a.a.

      Hi,

      I'm developing an OpenCL application that assembles a lot of arbitrary kernels at runtime (via genetic programming). Is there a way to build OpenCL kernels in parallel (on the CPU) using, preferably, OpenCL intrinsics? By parallel I mean many kernels concurrently.

        • Parallel kernel building
          nou

          Try calling clBuildProgram() from multiple threads.
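
A minimal sketch of that suggestion, assuming C++11 std::thread is available. `build_all` and `BuildFn` are hypothetical names introduced only for illustration. The callable stands in for the real per-program build step (which would wrap clCreateProgramWithSource() and clBuildProgram()), so the sketch compiles without the OpenCL headers. Whether concurrent builds actually overlap depends on the OpenCL implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the real build step. In an application this
// would wrap clCreateProgramWithSource() + clBuildProgram() and return
// true when the build succeeds.
using BuildFn = std::function<bool(const std::string& source)>;

// Builds all sources using `workers` threads and returns one flag per
// source (1 = build succeeded, 0 = build failed).
std::vector<char> build_all(const std::vector<std::string>& sources,
                            const BuildFn& build_one,
                            std::size_t workers) {
    assert(workers > 0);
    // vector<char> rather than vector<bool>: vector<bool> packs bits, so
    // writing neighbouring elements from different threads would race.
    std::vector<char> ok(sources.size(), 0);
    std::vector<std::thread> pool;
    for (std::size_t w = 0; w < workers; ++w) {
        pool.emplace_back([&, w] {
            // Static striding: worker w handles items w, w + workers, ...
            for (std::size_t i = w; i < sources.size(); i += workers)
                ok[i] = build_one(sources[i]) ? 1 : 0;
        });
    }
    for (auto& t : pool) t.join();  // wait until every build has finished
    return ok;
}
```

Each thread writes only to its own slots of the result vector, so no locking is needed in the sketch itself; any shared state inside the real build step would still need its own protection.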

            • Parallel kernel building
              d.a.a.

              Would it be possible to use a native kernel instead?

                • Parallel kernel building
                  keldor314

                  I had problems building kernels on separate threads - apparently the compiler isn't actually thread safe.  Mind you, this was a while back, but still...

                  The thing that worked for me was to spawn separate processes, one per CPU core, and send them the kernels to compile over pipes.
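
A rough sketch of that process-per-core approach, assuming a POSIX system (fork, pipe, waitpid). It is not keldor314's actual code. `compile_in_processes` and `CompileFn` are hypothetical names, and the compile step is a stand-in for the real clBuildProgram() call. To stay short, each worker reports only its failure count through its exit status; a real version would more likely send the built program binaries back over a second pipe.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for the real compile step (e.g. clBuildProgram()).
using CompileFn = bool (*)(const std::string& source);

// Reads exactly n bytes; returns false on EOF or error.
static bool read_all(int fd, void* buf, std::size_t n) {
    char* p = static_cast<char*>(buf);
    while (n > 0) {
        ssize_t r = read(fd, p, n);
        if (r <= 0) return false;
        p += r;
        n -= static_cast<std::size_t>(r);
    }
    return true;
}

// Writes exactly n bytes; returns false on error.
static bool write_all(int fd, const void* buf, std::size_t n) {
    const char* p = static_cast<const char*>(buf);
    while (n > 0) {
        ssize_t r = write(fd, p, n);
        if (r <= 0) return false;
        p += r;
        n -= static_cast<std::size_t>(r);
    }
    return true;
}

// Worker: reads length-prefixed sources until EOF, compiles each one and
// reports the number of failures (capped at 255) via the exit status.
static void worker(int fd, CompileFn compile) {
    int failures = 0;
    std::uint32_t len = 0;
    while (read_all(fd, &len, sizeof len)) {
        std::string src(len, '\0');
        if (len > 0 && !read_all(fd, &src[0], len)) break;
        if (!compile(src)) ++failures;
    }
    _exit(failures > 255 ? 255 : failures);
}

// Distributes the sources round-robin over `workers` child processes.
// Returns the total number of failed compiles, or -1 if no worker started.
int compile_in_processes(const std::vector<std::string>& sources,
                         CompileFn compile, int workers) {
    std::vector<int> write_fds;
    std::vector<pid_t> pids;
    for (int w = 0; w < workers; ++w) {
        int fds[2];
        if (pipe(fds) != 0) break;
        pid_t pid = fork();
        if (pid < 0) { close(fds[0]); close(fds[1]); break; }
        if (pid == 0) {
            // Child: drop every write end it inherited, so that each
            // worker sees EOF as soon as the parent closes its pipe.
            close(fds[1]);
            for (int fd : write_fds) close(fd);
            worker(fds[0], compile);  // never returns
        }
        close(fds[0]);
        write_fds.push_back(fds[1]);
        pids.push_back(pid);
    }
    if (write_fds.empty()) return -1;

    for (std::size_t i = 0; i < sources.size(); ++i) {
        int fd = write_fds[i % write_fds.size()];
        std::uint32_t len = static_cast<std::uint32_t>(sources[i].size());
        write_all(fd, &len, sizeof len);
        write_all(fd, sources[i].data(), len);
    }
    for (int fd : write_fds) close(fd);  // signals EOF to the workers

    int failures = 0;
    for (pid_t pid : pids) {
        int status = 0;
        waitpid(pid, &status, 0);
        failures += WIFEXITED(status) ? WEXITSTATUS(status) : 1;
    }
    return failures;
}
```

Because every worker is a separate process with its own copy of the compiler state, this sidesteps any thread-safety problem in the compiler, at the cost of having to move sources and results across process boundaries.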

                    • Parallel kernel building
                      d.a.a.

                       

                      Originally posted by: keldor314 I had problems building kernels on separate threads - apparently the compiler isn't actually thread safe.  Mind you, this was a while back, but still...


                      According to the OpenCL 1.1 Specification (Section A.2),  all API calls should be thread-safe, except clSetKernelArg. Were you using the 1.1 spec?

                    • Parallel kernel building
                      himanshu.gautam

                      Hi d.a.a,

                      Does multithreaded clBuildProgram work for you?

                      And by using native kernels you may be able to build the kernels in parallel, but running the native kernels might be problematic since, IIRC, GPUs don't support them yet.

                        • Parallel kernel building
                          d.a.a.

                           

                          Originally posted by: himanshu.gautam Hi d.a.a,

                          Does multithreaded clBuildProgram work for you?

                          I'm investigating the native-kernel way of doing it. I'd like to use OpenCL intrinsics; otherwise I would need a (portable) third-party library for multi-threaded execution.

                          Originally posted by: himanshu.gautam And using native kernels you may be able to build them in parallel, but running them might be problematic as, IIRC, GPUs don't support it yet.


                          Sorry, I don't get it. What exactly is it that GPUs don't support?

                            • Parallel kernel building
                              nou

                              A native kernel is just another function call, and there is no guarantee that it is executed in parallel; an implementation can serialize the execution of native kernels.

                              You should just use a threading library such as pthreads, boost::thread or, with the new C++11, std::thread.

                              GPUs will never support native kernels, because a native kernel must be executed on the host CPU.

                                • Parallel kernel building
                                  d.a.a.

                                   

                                  Originally posted by: nou A native kernel is just another function call, and there is no guarantee that it is executed in parallel; an implementation can serialize the execution of native kernels.


                                  Hi nou,

                                  Even on a multi-core CPU? Does AMD APP v2.6 serialize native kernels? If so, what's the purpose of having native kernels in the first place?

                                  Originally posted by: nou You should just use a threading library such as pthreads, boost::thread or, with the new C++11, std::thread.


                                  I'll check out those options, but I really would like to approach the parallel building via the OpenCL API.

                                  Originally posted by: nou GPUs will never support native kernels, because a native kernel must be executed on the host CPU.


                                  Yes, of course. But that's not an issue since the parallel kernel building would be done on the CPU device.

                                    • Parallel kernel building
                                      nou

                                      You must create multiple queues, otherwise execution will be serialized, because the current AMD APP doesn't support out-of-order queues. Also, even with multiple queues it is not guaranteed that the native kernels will be executed in parallel. And it will be cumbersome: native kernels are not intended for creating threads in your program.