5 Replies Latest reply on Jul 8, 2012 6:26 PM by neverknovvsbestt

    D3D10 Sharing

    neverknovvsbestt

      I'm trying to use the d3d sharing extensions so that I can modify the values of a DirectX10 vertex buffer in an OpenCL kernel and then render the vertices using DirectX10.

       

      I've confirmed that my device supports d3d10 sharing and currently can retrieve function pointers to the sharing functions for d3d10 and call them with no problem. I can also create an OpenCL buffer object from a DirectX10 buffer with no problem.
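For readers who want to repeat the support check described above: the device's extension list comes back from clGetDeviceInfo with CL_DEVICE_EXTENSIONS as a single space-separated string. A minimal sketch of testing that string for a whole-token match follows. The helper name `has_extension` is hypothetical and not part of OpenCL.

```cpp
#include <sstream>
#include <string>

// Returns true if `extensions` (a space-separated list, the format used by
// CL_DEVICE_EXTENSIONS) contains `name` as a complete token.
// `has_extension` is a hypothetical helper name, not an OpenCL function.
bool has_extension(const std::string &extensions, const std::string &name) {
    std::istringstream tokens(extensions);
    std::string token;
    while (tokens >> token) {
        if (token == name) {
            return true;
        }
    }
    return false;
}
```

Comparing whole tokens rather than using a substring search avoids a false match on a longer extension name that merely starts with the same text.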

       

      However, I'm having trouble compiling my kernel code with the pragma that enables the extension:

       

      line 1: error: can't enable all OpenCL extensions or unrecognized OpenCL extension

        #pragma OPENCL EXTENSION cl_khr_d3d10_sharing : enable

                                                                                         ^

      Is there a special way to compile programs that use extensions? Special build options?

       

      Also, I'd like to point out that I'm using clGetDeviceIDs instead of clGetDeviceIDsFromD3D10KHR. Is this an issue? Under what circumstances would I want to use one over the other?

       

      Thank you for your time

        • Re: D3D10 Sharing
          nou

          This was discussed before. My understanding of the spec is that you need to enable an extension via #pragma only when that extension affects the kernel language. Try it without that #pragma.

            • Re: D3D10 Sharing
              neverknovvsbestt

              Thanks nou,

               

              You're correct, it does compile fine without the pragma. However, it doesn't seem to alter the data in the buffer when I execute the kernel... Here are some code snippets (hopefully someone can point out something I'm doing wrong):

               

              Here's most of my setup code

               

              const char *testd3d =
                  "__kernel void testd3d(__global float *x) {\n"
                  "\n"
                  "    // Get the index of the current element to be processed\n"
                  "    int index = get_global_id(0);\n"
                  "    x[index] = x[index] * 0.9f;\n"
                  "}\n";

              // seed for random
              srand((unsigned int)time(0));

              // create d3d buffer
              vertex_buffer = new VertexBuffer(sizeof(VertexTypes::XYZ), LIST_SIZE, d3d10i);
              VertexTypes::XYZ *vertex_data = (VertexTypes::XYZ *)vertex_buffer->StartMapping();
              for(unsigned int i = 0; i < vertex_buffer->GetVertexCount(); ++i) {
                  vertex_data[i].x = (float)(rand() % 100 - 50);
                  vertex_data[i].y = (float)(rand() % 100 - 50);
                  vertex_data[i].z = (float)(rand() % 100 - 50);
              }
              vertex_buffer->StopMapping();

              dx10_buffer.data = vertex_buffer->GetBuffer();
              dx10_buffer.size = vertex_buffer->GetVertexCount();
              status = ocl.BufferCreateFromDX10(dx10_buffer);
              status = ocl.ProgramCreate(program_testd3d, testd3d);
              status = ocl.KernelCreate(kernel_testd3d, program_testd3d, "testd3d");
              
              

               

              This should update the state of the vertices in the buffer

               

              status = ocl.KernelSetInput(kernel_testd3d, dx10_buffer, 0);
              status = ocl.KernelEnqueueNDRange(kernel_testd3d, LIST_SIZE * 3, 64);
              status = ocl.CommandQueueFlush();
              status = ocl.CommandQueueFinish();
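A side note on the enqueue above: when an explicit local work size is passed, OpenCL 1.x requires the global work size to be evenly divisible by it, otherwise clEnqueueNDRangeKernel returns CL_INVALID_WORK_GROUP_SIZE. The thread does not show the value of LIST_SIZE, so whether LIST_SIZE * 3 is a multiple of 64 is unknown. A minimal sketch of rounding the global size up follows. The name `round_up_global_size` is hypothetical.

```cpp
#include <cstddef>

// Smallest multiple of local_size that is >= global_size.
// local_size must be non-zero. Hypothetical helper, not an OpenCL function.
std::size_t round_up_global_size(std::size_t global_size, std::size_t local_size) {
    std::size_t remainder = global_size % local_size;
    return remainder == 0 ? global_size : global_size + (local_size - remainder);
}
```

If the size is padded this way, the kernel would also need a bounds check (for example an extra element-count argument, which the kernel in this thread does not have) so that the padding work-items do not write past the end of the buffer.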
              
              

               

              After that I just render the original d3d buffer using dx calls.

               

              Here are the definitions of some of the above functions:

               

              cl_int OCL::ProgramCreate(Program &program, const char *source) {
                  size_t source_size = strlen(source);
                  program.program = clCreateProgramWithSource(context, 1, &source, &source_size, &this->status);

                  if(this->status == CL_SUCCESS) {
                      this->status = clBuildProgram(program.program, 1, &this->device_id, NULL, NULL, NULL);
                  }

                  return this->status;
              }

              cl_int OCL::KernelCreate(Kernel &kernel, Program &program, const char *entry_function) {
                  kernel.kernel = clCreateKernel(program.program, entry_function, &this->status);
                  return this->status;
              }

              cl_int OCL::KernelSetInput(Kernel &kernel, Buffer &buffer, unsigned int index) {
                  this->status = clSetKernelArg(kernel.kernel, index, sizeof(cl_mem), (void *)&buffer.memory);
                  return this->status;
              }

              cl_int OCL::KernelEnqueueNDRange(Kernel &kernel, size_t global_size, size_t local_size) {
                  size_t global_size_l = global_size;
                  size_t local_size_l = local_size;
                  this->status = clEnqueueNDRangeKernel(this->command_queue, kernel.kernel, 1, NULL, &global_size_l, &local_size_l, 0, NULL, NULL);
                  return this->status;
              }

              cl_int OCL::CommandQueueFlush(void) {
                  this->status = clFlush(command_queue);
                  return this->status;
              }

              cl_int OCL::CommandQueueFinish(void) {
                  this->status = clFinish(command_queue);
                  return this->status;
              }
              
              cl_int OCL::BufferCreateFromDX10(Buffer &buffer) {
                  // Retrieve pointer to extension function
                  clCreateFromD3D10BufferKHR_fn function =
                      (clCreateFromD3D10BufferKHR_fn)clGetExtensionFunctionAddressForPlatform(
                          this->platform_id, "clCreateFromD3D10BufferKHR");

                  // The lookup returns NULL if the platform does not export the function
                  if(function == NULL) {
                      this->status = CL_INVALID_OPERATION;
                      return this->status;
                  }

                  buffer.memory = (*function)(
                      this->context,
                      CL_MEM_READ_WRITE,
                      (ID3D10Buffer *)buffer.data,
                      &this->status
                  );

                  return this->status;
              }
              
              
              
                • Re: D3D10 Sharing
                  Wenju

                  Hi neverknowsbestt,

                  First, make sure that your kernel has actually run, meaning that clEnqueueNDRangeKernel() was really executed. Second, you should check the contents of the buffer before executing the kernel.

                    • Re: D3D10 Sharing
                      neverknovvsbestt

                      Thanks Wenju,

                       

                      clCreateFromD3D10BufferKHR and clEnqueueNDRangeKernel both return CL_SUCCESS when called, but when I render the vertices they don't move from their initial mapped positions.

                       

                      Are there any other methods I can use to ensure that the calls are executing properly?
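One low-effort way to approach this question is to check the status of every call instead of only the two mentioned, since the setup code earlier in the thread overwrites `status` after each step without inspecting it. A minimal sketch follows, assuming only that CL_SUCCESS is 0 and that cl_int is a 32-bit signed integer (plain `int` is used here so the snippet compiles without the OpenCL headers). The name `check_cl` is hypothetical.

```cpp
#include <stdexcept>
#include <string>

// Throws std::runtime_error if `status` is not 0 (the value of CL_SUCCESS).
// `what` names the call that produced the status, for the error message.
// `check_cl` is a hypothetical helper, not part of OpenCL.
void check_cl(int status, const char *what) {
    if (status != 0) {
        throw std::runtime_error(std::string(what) + " failed with status " +
                                 std::to_string(status));
    }
}
```

With the wrapper class from this thread, a call site could look like `check_cl(ocl.CommandQueueFinish(), "clFinish");`. A status check confirms only that the call was accepted, not that the shared buffer was in a state OpenCL was allowed to write to.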

                • Re: D3D10 Sharing
                  neverknovvsbestt

                  I figured it out!

                   

                  I wasn't calling clEnqueueAcquireD3D10ObjectsKHR or clEnqueueReleaseD3D10ObjectsKHR. This is my order of operations now:

                   

                  1. clEnqueueAcquireD3D10ObjectsKHR
                  2. clSetKernelArg
                  3. clEnqueueNDRangeKernel
                  4. clFinish
                  5. clEnqueueReleaseD3D10ObjectsKHR
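The five steps above can be sketched as a single function. To keep the example compilable without the OpenCL and Direct3D headers, it is written against a hypothetical `Api` type whose methods are assumed to forward to the calls named in the comments and return their status. `RecordingApi` is a stand-in that only records the call order. One design choice here is an assumption, not something from the thread: once the acquire has succeeded, the release is always issued, even if a later step fails.

```cpp
#include <string>
#include <vector>

// Runs one kernel launch over a D3D10-shared buffer in the order listed above.
// `Api` is a hypothetical wrapper; each method is assumed to forward to the
// OpenCL call named in its comment and return that call's status (0 = success).
template <typename Api>
int run_on_shared_buffer(Api &api) {
    int status = api.acquire();                      // clEnqueueAcquireD3D10ObjectsKHR
    if (status != 0) {
        return status;                               // nothing acquired, nothing to release
    }
    status = api.set_kernel_arg();                   // clSetKernelArg
    if (status == 0) status = api.enqueue_kernel();  // clEnqueueNDRangeKernel
    if (status == 0) status = api.finish();          // clFinish
    int release_status = api.release();              // clEnqueueReleaseD3D10ObjectsKHR
    return status != 0 ? status : release_status;
}

// Stand-in Api for illustration: records which steps ran and in what order.
struct RecordingApi {
    std::vector<std::string> calls;
    int enqueue_status = 0;  // set to a non-zero value to simulate a failed enqueue

    int acquire()        { calls.push_back("acquire");        return 0; }
    int set_kernel_arg() { calls.push_back("set_kernel_arg"); return 0; }
    int enqueue_kernel() { calls.push_back("enqueue_kernel"); return enqueue_status; }
    int finish()         { calls.push_back("finish");         return 0; }
    int release()        { calls.push_back("release");        return 0; }
};
```

In a real program the Direct3D 10 draw calls would come after the release, since the sharing extension expects a shared object to be acquired before OpenCL commands use it and released before Direct3D uses it again.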

                   

                  Hope this thread helps others with d3d sharing!