1 Reply Latest reply on Jan 20, 2010 8:40 AM by rainysky

    What is the proper way to use multiple GPUs in a single thread

    rainysky

      Does anyone have experience using multiple GPUs from a single thread?
      We ran into a problem when trying to run multiple GPUs concurrently.

      This is what I do:

      1. Prepare input data for GPU #0 with calResMap/calResUnmap
      2. Prepare input data for GPU #1 with calResMap/calResUnmap
      3. Use calCtxRunProgram to start the program on GPU #0
      4. Use calCtxRunProgram to start the program on GPU #1
      5. Use calCtxIsEventDone to wait for the program on GPU #0 to finish
      6. Use calCtxIsEventDone to wait for the program on GPU #1 to finish
      7. Read output data from GPU #0 with calResMap/calResUnmap
      8. Read output data from GPU #1 with calResMap/calResUnmap

      Both GPUs are the same type and I divide the load equally between them. The program needs 2 seconds to finish on each GPU.
      In step 5, calCtxIsEventDone takes 2 seconds to complete. That is fine.
      In step 6, the second calCtxIsEventDone also takes 2 seconds. But I expect the second calCtxIsEventDone to take almost no time if both GPUs run concurrently.

      Any help?

        • What is the proper way to use multiple GPUs in a single thread
          rainysky

          I solved the problem myself.

          The "ATI Stream Computing User Guide" says:
          For improved performance, calCtxRunProgram does not immediately
          dispatch the program for execution on the stream processor. To force the
          dispatch, the application must call calCtxIsEventDone and calCtxFlush on
          the corresponding event.

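          So calCtxRunProgram on its own only queues the work. I have to call calCtxIsEventDone and calCtxFlush for each GPU right after calCtxRunProgram (that is, before step 5) for the program to actually start. Otherwise the program on GPU #1 is not dispatched until step 6 first polls its event, by which time GPU #0 has already finished, so the two runs end up serialized.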