1 Reply Latest reply on Mar 16, 2011 3:20 PM by hisense1


      Why is only the first call to calCtxIsEventDone() non-blocking?

      My code is very similar to the example in the CAL documentation:

      int Run(...)
      {
       CALmodule module = 0;

       calModuleLoad(&module, ctx, image);
       CALfunc func = 0;
       CALname inName, outName, constName;

       // Look up the kernel entry point and the names of its buffers
       calModuleGetEntry(&func, ctx, module, "main");
       calModuleGetName(&inName, ctx, module, "i0");
       calModuleGetName(&outName, ctx, module, "o0");
       calModuleGetName(&constName, ctx, module, "cb0");

       // Bind the memory handles to the kernel's buffers
       calCtxSetMem(ctx, inName, inputMem);
       calCtxSetMem(ctx, outName, outputMem);
       calCtxSetMem(ctx, constName, constMem);

       CALdomain domain = {0, 0, dimx, dimy};
       CALevent e = 0;
       if (calCtxRunProgram(&e, p->ctx, func, &domain) != CAL_RESULT_OK) {
        return 0;
       }

       // First status query after launching the kernel
       int result = calCtxIsEventDone(p->ctx, e);

       // Poll until the kernel has finished
       while (calCtxIsEventDone(p->ctx, e) == CAL_RESULT_PENDING);

       calResMap((CALvoid**)&oRes, &pitch, outputRes, 0);
       // get results

       calModuleUnload(ctx, module);
       return 1;
      }

      void main()
      {
       for (i=0; i<n; i++) {
        Run(...);
       }
      }

      The problem is that only the first call to calCtxIsEventDone is non-blocking,
      i.e. it returns CAL_RESULT_PENDING, so CPU calculations can run in parallel
      with kernel execution. Every other call to calCtxIsEventDone stalls until the
      kernel has fully executed and always returns CAL_RESULT_OK, so no parallel CPU calculations are possible.

      Is this a CAL bug or something else?
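The overlap the post is aiming for (CPU work running while the kernel executes) can be sketched as a polling loop that does one slice of CPU work between status queries. This is only an illustration under stated assumptions: `poll_status`, `poll_fn`, `work_fn`, `overlap_cpu_work`, `fake_event`, `fake_poll` and `count_work` are hypothetical names, and `fake_poll` is a stub standing in for a thin wrapper around calCtxIsEventDone(ctx, e). It shows the intended control flow only and does not reproduce or explain the blocking behaviour described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes standing in for CAL_RESULT_OK / CAL_RESULT_PENDING. */
typedef enum { POLL_OK = 0, POLL_PENDING = 1 } poll_status;

/* Hypothetical callbacks: poll_fn would wrap calCtxIsEventDone(ctx, e) in real
   code; work_fn performs one small slice of CPU-side work. */
typedef poll_status (*poll_fn)(void *user);
typedef void (*work_fn)(void *user);

/* Polls until the event is no longer pending, running one slice of CPU work
   after every PENDING result. Returns the number of slices executed. */
unsigned long overlap_cpu_work(poll_fn poll, void *poll_arg,
                               work_fn work, void *work_arg)
{
    unsigned long slices = 0;
    assert(poll != NULL && work != NULL);
    while (poll(poll_arg) == POLL_PENDING) {
        work(work_arg);
        ++slices;
    }
    return slices;
}

/* Stub event for experimentation: reports PENDING a fixed number of times,
   then OK. It is not part of CAL. */
typedef struct { int remaining; } fake_event;

poll_status fake_poll(void *user)
{
    fake_event *ev = (fake_event *)user;
    if (ev->remaining > 0) {
        ev->remaining--;
        return POLL_PENDING;
    }
    return POLL_OK;
}

/* Stub work slice: just counts how often it was invoked. */
void count_work(void *user)
{
    ++*(unsigned long *)user;
}
```

With a `fake_event` that stays pending three times, `overlap_cpu_work(fake_poll, &ev, count_work, &counter)` runs three work slices. If the real status query blocks, as reported in this thread, the loop body would simply never get a chance to run.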

        • calCtxIsEventDone

          I want to bump this thread and also mention this thread:



          I think there are many more threads about calCtxIsEventDone and calddiGetExport. Can someone from ATI respond to these problems?

          I have also run into this "fake" free CPU while the GPU is computing. The issue is simple: if I run a kernel on 2-4 cards (or 5970 cores), I must call Sleep(1) in every thread after calCtxIsEventDone is called. Then everything is fine and CPU usage stays at 1-10%, but only for a 5970 or 5870 at a 725 MHz clock. When the clock is above that rate, Sleep(1) starts to delay the whole program and card usage is only 80% instead of the 97% seen before. Removing Sleep(1) and just calling calCtxIsEventDone has one simple effect: the full power of a 3 GHz CPU is consumed by the calddiGetExport function inside calCtxIsEventDone. A shorter sleep, on the order of microseconds, changes nothing. The magic threshold looks like this: if I call calCtxIsEventDone, then sleep for 1000 microseconds, then call calCtxIsEventDone again, and so on in a loop, everything works perfectly and the CPU is almost unused, but a card clocked above 725 MHz is only 80% utilized. When I change the sleep to 999 microseconds, CPU usage goes to 100% and the card is loaded to 97%.
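One common compromise between the two extremes described above (pure spinning versus sleeping after every query) is to spin for a bounded number of polls and only then start sleeping. The sketch below illustrates that idea; it is not taken from the CAL documentation. All names (`is_done_fn`, `sleep_fn`, `wait_stats`, `spin_then_sleep_wait`, `done_after_n`, `no_sleep`) are hypothetical, the status query and the sleep are injected as callbacks, and the stubs exist only so the logic can be exercised without a GPU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callbacks: is_done_fn returns nonzero once the event has
   completed (in real code, a wrapper around calCtxIsEventDone); sleep_fn
   would wrap a millisecond sleep such as Sleep(ms). */
typedef int (*is_done_fn)(void *user);
typedef void (*sleep_fn)(unsigned int ms);

typedef struct {
    unsigned long polls;   /* total status queries issued */
    unsigned long sleeps;  /* number of times the thread slept */
} wait_stats;

/* Polls back-to-back for the first spin_limit queries, then sleeps for
   sleep_ms after every further unsuccessful query. */
wait_stats spin_then_sleep_wait(is_done_fn is_done, void *user,
                                unsigned long spin_limit,
                                sleep_fn do_sleep, unsigned int sleep_ms)
{
    wait_stats stats = {0, 0};
    assert(is_done != NULL && do_sleep != NULL);
    for (;;) {
        ++stats.polls;
        if (is_done(user)) {
            break;
        }
        if (stats.polls > spin_limit) {
            do_sleep(sleep_ms);
            ++stats.sleeps;
        }
    }
    return stats;
}

/* Stub: reports "not done" *remaining times, then "done". */
int done_after_n(void *user)
{
    int *remaining = (int *)user;
    if (*remaining > 0) {
        --*remaining;
        return 0;
    }
    return 1;
}

/* Stub sleep that returns immediately, for dry runs. */
void no_sleep(unsigned int ms)
{
    (void)ms;
}
```

Whether this helps in practice depends on how long the kernel runs relative to the sleep granularity; the spin limit and sleep length would have to be tuned per card and clock, which the post suggests is exactly where the difficulty lies.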

          Why does this 1 microsecond make such a big difference? Maybe the smallest effective value is 1 millisecond, and calCtxIsEventDone is non-blocking only when it is called at intervals of at least that long, switching to blocking if I call it more often than every 1 ms? Otherwise, please simply state in the documentation that for a multi-card/multi-thread system one should buy an i7 CPU, because every card thread needs one CPU core.
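One possible explanation for the 999 versus 1000 microsecond cliff lies in plain arithmetic rather than in CAL. The Win32 Sleep() function takes whole milliseconds, and Sleep(0) only yields the rest of the time slice (returning immediately if no other thread is ready). If the microsecond value is converted to milliseconds by integer division before sleeping, 999 becomes 0 and the loop degenerates into a busy spin. That the poster's code performs such a conversion is an assumption; the post does not show its sleep routine. The helper names below are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: truncating conversion, as a wrapper around a
   millisecond-granularity sleep might do it. */
unsigned int us_to_ms_truncate(unsigned int microseconds)
{
    return microseconds / 1000u;
}

/* Alternative: round up, so any nonzero request sleeps at least 1 ms. */
unsigned int us_to_ms_round_up(unsigned int microseconds)
{
    return microseconds / 1000u + (microseconds % 1000u != 0u ? 1u : 0u);
}

/* Worked values for the two cases mentioned in the post. */
void sleep_rounding_examples(void)
{
    assert(us_to_ms_truncate(1000u) == 1u);  /* would become Sleep(1) */
    assert(us_to_ms_truncate(999u) == 0u);   /* would become Sleep(0): yield only */
    assert(us_to_ms_round_up(999u) == 1u);   /* rounding up avoids the spin */
    assert(us_to_ms_round_up(0u) == 0u);
}
```

If this is what happens, the jump to 100% CPU at 999 microseconds would come from the sleep call itself and would say nothing about whether calCtxIsEventDone blocks.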

          So, an honest question for the ATI staff: how does calCtxIsEventDone actually work internally? What is it doing, and why is it described as a non-blocking function when in reality it blocks?