Originally posted by: Raistmer AFAIK a long kernel can lead to a driver restart if it executes for more than 2 seconds under Vista. None of the kernels in my app takes that long. Moreover, I increased the driver restart limit via the registry to 15 seconds. Still, running 4 copies of the app leads to a driver restart from time to time. Why can't the Brook runtime do correct scheduling to avoid making the driver unavailable to the OS watchdog timer for so long?
Are all four instances using the same GPU? For such cases I use one mutex per GPU, so that only one kernel at a time can be started per GPU (the waiting kernels are then served in round-robin fashion).
Originally posted by: Raistmer @Gipsel Do you use a named mutex in the MW opt GPU app? Is it possible to use that mutex to serialize GPU access between a few BOINC apps? If yes, what is its name?
The same mutex names are already used at MW@home as well as Collatz@home. And yes, running Collatz and MW on the same GPU works, although the current Collatz app uses much smaller execution domains than the MW one (and doesn't get the multi GPU stuff right), so the GPU time is not evenly split between those two apps. But that will hopefully change with the next version where I have more influence than on the current one.
But it may become superfluous with the modified client versions from Crunch3r (the modifications are already in the official development versions), as such a client starts only a single instance per GPU and tells the app via a command line parameter ("--device #") which GPU to use (I guess it's the same behaviour as with CUDA).
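To illustrate how an app might pick up that parameter, here is a minimal sketch of scanning the command line for "--device N". The flag name "--device" is taken from the post above; the helper name and the fallback to device 0 when no flag is given are my assumptions, not part of the actual app.

```cpp
#include <cstring>
#include <cstdlib>

// Hypothetical helper: scan argv for "--device N" and return N.
// Falling back to device 0 when the flag is absent is an assumption.
int parse_device_number(int argc, char* argv[]) {
    for (int i = 1; i + 1 < argc; ++i) {
        if (std::strcmp(argv[i], "--device") == 0)
            return std::atoi(argv[i + 1]); // the number follows the flag
    }
    return 0; // no flag given: default to the first GPU
}
```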
Nevertheless, this is the code fragment which constructs the mutex names (as I mentioned one per GPU exists). The mutex names are "Global\\Milkyway_ATI_GPU_App_Mutex#", with "#" being the device number of the used GPU (the "which_device" variable in the code below).
char mutex_name[64];
strcpy(mutex_name, "Global\\Milkyway_ATI_GPU_App_Mutex");
[..]
itoa(which_device, &(mutex_name[strlen(mutex_name)]), 10); // append the device number to construct the mutex name for the chosen GPU
GPU_mutex = CreateMutex(&GPU_secatt, false, mutex_name);   // creates the named mutex, or opens it if it already exists; never acquires it directly
if (GPU_mutex == NULL)                                     // if that fails
{
    GPU_mutex = OpenMutex(MUTEX_MODIFY_STATE, false, mutex_name); // try again with fewer rights
    if (GPU_mutex == NULL)
    {
        cerr << "Couldn't obtain mutex for GPU access!" << endl << flush;
        return(1);
    }
}

// kernel calls are enclosed in the following construct
WaitForSingleObject(GPU_mutex, INFINITE); // obtain the mutex (waiting for the GPU to become available), forever if necessary
GPU_time_s = dtime();
[.. kernel calls ..]
GPU_time += dtime() - GPU_time_s;
ReleaseMutex(GPU_mutex);
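As an aside, the itoa() used above is non-standard. A portable sketch of the same name construction (device number appended to the fixed prefix given in the post) could use snprintf instead; the helper name here is mine, not from the app:

```cpp
#include <cstdio>
#include <string>

// Portable sketch of the mutex-name construction: one name per GPU,
// e.g. device 0 maps to "Global\Milkyway_ATI_GPU_App_Mutex0".
std::string gpu_mutex_name(int which_device) {
    char name[64];
    std::snprintf(name, sizeof(name),
                  "Global\\Milkyway_ATI_GPU_App_Mutex%d", which_device);
    return std::string(name);
}
```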
Originally posted by: Raistmer "&GPU_secatt" Do you use some specific access rights? Wouldn't just NULL do?
I don't use specific rights at the moment, but I was thinking about it, because the default access rights don't allow another user to access the same mutex. That means one can't test the application standalone while another instance is launched by the BOINC client. But it doesn't matter on a normal system.
GPU_secatt.lpSecurityDescriptor = NULL;
GPU_secatt.bInheritHandle = false;
GPU_secatt.nLength = sizeof(GPU_secatt);
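For the cross-user case mentioned above (a standalone test next to a BOINC-launched instance), one known Windows approach is a security descriptor with a NULL DACL, which grants everyone access to the object. This is a Windows-only configuration sketch of my own, not code from the app, and note that a NULL DACL removes all protection on the mutex:

```cpp
#include <windows.h>

SECURITY_DESCRIPTOR GPU_secdesc;
SECURITY_ATTRIBUTES GPU_secatt;

void init_permissive_secatt() {
    InitializeSecurityDescriptor(&GPU_secdesc, SECURITY_DESCRIPTOR_REVISION);
    SetSecurityDescriptorDacl(&GPU_secdesc, TRUE, NULL, FALSE); // NULL DACL: everyone gets access
    GPU_secatt.nLength = sizeof(GPU_secatt);
    GPU_secatt.lpSecurityDescriptor = &GPU_secdesc;             // instead of NULL (default rights)
    GPU_secatt.bInheritHandle = FALSE;
}
```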
Originally posted by: Raistmer
EDIT: much more stable now, only single driver restart so far
Are you testing with Vista or WinXP? I would really like to know if the stability problems with newer drivers under XP are gone with the SDK1.4 you use (at least I guess you are using 1.4).