
freeagle
Journeyman III

Unhandled exception - memory access violation

Hello everyone.

 

I have a program using Brook+ that, in a cycle, uploads some data to the GPU, runs some kernels on the data, and downloads the results. All of these kernels are scatter/gather kernels. The problem is that I get an unhandled exception caused by a memory access violation. I'm almost certain this exception comes from the RPC threads, as it is not thrown from my main or worker threads. I'm also almost certain this is a problem between my program and the GPU, because when I switch to the CPU backend, the problem does not occur.

Under Windows, the exception is always thrown somewhere in the first cycle; under Linux, it occurs at random.

I tried making all my data uploads/downloads pinned, in case the problem was somewhere in the copy from host to PCI memory, but with no luck.

Has anyone had a similar problem? Should I continue to work with the CPU backend and hope the problem is discovered and fixed?

I'm also considering rewriting it in OpenCL, but I'm not sure if it's worth it. If my problem is caused inside Brook+, then maybe. If it's in the CAL layer or the driver, maybe it's worth waiting. What do you think?

Thank you very much

Dalibor

 

PS: I have the latest drivers under both Windows and Linux, CAL 1.4.0, and have tried both Brook+ 1.4.0 and 1.4.1.

2 Replies
gaurav_garg
Adept I

Can you post your code here?


Yes, I can, but I'd rather not post the kernel code...

The methods whose names start with sb* are the kernel calls.

/** stream creation **/
unsigned int mainDim[2] = { init->mainW * puzzle->width(), init->mainH * puzzle->height() };
main[0] = new brook::Stream< uint >( 2, mainDim );
main[1] = new brook::Stream< uint >( 2, mainDim );

unsigned int randomsDim[2] = { init->randW * puzzle->width(), init->randH * puzzle->height() };
randoms = new brook::Stream< uint >( 2, randomsDim );
randomBreeds = new brook::Stream< uint >( 2, randomsDim );
randomMask = new brook::Stream< uint >( 2, randomsDim );

unsigned int rsDim[2] = { init->randW * puzzle->pieceCount(), init->randH };
randomSwitch = new brook::Stream< uint >( 2, rsDim );

unsigned int childDim[2] = { init->childW * puzzle->width(), init->childH * puzzle->height() };
childBreeds = new brook::Stream< uint >( 2, childDim );
childMask = new brook::Stream< uint >( 2, childDim );

unsigned int csDim[2] = { init->childW * puzzle->pieceCount(), init->childH };
childSwitch = new brook::Stream< uint >( 2, csDim );

unsigned int rbFDim[2] = { init->randW, init->randH };
rbFitness = new brook::Stream< uint2 >( 2, rbFDim );
rbFitnessLocal = new brook::Stream< uint >( 2, randomsDim );

unsigned int cbFDim[2] = { init->childW, init->childH };
cbFitness = new brook::Stream< uint2 >( 2, cbFDim );
cbFitnessLocal = new brook::Stream< uint >( 2, childDim );

randomIndices = new brook::Stream< uint2 >( 2, rbFDim );
childIndices = new brook::Stream< uint4 >( 2, cbFDim );

unsigned int copyDim[2] = { init->mainW, init->mainH };
copyIndices = new brook::Stream< uint3 >( 2, copyDim );

/** creating host memory buffers **/
_randomsLocal = ( unsigned int* )pinned_malloc( sizeof( unsigned int ) * _init->farmInit->randW * puzzle->width() * _init->farmInit->randH * puzzle->height() );
_randomIndicesLocal = ( uint2* )pinned_malloc( sizeof( uint2 ) * _init->farmInit->randW * _init->farmInit->randH );
_childIndicesLocal = ( uint4* )pinned_malloc( sizeof( uint4 ) * _init->farmInit->childW * _init->farmInit->childH );
_rbFitnessLocal = ( uint2* )pinned_malloc( sizeof( uint2 ) * _init->farmInit->randW * _init->farmInit->randH );
_cbFitnessLocal = ( uint2* )pinned_malloc( sizeof( uint2 ) * _init->farmInit->childW * _init->farmInit->childH );
_copyIndicesLocal = ( uint3* )pinned_malloc( sizeof( uint3 ) * _init->farmInit->mainW * _init->farmInit->mainH );

/** the cycle **/
// upload new random breeds and wait for it to finish
_farm->randoms->read( _randomsLocal, "nocopy" );
_farm->randoms->finish();

// upload breeding indices
_farm->randomIndices->read( _randomIndicesLocal, "nocopy" );
_farm->childIndices->read( _childIndicesLocal, "nocopy" );

// initiate breeding
sbPrepareSwitch( puzzle->width(), puzzle->height(), *_farm->randomSwitch );
sbPrepareSwitch( puzzle->width(), puzzle->height(), *_farm->childSwitch );
sbBreedRandomsStepOne( puzzle->width(), puzzle->height(), _mainCur, *_farm->randomIndices, *_farm->main[ _mainCur ], *_farm->randoms, *_farm->randomSwitch, *_farm->randomMask );
sbBreedChildStepOne( puzzle->width(), puzzle->height(), _mainCur, *_farm->childIndices, *_farm->main[ _mainCur ], *_farm->main[ _mainCur ], *_farm->childSwitch, *_farm->childMask );
sbBreedRandomsStepTwo( puzzle->width(), puzzle->height(), *_farm->randomIndices, *_farm->main[ _mainCur ], *_farm->randomSwitch, *_farm->randomMask, *_farm->randomBreeds );
sbBreedChildStepTwo( puzzle->width(), puzzle->height(), *_farm->childIndices, *_farm->main[ _mainCur ], *_farm->childSwitch, *_farm->childMask, *_farm->childBreeds );

// initiate fitness calculation
sbCalculateLocalFitness( puzzle->width(), puzzle->height(), *puzzle->piecesStream(), *_farm->randomBreeds, *_farm->rbFitnessLocal );
sbCalculateLocalFitness( puzzle->width(), puzzle->height(), *puzzle->piecesStream(), *_farm->childBreeds, *_farm->cbFitnessLocal );
sbCalculateFitness( puzzle->width(), puzzle->height(), *_farm->rbFitnessLocal, *_farm->rbFitness );
sbCalculateFitness( puzzle->width(), puzzle->height(), *_farm->cbFitnessLocal, *_farm->cbFitness );

// download randoms and children fitness, asynchronously
_farm->rbFitness->write( _rbFitnessLocal, "async nocopy" );
_farm->cbFitness->write( _cbFitnessLocal, "async nocopy" );

// create new randoms
_createRandoms();

// wait for the above download to finish
_farm->rbFitness->finish();
_farm->cbFitness->finish();

// sort fitness and create copy indices
_sorfFitnessCreateCopyIndices();
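
Since the stream dimensions are taken from init while the host buffer sizes are taken from _init->farmInit, one thing that may be worth ruling out is a size mismatch between a stream and the buffer passed to read/write. Below is a minimal sketch of such a check in plain C++. StreamShape, bytesFor, bufferCovers and requireBufferCovers are hypothetical helpers written for illustration; they are not part of Brook+ or CAL, and the check only confirms the element counts agree, nothing about what the runtime does internally.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type (not a Brook+ type): a 2D stream described by
// its two dimensions and the size in bytes of one element.
struct StreamShape {
    unsigned int dims[2];
    std::size_t elemSize;
};

// Bytes a host buffer needs in order to hold every element of the stream.
// The product is computed in std::size_t rather than unsigned int.
inline std::size_t bytesFor(const StreamShape& s) {
    return static_cast<std::size_t>(s.dims[0]) *
           static_cast<std::size_t>(s.dims[1]) *
           s.elemSize;
}

// True when an allocation of allocatedBytes is large enough for the stream.
inline bool bufferCovers(const StreamShape& s, std::size_t allocatedBytes) {
    return allocatedBytes >= bytesFor(s);
}

// Debug-build guard: aborts if the host buffer is smaller than the stream.
inline void requireBufferCovers(const StreamShape& s, std::size_t allocatedBytes) {
    assert(bufferCovers(s, allocatedBytes) && "host buffer smaller than stream");
}
```

Calling requireBufferCovers once per stream, right after the pinned_malloc calls and with the same dimension array that was passed to the stream constructor, would show whether init and _init->farmInit ever disagree. If every check passes, the buffer sizes can be crossed off the list of suspects.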
