cancel
Showing results for 
Search instead for 
Did you mean: 

Archives Discussions

arkanet
Adept I

Possible problem with atomic_cmpxchg?

Jump to solution

Hi guys,

I've been trying to write a kernel which multiplies a row vector into a CSR format stored matrix, but it gives a different result each time it runs. The GPU is AMD Radeon™ HD 7970 Graphics.

#pragma OPENCL EXTENSION cl_khr_global_int32_base_atomics : enable

#pragma OPENCL EXTENSION cl_khr_local_int32_base_atomics : enable

#pragma OPENCL EXTENSION cl_khr_global_int32_extended_atomics : enable

#pragma OPENCL EXTENSION cl_khr_local_int32_extended_atomics : enable

__kernel void MatMulTF(__global float *A, __global int *row, __global int *col, __global float *Rs, volatile __global float *As) {   

    int gid=get_global_id(0);

    int j=row[gid];   

    int a1=row[gid+1];

    float Rsgid=Rs[gid];

    float rp;

    int colj;

    volatile __global unsigned int *pAs;

    union {

        unsigned int intVal;

        float floatVal;

    } newVal, prevVal;

   

    for (;j<a1; j++) {

        rp=A*Rsgid;

        colj=col;

        pAs=(volatile __global unsigned int *) (&As[colj]);

        do {

            prevVal.floatVal = As[colj];

            newVal.floatVal = prevVal.floatVal + rp;

        } while (atomic_cmpxchg(pAs, prevVal.intVal, newVal.intVal) != prevVal.intVal);       

    }   

}

Can anybody help me? thanks.

Tags (2)
0 Likes
1 Solution

Accepted Solutions
himanshu_gautam
Grandmaster

Re: Possible problem with atomic_cmpxchg?

Jump to solution

Check the SrSum value it is a 9 digit number for me. compared to which 32 is negligible.

View solution in original post

0 Likes
15 Replies
arkanet
Adept I

Re: Possible problem with atomic_cmpxchg?

Jump to solution

Seriously? No answer?

0 Likes
himanshu_gautam
Grandmaster

Re: Possible problem with atomic_cmpxchg?

Jump to solution

Should not the while loop condition be flipped? i.e. while (atomic_cmpxchg(pAs, prevVal.intVal, newVal.intVal) == prevVal.intVal);      

0 Likes
himanshu_gautam
Grandmaster

Re: Possible problem with atomic_cmpxchg?

Jump to solution

Also, I fear your code might suffer from multiple problems - Non-coalesced memory accesses + Excessive atomics.

Hope you are running on the latest generation of AMD cards - which handle atomics better. (like 7970)

arkanet
Adept I

Re: Possible problem with atomic_cmpxchg?

Jump to solution

Thanks,

But I think the code is right despite its several drawbacks which are not my concern for now.

I like to loop while the old value read by atomic_cmpxchg is not equal to previously read value used to calculate the sum. (Reference: here).

The odd thing is, the code gives the right values for first iterations, but then, it goes wrong. In each run, it gives a different answer! I re-wrote it in several ways, but the problem remains. I thought I have the same problem as: Possible bug with atom_cmpxchg, but I was wrong.

PS: I have a 7970.

Any Ideas? The problem is related to atomic_cmpxchg. I'm sure of it.

0 Likes
himanshu_gautam
Grandmaster

Re: Possible problem with atomic_cmpxchg?

Jump to solution


The while() condition that you had used is correct. Sorry about the confusion.

Right now, I am unable to edit my answer... So, I will do that when "edit" starts working...

Can you post your code here?

I remember a thread on image-processing where they were seeing wrong results.....Sounds similar.

Can you post a zip attachment? That will be useful.

0 Likes
arkanet
Adept I

Re: Possible problem with atomic_cmpxchg?

Jump to solution

Thanks for your attention,

The code is very long and complicated and about 5000 lines...which part do you need? I can email it to you (It's somehow confidential). However, as I mentioned only this kernel have problem, since when I replace it with a serial code, it works correctly. (this kernel multiplies a row vector into a Compressed Sparse Row matrix)

Would you please place the link to that thread about image-processing?

Thank you in advance.

0 Likes
himanshu_gautam
Grandmaster

Re: Possible problem with atomic_cmpxchg?

Jump to solution

If this kernel alone is the problem -- Can you make a small repro case (standalone code that can reproduce)  with this kernel alone?

It might be of immense help in debugging this problem.


Here is the link on image processing:

AMD 79xx GPUs skip kernel execution for certain indices

arkanet
Adept I

Re: Possible problem with atomic_cmpxchg?

Jump to solution

The sample is a numerical case, and has iterations. The problem won't be seen in the one iteration. to build a repro case, you'll need the whole program. I can mail it to you, but I can not put it in the forum.

I think the problem is a cache-flush related problem in the atomic operation, since the condition in the while(atomic_cmpxchg(pAs, prevVal.intVal, newVal.intVal) != prevVal.intVal) seems to be always false (loop is always executed only one time).

0 Likes
himanshu_gautam
Grandmaster

Re: Possible problem with atomic_cmpxchg?

Jump to solution

Interestingly, Timchist - who started the thread I have referred above - has come back saying Catalyst driver 13.6 has fixed the issue.

Can you try your example with 13.6 - if thats not a huge ask.

0 Likes