Archives Discussions

marblecanyon · ‎02-18-2012

Hi, everyone

I'm implementing an algorithm for the computation of the Chrestenson spectrum that is composed of complex numbers.

First I implemented the following kernel with hard-coded values and it returns correct results:

__kernel void fastVCT(__global float2 *vector,
                      uint step)
{
    const uint tid = get_global_id(0);

    // indices of the three operands handled by this work-item
    uint op1 = tid % step + 3 * step * (tid / step);
    uint op2 = op1 + step;
    uint op3 = op2 + step;

    float2 u = vector[op1];
    float2 v = vector[op2];
    float2 w = vector[op3];

    vector[op1] = u + v + w;
    vector[op2] = u - 0.5f*v.x + 0.866025f*v.y - 0.5f*w.x - 0.866025f*w.y;
    vector[op3] = u - 0.5f*v.x - 0.866025f*v.y - 0.5f*w.x + 0.866025f*w.y;
}
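For anyone following the indexing, here is a minimal host-side C sketch of the same index expression, so the mapping from work-item to operands can be checked off the GPU. The names `butterfly_base` and `radix` are made up for illustration (the kernel above hard-codes a radix of 3); this is not code from the original program.

```c
/*
 * Index of the first operand that work-item `tid` combines.
 * Mirrors the kernel expression tid % step + 3 * step * (tid / step),
 * with the hard-coded 3 replaced by `radix` (a hypothetical parameter).
 * The remaining radix - 1 operands follow at strides of `step`.
 */
static unsigned int butterfly_base(unsigned int tid, unsigned int step,
                                   unsigned int radix)
{
    return tid % step + radix * step * (tid / step);
}
```

With a radix of 3 and step = 1, work-items 0, 1 and 2 handle elements {0, 1, 2}, {3, 4, 5} and {6, 7, 8}. With step = 3, work-items 0, 1 and 2 handle {0, 3, 6}, {1, 4, 7} and {2, 5, 8}, and work-item 3 starts the next group at element 9.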

Afterwards, I went on to create a more general version of the kernel. The parameter P is passed with the compiler option "-DNAME=SIZE", for example "-DP=3", and the multiplication factors are pre-computed in the host program and passed in as factor:

__kernel void fastVCT(__global float2 *vector,
                      __global float2 *factor,
                      const unsigned int step)
{
    const unsigned int tid = get_global_id(0);
    unsigned int i = 0, j = 0;
    float2 u[P];
    float2 vec[P];

    const unsigned int op = tid % step + P * step * (tid / step);

    for (i = 0; i < P; i++)
    {
        u[i] = vector[op + i * step]; //fetching the appropriate values from global buffer
        vec[i] = (float2)(0.0f); //just to initialize values since it will be used as accumulator
    }

    for (i = 0; i < P; i++)
    {
        for (j = 0; j < P; j++)
        {
            vec[i] += factor[i * P + j] * u[j]; //doing the actual computation
        }
    }

    for (i = 0; i < P; i++)
    {
        vector[op + i * step] = vec[i]; //writing the final results back to global buffer
    }
}

The problem with this second version of the kernel is that all of the imaginary parts in vector are returned as 0.0f, while the real part is computed correctly. I checked the values for vector and factor that are being passed and they seem correct.

I would greatly appreciate it if someone could point me to a possible source of the error. I've been trying to find it for quite a while now without success.

notzed · ‎02-19-2012

There should be nothing wrong with your array stuff from what I can see.

My guess is something to do with the host code, and perhaps the initialisation of factor. I know you said you checked this though ...

marblecanyon · ‎02-20-2012

I found the error - I didn't implement the equation for the computation of vec[] correctly. It should be:

vec += (float2)(ch[i * P + j].x * op.x - ch[i * P + j].y * op.y, ch[i * P + j].x * op.y + ch[i * P + j].y * op.x) ;

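since the product of the two imaginary components includes i * i = -1 and therefore contributes to the real part with a negative sign. Multiplying two float2 values with * is only component-wise, so it never mixes the real and imaginary components.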
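To make the difference concrete, here is a small host-side C sketch that contrasts the two products. It assumes that `ch` in the corrected line corresponds to `factor` and `op` to the value fetched from `vector`, which the thread does not state explicitly. The type `cfloat` and both function names are made up for illustration.

```c
/* Host-side stand-in for OpenCL's float2: x = real part, y = imaginary part. */
typedef struct { float x, y; } cfloat;

/* What `a * b` computes for two float2 values in OpenCL C:
 * each component is multiplied separately. */
static cfloat mul_componentwise(cfloat a, cfloat b)
{
    cfloat r = { a.x * b.x, a.y * b.y };
    return r;
}

/* Complex product (a.x + i*a.y) * (b.x + i*b.y), as in the corrected line. */
static cfloat mul_complex(cfloat a, cfloat b)
{
    cfloat r = { a.x * b.x - a.y * b.y,   /* real part: i * i = -1 */
                 a.x * b.y + a.y * b.x }; /* imaginary part: cross terms */
    return r;
}
```

For a factor of (-0.5, -0.866025) and a purely real input (2, 0), the component-wise product gives (-1, 0) while the complex product gives roughly (-1, -1.73205). The real parts agree and the imaginary part is lost, which would match the reported symptom if the input vector was purely real.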

Computation of the imaginary part of the Chrestenson spectrum