OpenCL

vmetodiev
Adept II

Counter inside an OpenCL kernel

Dear Community,

I am trying to write an OpenCL kernel that counts the matching elements between two arrays. However, I do not get the expected output.

The kernel looks like this:

__kernel void matchVectors(
	__global const uchar *a, 
	__global const uchar *b,
	volatile __global uchar *c) 
{		
	const int i = get_global_id(0);
	if ( ( a[i] ^ b[i] ) == 0 ) 
	{
		atomic_inc(c, 1);
	}	
}

a (input) - string, 10 bytes (chars) long
b (input) - string, 10 bytes (chars) long
c - single byte (the counter that should indicate how many indexes match between a[i] and b[i])

And below are some parts of the main.c (non-OpenCL) code:

#define ELEMENT_TYPE uint8_t
#define SIZE 10

ELEMENT_TYPE c = 0;

cl_mem aMemObj = clCreateBuffer(context, CL_MEM_READ_ONLY, SIZE * sizeof(ELEMENT_TYPE), NULL, &ret);
cl_mem bMemObj = clCreateBuffer(context, CL_MEM_READ_ONLY, SIZE * sizeof(ELEMENT_TYPE), NULL, &ret);
cl_mem cMemObj = clCreateBuffer(context, CL_MEM_WRITE_ONLY,       sizeof(ELEMENT_TYPE), NULL, &ret);

// Copy lists to memory buffers
ret = clEnqueueWriteBuffer(commandQueue, aMemObj, CL_TRUE, 0, SIZE * sizeof(ELEMENT_TYPE), a, 0, NULL, NULL);
ret = clEnqueueWriteBuffer(commandQueue, bMemObj, CL_TRUE, 0, SIZE * sizeof(ELEMENT_TYPE), b, 0, NULL, NULL);
		
// Execute kernel
ret = clEnqueueNDRangeKernel(commandQueue, kernel, clDimensions, NULL, &globalItemSize, &localItemSize, 0, NULL, NULL);
	
// Get the result
ret = clEnqueueReadBuffer(commandQueue, cMemObj, CL_TRUE, 0, sizeof(c), (void *)&c, 0, NULL, NULL);

// Write result
printf("Result: c = %u\n", c);

Could anyone advise what is wrong?
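
For checking whatever the kernel returns, the expected count can be computed on the CPU first. The sketch below is not from the original post; `count_matches_ref` is a hypothetical helper name, and it assumes both inputs hold at least `n` bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CPU reference (not part of the original post): counts the
   positions at which a[i] equals b[i], i.e. the value the kernel's counter
   is expected to hold after the launch. */
static uint32_t count_matches_ref(const uint8_t *a, const uint8_t *b, size_t n)
{
    uint32_t count = 0;
    for (size_t i = 0; i < n; ++i) {
        if ((a[i] ^ b[i]) == 0) {   /* same equality test as in the kernel */
            ++count;
        }
    }
    return count;
}
```

Comparing `count_matches_ref(a, b, SIZE)` with the value read back from the device makes it easier to tell a wrong count from a count that was never written.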

vmetodiev
Adept II

*elements between two arrays

/For some reason, I am not able to edit my initial post due to an "HTML error", sorry for the mistake!/

dipak
Big Boss

As described here, the atomic_inc functions do the following:

"Read the 32-bit value (referred to as old) stored at location pointed by p. Compute (old + 1) and store result at location pointed by p. The function returns old."

So, please try to use "int" or "unsigned int" as the type of the counter argument when calling the atomic function.

Thanks.
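
A minimal sketch of the kernel with that suggestion applied, kept as a host-side C string of the kind passed to clCreateProgramWithSource. Two things differ from the posted kernel: the counter is a 32-bit `uint`, and `atomic_inc` is called with its single pointer argument. The name `match_vectors_src` is hypothetical, and the sketch assumes the global work size equals the number of elements, so it has no bounds check.

```c
/* OpenCL C source held as a host-side string (hypothetical variable name).
   Assumption: global work size == number of elements, so no bounds check
   on i is included. */
static const char match_vectors_src[] =
    "__kernel void matchVectors(\n"
    "    __global const uchar *a,\n"
    "    __global const uchar *b,\n"
    "    volatile __global uint *c)   /* 32-bit counter */\n"
    "{\n"
    "    const int i = get_global_id(0);\n"
    "    if ( ( a[i] ^ b[i] ) == 0 )\n"
    "    {\n"
    "        atomic_inc(c);           /* takes only the counter pointer */\n"
    "    }\n"
    "}\n";
```

On the host side, a kernel like this would need a matching 4-byte counter: a `cl_uint` set to 0 and copied into the counter buffer before the launch, a buffer created with CL_MEM_READ_WRITE (atomic_inc both reads and writes it), and a read-back of `sizeof(cl_uint)` bytes. Whether these are the changes that resolved this particular case is not visible in the thread.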

vmetodiev
Adept II

Hi @dipak ,

I tried what you suggested, both int and unsigned int. But unfortunately, the behaviour is still the same.

May I send you the code, if possible (as a private message)?

Thanks and regards.

dipak
Big Boss

Sure, you can provide the reproducible code. Also, please provide the following information:

1) Setup details such as OS, GPU, driver version, etc.

2) clinfo output

Thanks.

vmetodiev
Adept II

Hi @dipak

I am neither able to attach the archive here in the post, nor can I see a button in your profile to send you a private message.

Could you please advise how to proceed?

Thanks.

dipak
Big Boss

As you joined recently, I think some of these community features are not enabled yet. 

To share the repro and other information, you can upload them to any file hosting site and provide us the download link. 

Thanks.

vmetodiev
Adept II

dipak
Big Boss

Thanks for sharing the download link. I will check the files and get back to you.

dipak
Big Boss

Can you please try the modified main.cpp file attached here to see if it is working for you?

Thanks.

vmetodiev
Adept II

Yes, amazing... now it works perfectly!

Thank you very much, @dipak ! I really appreciate your time and effort on my issue!
