1. The loop
int len = 32;
while (len > 0) {
    len -= 16;
}
This loop is supposed to exit at the third check of the condition, but it never ends.
After the second iteration, len is 0 following len -= 16.
However, when execution reaches (len > 0) again, len has become a very large number.
Why does this happen?
2.
__kernel
void new(__global uchar *output,
         __global uchar *input,
         __constant uint *offset) {
    __global uchar *out = offset[idx] + output;
}
The pointers output and input both have the value 0. Is this expected?
The value of the variable "out" is also wrong. What is the right way to process different parts of a string?
The test environment is Windows 7 with Visual Studio 2010.
Thanks.
The source code is attached.
The first question concerns line 826 of AESEncryptDecrypt_Kernel.cl,
and the second concerns line 816.
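1. Are you sure you're using int, and not uint or unsigned int? For the unsigned types that behaviour is expected: an unsigned value decremented below zero wraps around to a very large number, so len > 0 stays true.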
2. I don't really know what you're asking here with respect to the 0 values. As for out, you could also write out = &output[offset[idx]], but either form should work.
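To illustrate the point about the two spellings, here is a small host-side C sketch rather than the kernel from the attachment. The function names, the count parameter, and the bounds check are hypothetical additions for illustration. In the kernel, idx would presumably come from something like get_global_id(0), which is also an assumption since the posted fragment does not define it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: start of the part selected by offset[idx],
   written as pointer addition, as in the posted kernel fragment. */
const unsigned char *part_by_addition(const unsigned char *output,
                                      const unsigned int *offset,
                                      size_t count, size_t idx)
{
    assert(idx < count);  /* bounds check; not in the original fragment */
    return offset[idx] + output;
}

/* Hypothetical helper: the same address, written with indexing. */
const unsigned char *part_by_index(const unsigned char *output,
                                   const unsigned int *offset,
                                   size_t count, size_t idx)
{
    assert(idx < count);  /* bounds check; not in the original fragment */
    return &output[offset[idx]];
}
```

Both functions return the same address for the same arguments, so if out looks wrong in the kernel, the contents of offset or the value of idx are the more likely suspects.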
In general, it helps a great deal to include the complete source, not just a made-up representative fragment in which you may have made a typo or left out something important.
For kernel invocation issues, the host code is needed as well.
Thanks, I have uploaded the source code. Would you mind taking a look?
The first question concerns line 826 of AESEncryptDecrypt_Kernel.cl,
and the second concerns line 816.
Thanks.
Although you set the content of pkt_offset on the host and bind it to the OpenCL buffer, you never write its content to the device. You still need to call clEnqueueWriteBuffer on the content at least once. (I'm fairly sure; I never use USE_MEM_PTR, but as far as I can tell it is only used to pass a pointer to OpenCL, not the content.) The same goes for any other such buffers.
So the kernel is just getting junk for anything that uses pkt_offset or any similar buffer, which would explain both of your problems.
I set up the OpenCL buffers the same way the AMD OpenCL samples do. I modified the AESEncryptDecrypt sample, which uses the same approach for rKeyBuffer, sBoxBuffer, and the related buffers.
In fact, clEnqueueWriteBuffer is not needed, and the content of pktOffsetBuffer is correct (I debugged it with gDEBugger and saw the right content in the kernel). What's more, even when I use clEnqueueWriteBuffer as you advised, the results are still wrong.
For question one, does it make much difference whether len is an int or a uint? When len becomes zero, the loop should exit.
Thanks.
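On the int versus uint question, a minimal host-side C sketch may help. It is not the kernel from the attachment, and the function names and the max_iters cap are hypothetical. It counts how often the loop body runs for a signed and an unsigned counter. When the starting length is an exact multiple of the step, as with 32 and 16, both versions reach 0 and exit. The difference only shows when the subtraction would pass below zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the loop from question 1 with a signed counter.
   Returns how many times the body ran. max_iters caps the loop so the
   demonstration always terminates. */
int iterations_signed(int32_t len, int32_t step, int max_iters)
{
    assert(step > 0);  /* a zero step would never make progress */
    int n = 0;
    while (len > 0 && n < max_iters) {
        len -= step;   /* may go negative, which ends the loop */
        ++n;
    }
    return n;
}

/* Hypothetical helper: the same loop with an unsigned counter. */
int iterations_unsigned(uint32_t len, uint32_t step, int max_iters)
{
    assert(step > 0);
    int n = 0;
    while (len > 0 && n < max_iters) {
        len -= step;   /* wraps modulo 2^32 when step > len */
        ++n;
    }
    return n;
}
```

With a start of 32 and a step of 16, both functions return 2. With a start of 40 and a step of 16, the signed version returns 3 (40, 24, 8, -8), while the unsigned version wraps from 8 to 4294967288 and keeps running until it hits the max_iters cap. So a loop that never ends with a start of exactly 32 suggests that len does not actually hold 32 when the kernel runs.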
Well, add some printfs and see what the values really are *in all threads*.
Thanks, I have solved this. Something was wrong in the host code that I hadn't noticed :-)