Hi all,
I'm trying to implement parallel cuckoo hashing in OpenCL (GPU only), based on Dan Alcantara's dissertation "Real-time Parallel Hashing on the GPU", which is available from his homepage.
Since it is part of a bigger project, I decided to create a standalone application. While doing so I discovered additional errors in the standalone version, so I'm going to post those first.
So the problem is in this piece of code:
prefixsum(BCount, BStart, totalbucks); // do the prefix sum to get the starting index for each bucket
barrier(CLK_GLOBAL_MEM_FENCE);
if (gid < SIZE) {
    // Store the key and its index in the new global position
    offsets[gid].x = offset;
    offsets[gid].y = bucket_id;
    offset += BStart[bucket_id]; // BStart gives back wrong values, although bucket_id is correct
    Shuffled[offset].x = key;
    Shuffled[offset].y = gid;
}
For some reason, reading BStart inside the kernel returns many zeros (it shouldn't), although the prefix sum over BCount and the bucket_id values are correct (I exported them to the CPU, wrote them to a file, and checked).
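For reference, the intended result of this step can be sketched on the CPU in plain C. This is only a minimal sketch and not the attached kernel: the function names exclusive_scan and scatter_keys, their parameters, and the flat arrays are hypothetical, and it assumes that BStart is meant to be the exclusive prefix sum of BCount (the starting index of each bucket) and that offset holds each key's position inside its bucket.

```c
#include <assert.h>
#include <stddef.h>

/* Exclusive prefix sum: start[i] = count[0] + ... + count[i-1].
   Assumption: this is what prefixsum(BCount, BStart, totalbucks) should produce. */
static void exclusive_scan(const unsigned *count, unsigned *start, size_t n)
{
    unsigned running = 0;
    for (size_t i = 0; i < n; ++i) {
        start[i] = running;
        running += count[i];
    }
}

/* Hypothetical sequential stand-in for the scatter step.
   local_offset[gid] plays the role of offsets[gid].x (position inside the bucket),
   bucket_id[gid] plays the role of offsets[gid].y. */
static void scatter_keys(const unsigned *keys,
                         const unsigned *bucket_id,
                         const unsigned *local_offset,
                         const unsigned *start, size_t nbuckets,
                         unsigned *shuffled_key, unsigned *shuffled_idx,
                         size_t n)
{
    for (size_t gid = 0; gid < n; ++gid) {
        assert(bucket_id[gid] < nbuckets);
        /* Global position = start of the bucket + offset inside the bucket. */
        unsigned pos = start[bucket_id[gid]] + local_offset[gid];
        assert(pos < n);
        shuffled_key[pos] = keys[gid];
        shuffled_idx[pos] = (unsigned)gid;
    }
}
```

Running both functions on the exported BCount, bucket_id and offset arrays gives values to compare against the BStart and Shuffled buffers read back from the GPU.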
Attached is the code for both host and device.
I'm working on Xubuntu 12.04 64-bit (kernel 3.2.0-48-generic)
GPU: AMD Radeon HD 7700 Series GHz Edition
Driver Version: Catalyst 13.4
I've also tried on Windows 8 64-bit with the same GPU and the latest drivers (downloaded about a week ago) and encountered the same problem.
Thanks in advance,
Andrew
Edit: Forgot to attach the code!