I have an HD 5850 card and I want to start programming it in OpenCL. I'm a beginner in GPU programming and in OpenCL.
I want to start with a search in a tree structure. The problem is that all the threads access the root, many threads access the children of the root, and finally only a few threads access any given leaf. Does the card support broadcasting, and to what extent (broadcast to members of a work-group, or to all work-items that run concurrently)?
If there is no broadcast, is there a better way of doing the traversal? The tree is big and doesn't fit in local memory.
(Each thread is a cube that has a position in 3D space and a dimension, and I want to find the closest voxel that fits inside the cube by dividing the space into 8 cubes until it's small enough.)
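
To make the subdivision concrete, here is a minimal sketch in plain C of the descent each thread would perform. The names `Cube`, `Voxel`, `find_voxel`, `root_size` and `max_depth` are made up for illustration, and the sketch assumes the root cell spans [0, root_size) on each axis and that "fits inside the cube" means the voxel's edge is no longer than the cube's edge. It only does the geometric part; the real traversal would also read a tree node at each level, which is where the shared access to the root happens.

```c
#include <assert.h>

/* Hypothetical types for illustration only. */
typedef struct { float x, y, z, size; } Cube;             /* position + edge length */
typedef struct { float x, y, z, size; int depth; } Voxel; /* min corner + edge length */

/* Descend from the root cell [0, root_size)^3 toward the cube's position,
   splitting the current cell into 8 octants at each level, until the cell's
   edge is no longer than the cube's edge (or max_depth is reached). */
static Voxel find_voxel(Cube q, float root_size, int max_depth)
{
    assert(q.size > 0.0f && root_size > 0.0f);

    Voxel v = { 0.0f, 0.0f, 0.0f, root_size, 0 };
    while (v.size > q.size && v.depth < max_depth) {
        float half = v.size * 0.5f;
        /* Pick the octant that contains the cube's position. */
        if (q.x >= v.x + half) v.x += half;
        if (q.y >= v.y + half) v.y += half;
        if (q.z >= v.z + half) v.z += half;
        v.size = half;
        v.depth++;
    }
    return v;
}
```

For example, with a root of size 8 and a cube at (5, 1, 6) with edge 2, the descent takes two steps and ends at the voxel with min corner (4, 0, 6) and edge 2. In a kernel, each work-item would run this loop on its own cube, so every work-item touches the root on the first iteration and the accesses spread out as the depth grows.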