cancel
Showing results for 
Search instead for 
Did you mean: 

Archives Discussions

patricklin
Journeyman III

OpenCL 2.0 - SVMBinaryTreeSearch

Hi,

I read the article "OpenCL™ 2.0 — Shared Virtual Memory"

http://developer.amd.com/community/blog/2014/10/24/opencl-2-shared-virtual-memory/

And I want to reproduce the experiments in the article on Linux Ubuntu 14.04

However, AMD APP SDK v3.0 only provides OpenCL 2.0 SVMBinaryTreeSearch

Could anyone guide me to get CPU and GPU (OpenCL 1.2) version of (SVM)BinaryTreeSearch?

Thanks in advance!

0 Likes
1 Solution
dipak
Big Boss

Hi,

I'll try to show you how to implement your own OpenCL 1.x version of BST from the OpenCL 2.0 sample.

As you know, without the SVM support, you can not pass the pointer based tree structure from host to device. As a result, you've to perform few additional steps as below:

1) Represent the pointer based tree as flat array/buffer and use some kind of indexing to mark the parent-child relation.

2) Copy that array from host to device

3) Perform key searching using that indexing information and generate the result (as list of indices)

4) Copy back the result from device to host

5) Using the same indexing information, map the result to the actual nodes in BST

I'll not discuss about the 2nd and 4th steps as they are quite obvious. I'll give you an example about the 1st and 3rd steps.

Say, following are the two basic data structures:

typedef struct nodeStruct{

  int value;

  struct nodeStruct* left;

  struct nodeStruct* right;

} node;

typedef struct ocl_bin_tree{

int value;

int left, right;    

} ocl_node;

Where, "node" represents each node of the native host BST and "ocl_node" is the target index-based structure. Also, assume that there is a pointer that points to the root of the BST.

So, you may use following function to represent the BST as array:

[Note: there are many other ways to do this. Its just an example.]

void convert_tree_to_array(node *root, int num_nodes, ocl_node *ocl_tree, int *root_id, node **tree_queue)

{

  node *tmp;

  for(int i = 0; i < num_nodes; i++)  {

  ocl_tree.left = ocl_tree.right = -1;

  }

  int front = 0, rear = 0;

  tree_queue[rear] = root;

  ocl_tree[rear].value = root->value;

  rear++;

  *root_id = 0;

while (front != rear) {

tmp = tree_queue[front];

if (!tmp)  break;

    

if (tmp->left) {

tree_queue[rear] = tmp->left;

ocl_tree[rear].value = tmp->left->value;

ocl_tree[front].left = rear;

rear++;

  }

  if (tmp->right) {

  tree_queue[rear] = tmp->right;

  ocl_tree[rear].value = tmp->right->value;

  ocl_tree[front].right = rear;

  rear++;

  }

  front++;

  }

}

Now, say, you have already transferred the data such as the BST array, list of keys to be searched to device. Next step will be running a kernel that will perform the searching. Let assume that each work-item will search a key in that index-based BST array. For that purpose, following kernel may be used.

Note that, the actual searching procedure depends on how the BST array is represented.

__kernel void ocl_search(  __global ocl_node *tree,  int root_id,  __global int *search_keys,  int num_search_keys,  __global int *found_nodes_id)

{

  int tmp_node_id;

  int gid = get_global_id(0);

  int i, key;

  key = search_keys[gid];

  tmp_node_id = root_id;

  while (1) {

  if ((tmp_node_id == -1) || (tree[tmp_node_id].value == key)) {  break; }

  tmp_node_id = (key < tree[tmp_node_id].value) ? tree[tmp_node_id].left : tree[tmp_node_id].right;

  }

found_nodes_id = tmp_node_id;

}

Now, for the final step, "found_nodes_id" (where >=0) and "tree_queue" (see last argument of convert_tree_to_array(...)) can be used to map the actual nodes of the native BST tree.

Hope, I'm able to give you a basic idea. As I said earlier, there may be other better ways of doing this. Its just an example.

[Sorry, I tried, but couldn't make it any short.]

Regards,

View solution in original post

0 Likes
2 Replies
jtrudeau
Staff

Welcome! You are white listed, and I have moved this into the OpenCL forum.

0 Likes
dipak
Big Boss

Hi,

I'll try to show you how to implement your own OpenCL 1.x version of BST from the OpenCL 2.0 sample.

As you know, without the SVM support, you can not pass the pointer based tree structure from host to device. As a result, you've to perform few additional steps as below:

1) Represent the pointer based tree as flat array/buffer and use some kind of indexing to mark the parent-child relation.

2) Copy that array from host to device

3) Perform key searching using that indexing information and generate the result (as list of indices)

4) Copy back the result from device to host

5) Using the same indexing information, map the result to the actual nodes in BST

I'll not discuss about the 2nd and 4th steps as they are quite obvious. I'll give you an example about the 1st and 3rd steps.

Say, following are the two basic data structures:

typedef struct nodeStruct{

  int value;

  struct nodeStruct* left;

  struct nodeStruct* right;

} node;

typedef struct ocl_bin_tree{

int value;

int left, right;    

} ocl_node;

Where, "node" represents each node of the native host BST and "ocl_node" is the target index-based structure. Also, assume that there is a pointer that points to the root of the BST.

So, you may use following function to represent the BST as array:

[Note: there are many other ways to do this. Its just an example.]

void convert_tree_to_array(node *root, int num_nodes, ocl_node *ocl_tree, int *root_id, node **tree_queue)

{

  node *tmp;

  for(int i = 0; i < num_nodes; i++)  {

  ocl_tree.left = ocl_tree.right = -1;

  }

  int front = 0, rear = 0;

  tree_queue[rear] = root;

  ocl_tree[rear].value = root->value;

  rear++;

  *root_id = 0;

while (front != rear) {

tmp = tree_queue[front];

if (!tmp)  break;

    

if (tmp->left) {

tree_queue[rear] = tmp->left;

ocl_tree[rear].value = tmp->left->value;

ocl_tree[front].left = rear;

rear++;

  }

  if (tmp->right) {

  tree_queue[rear] = tmp->right;

  ocl_tree[rear].value = tmp->right->value;

  ocl_tree[front].right = rear;

  rear++;

  }

  front++;

  }

}

Now, say, you have already transferred the data such as the BST array, list of keys to be searched to device. Next step will be running a kernel that will perform the searching. Let assume that each work-item will search a key in that index-based BST array. For that purpose, following kernel may be used.

Note that, the actual searching procedure depends on how the BST array is represented.

__kernel void ocl_search(  __global ocl_node *tree,  int root_id,  __global int *search_keys,  int num_search_keys,  __global int *found_nodes_id)

{

  int tmp_node_id;

  int gid = get_global_id(0);

  int i, key;

  key = search_keys[gid];

  tmp_node_id = root_id;

  while (1) {

  if ((tmp_node_id == -1) || (tree[tmp_node_id].value == key)) {  break; }

  tmp_node_id = (key < tree[tmp_node_id].value) ? tree[tmp_node_id].left : tree[tmp_node_id].right;

  }

found_nodes_id = tmp_node_id;

}

Now, for the final step, "found_nodes_id" (where >=0) and "tree_queue" (see last argument of convert_tree_to_array(...)) can be used to map the actual nodes of the native BST tree.

Hope, I'm able to give you a basic idea. As I said earlier, there may be other better ways of doing this. Its just an example.

[Sorry, I tried, but couldn't make it any short.]

Regards,

0 Likes