cancel
Showing results for 
Search instead for 
Did you mean: 

Archives Discussions

Implementation of Disjoint Sets and Kruskal's Algorithm(and other data structures) in OpenCL

I want to implement Disjoint set Data Structures and Kruskal's algorithm in OpenCL. I have implemented some codes in OpenCL and aslo implemented Kruskal's Algorithm in C, but don't know how to get started with programming the Data Structures in OpenCL. To be more specific, I don't know how dynamic memory allocation is done in the host code of OpenCL and then how these variables are passed in the kernel. I know that dynamic memory allocation can't be done in OpenCL kernel, so then how are data structures programmed in OpenCL kernels.  Djkstra's algorithm given in the book by Aftab Munshi is hard to understand. Can anyone suggest another source...?

0 Likes
8 Replies
himanshu_gautam
Grandmaster

OpenCL does not understand complex data-structures. Basic primitive and vector-types are supported though (like int, float, unsigned int, int2, uint2, int4, float2, float4 etc...)

As far as memory, cl_mem object is what a kernel understands. It is seen as a pointer by the OpenCL kernel to an area in memory,

cl_image too is more or less the same except that you can do some sampling on the object and other image related stuff.

OpenCL does not enforce what is stored in the memory area (or) how it is stored. You can store structures (or) whatever you want inside it. Your kernels can interpret it accordingly.

If you could post one specific use-case, I can help you out. Thanks,

Thanks Himanshu for your answer.

     I am actually doing Graph Based Image Segmentation using Kruskal's Algorithm in OpenCL. I  have written some filtering codes in OpenCL, but I am very naive to OpenCL. I still haven't understood lot many things in the host code, some parts like

  1) "definition of "clCreateProgram" [in the imagefilter code from the OpenCL Book Samples on Google Code]

  2)  the meaning of "cl_mem" [what exactly is stored in the variables or objects of this datatype]

Every time I write a code I just copy the host code and make changes in the kernel.   I am reading OpenCL Programming Guide by Aftab Munshi but it has not proven that useful for me so that I can start writing programs in OpenCL. Please suggest me good stuff from where I can read and understand the API of OpenCL fully and start implementing the kruskal's algorithm in it. If I get a sample code on any data structure like linked list or disjoint set to see it would prove useful.

   I also have a doubt in the function read_imageui or read_imagei. The function returns value ranging from [0...1] or [-1....1], but the pixel intensity ranges from [0...255]. How does this function return value ranging from [0...1] and everything works fine in my filtering code.

Please tell me if I should mention anything more so that you would give me the best solution.

0 Likes

To learn OpenCL, I would suggest

1. Download the OpenCL 1.2 spec and give it a read. It is exhaustive. But it will at least explain what a context, comand queue, program, kernel, events and others.

2. Download AMD APP SDK 2.8 and install it - http://developer.amd.com/tools/heterogeneous-computing/amd-accelerated-parallel-processing-app-sdk/

AMD APP SDK 2.8 will install the CPU run-time and so, to begin with, you can work with a CPU Target.

If you have a GPU in your machine, install "fglrx" driver from http://support.amd.com 

Note: maintain the order - APP SDK 2.8 first, Graphics driver next (if u have a supporting GPU)

Please go through the APP SDK Samples - You will have a fair idea of what OpenCL is all about.

Also, you can always post your questions here - if you have doubt. We will try our best to help you out,

0 Likes

I have an Nvidia Gpu inside my laptop, and I have installed its SDK. As I've told I have the codes running on my sysytem, why do you suggest to download  AMD APP SDK 2.8, I mean is there anything different on it and will it work for my cpu and gpu? Is the CPU runtime different from what I have installed..? 

0 Likes

>>  also have a doubt in the function read_imageui or read_imagei.

>> The function returns value ranging from [0...1] or [-1....1],

>> but the pixel intensity ranges from [0...255].

>> How does this function return value ranging from [0...1] and everything works fine in my filtering code.

The function also inputs a "sampler_t" argument. Check how the sampler is created first.
Read the OpenCL spec. That will help you understand this better

Can such a structure be created in OpenCL for representing edge weights of a graph:

   typedef  struct edges

     {

           int src; // source vertex

         int dest; // destination vertex

          int weight; // cost of the edge

       } 

Certainly, the API would be different, but is it possible to implement such a structure in OpenCL, or is the only solution to this problem- using vector datatypes like int4 or float4..?

Also how do we do "struct edge* E= (struct edge*)malloc(sizeof(edge))" in OpenCL...??

0 Likes

Hi,

Say you want to allocate an array of structure.

Just call "clCreateBuffer()" with "N * sizeof(struct edges)" and so you got a "cl_mem" object.

Pass it as argument to your kernel.

In your kernel, you can received this as a kernel argument "struct edges *p"

Thats all.

0 Likes

That said - GPU computing favors "structure of arrays" instead of "array of structures".

Say, Every successive workitem is accesing the "src" field of successive entries in the structure then the memory accesses of workitems will have a stride and will not be continuous. This will lead to reduced memory bandwidth utilization.

Instead, if all fields of the structure are packed together as an array - so, you will have a src[], dst[], weight[] - then memory bandwidth utlization would be good.

So, you will just have 1 structure that says:

struct soa {

__global int *src;

__global int *dst;

__global int *weight;

}.;

However, since you are new, I would advice you to first get things working. later, you can try all this as part of optimization.

0 Likes