

geekmaster
Journeyman III

Brook question

Using much memory per thread

I want to create an N-queens solver using ATI cards and Brook+. I am thinking of supplying the device with a stream that contains the starting conditions for each thread. But here is the problem: what if I use large board sizes like 50,000 x 50,000? Then each thread needs several KB of memory. Will that slow down the application (due to memory bottlenecks)? Or will each thread continue normally, since no data is shared among the threads?
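
To make the idea of "one starting condition per thread" concrete, here is a minimal CPU-side sketch in plain C. It assumes the starting condition is simply the column of the queen in row 0, and that each thread owns a private array of N column indices. The function names (`solve_from_start`, `count_all`) are hypothetical, and the recursion is only for brevity: a real Brook+ or OpenCL kernel would need an iterative form, and counting every solution is only practical for small boards.

```c
#include <stdlib.h>

/* Returns 1 if a queen at (row, col) is not attacked by the queens
   already placed in rows 0..row-1 (cols[r] is the column used in row r). */
static int is_safe(const int *cols, int row, int col)
{
    for (int r = 0; r < row; ++r) {
        int d = cols[r] - col;
        if (d == 0 || d == row - r || d == r - row)
            return 0;
    }
    return 1;
}

/* Counts the ways to complete a board whose first `row` rows are placed. */
static long count_from(int *cols, int row, int n)
{
    if (row == n)
        return 1;
    long total = 0;
    for (int c = 0; c < n; ++c) {
        if (is_safe(cols, row, c)) {
            cols[row] = c;
            total += count_from(cols, row + 1, n);
        }
    }
    return total;
}

/* Work of one hypothetical "thread": the starting condition is the
   column of the queen in row 0. The scratch array is private to the
   call, so nothing is shared between threads. Returns -1 on allocation
   failure. */
long solve_from_start(int n, int start_col)
{
    int *cols = malloc((size_t)n * sizeof *cols);
    if (cols == NULL)
        return -1;
    cols[0] = start_col;
    long count = count_from(cols, 1, n);
    free(cols);
    return count;
}

/* Host-side stand-in for launching one thread per starting condition. */
long count_all(int n)
{
    long total = 0;
    for (int s = 0; s < n; ++s)
        total += solve_from_start(n, s);
    return total;
}
```

Calling `count_all(8)` sums the results of eight independent starting conditions. Because each call touches only its own `cols` array, the per-thread memory cost in this sketch is N integers.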

0 Likes
6 Replies
geekmaster
Journeyman III

I guess I will have to try this and see for myself...

0 Likes

geekmaster,

50,000 x 50,000 x 1 KB = 2.5 x 10^12 bytes = 2.5 TB, while GPUs have only ~1 GB of memory.

No way! Better to do it in chunks.

The performance of the program simply depends on the ALU:Fetch ratio.

It all depends on the algorithm you are using.

For example, if you are ALU-bound, fetching won't affect performance.

And if you are memory-bound, you aren't going to get any performance benefit by adding useless ALU operations (provided all else is equal).
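
A rough way to picture the ALU:Fetch argument is a toy model in which ALU work and memory fetches overlap perfectly, so the slower of the two sets the kernel time. This is an assumption made for illustration, not a description of how any particular GPU schedules work; the function name and the rate parameters are hypothetical.

```c
/* Toy model (assumption): ALU work and fetches overlap perfectly, so
   the kernel time is whichever of the two takes longer.
   alu_rate is ALU operations per unit time, fetch_rate is fetches per
   unit time; both are made-up parameters for illustration. */
double kernel_time(double alu_ops, double fetches,
                   double alu_rate, double fetch_rate)
{
    double t_alu = alu_ops / alu_rate;     /* time spent on arithmetic */
    double t_fetch = fetches / fetch_rate; /* time spent on memory fetches */
    return t_alu > t_fetch ? t_alu : t_fetch;
}
```

Under this model, an ALU-bound kernel (`t_alu` larger) sees no change when the fetch count grows a little, and a memory-bound kernel (`t_fetch` larger) sees no change when ALU operations are added or removed.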

 

0 Likes

Well, the number of threads would be around 128-256, not 50,000, so there is enough memory. But I just remembered that Brook+ does not support local arrays, so there is no way to write my program in Brook. Maybe in OpenCL. I really need an upgrade; my 3870 is showing its age.
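
The memory estimate behind "there is enough memory" can be checked with a small helper. This sketch assumes each thread keeps one 4-byte column index per board row; that layout is an assumption, not something stated in the thread, and the function name is hypothetical.

```c
#include <stddef.h>

/* Total scratch memory if every thread keeps one entry per board row.
   bytes_per_entry is an assumption (for example, 4 for a 32-bit index). */
size_t scratch_bytes(size_t threads, size_t board_size,
                     size_t bytes_per_entry)
{
    return threads * board_size * bytes_per_entry;
}
```

With 256 threads, a board size of 50,000 and 4-byte entries, `scratch_bytes(256, 50000, 4)` gives 51,200,000 bytes, about 51 MB. That is roughly 200 KB per thread, and far below the ~1 GB figure mentioned earlier in the thread.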

0 Likes

Does OpenCL have local array support?

0 Likes

Hi geekmaster,

AFAIK, Brook+ 1.4 does support local arrays, but there are some restrictions (regarding the writing process).

Still, I support your decision to try OpenCL. Yes, local arrays are present in OpenCL, and they are quite flexible to use.

Refer to the OpenCL 1.1 specification for details.

For more info about the __local qualifier in OpenCL: http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/local.html

I hope it helps.

All the best for your Queens algorithm.

Himanshu

0 Likes

Thank you for the great support!

0 Likes