
ruysch
Journeyman III

cheapest ATI gpu for OpenCL

What is the cheapest ATI GPU that has stable OpenCL support?

 

0 Likes
17 Replies
Fr4nz
Journeyman III

Originally posted by: ruysch What is the cheapest ATI GPU that has stable OpenCL support?

 

 

The 5xxx family has complete OpenCL support, so choose the cheapest card in that family.

0 Likes
hazeman
Adept II

I think the answer depends on what you mean by stable.

If stable means working but slowly, then the 4650 is probably the best choice (or even use CPU OpenCL).

But if by stable you mean being able to use more than 30% of the GPU's power, then ATI is not an option here.

Results for various OpenCL apps show that the 58xx family usually runs at about the speed of NVIDIA 8800 GT cards (here's a comment from the Folding@home developers: http://folding.typepad.com/news/2010/01/some-more-details-on-the-gpu3-core-regarding-opencl.html ).

Besides, 58xx cards are not good value for the money right now; they are much too expensive.

So I think if you need decent OpenCL performance per dollar, you need to buy an NVIDIA card.

 

0 Likes

I mean stable as in nicely working drivers, at least under Windows.

The ATI 5670 seems perfect for my experiments.

0 Likes

Originally posted by: ruysch I mean stable as in nicely working drivers, at least under Windows.

The ATI 5670 seems perfect for my experiments.

 

Then why not a 4850 or 4770? It will have a comparable price and will usually be faster than the 5670 in OpenCL (and much faster for gaming; DX11 speed on the 5670 is too slow to be of any use). The drivers are just as stable. You also get extra support for doubles.

I don't see why you would pay extra for less.

 

 

 

0 Likes

And is their OpenCL support comparable to the 5670's? I want a fairly cheap AMD/ATI card that represents close to the lowest-end platform I will be targeting.

0 Likes

Originally posted by: ruysch And is their OpenCL support comparable to the 5670's? I want a fairly cheap AMD/ATI card that represents close to the lowest-end platform I will be targeting.

 

I would say it's almost exactly as bad. The 56xx has real __local in OpenCL, but it's simply a much slower card and it's missing doubles. So I believe that OpenCL performance should be comparable.

But if OpenCL is really important to you, then why do you want to buy an ATI card? OpenCL for ATI cards is really bad at the moment, and judging by their development speed it won't improve any time soon (and by then 5xxx cards will be much cheaper).

Also, for now it's unknown why 5xxx cards are so slow in OpenCL. From what I see, it could even be hardware problems or limitations with LDS or memory access. So I really don't think the 5xxx series is a good choice for someone on a limited budget.

 

0 Likes

Hmm... I am not sure why you think OpenCL for ATI cards is really bad at the moment. According to SiSoftware Sandra 2010, the ATI Radeon 5870 (a single-GPU card) is 2.7 times faster than the NVIDIA GTX 295 (a dual-GPU card) for OpenCL.

There are also discussions in other forums that seem to confirm this fact for other apps.

Having said that, we are still heavily improving our OpenCL implementations.

0 Likes

I haven't really toyed around with ATI since back when the ATI Radeon 9500 was a hot potato. Developing on a top-notch card really isn't a priority of mine. I want good baseline performance even on low-end cards, and I think the ATI 5670 looks nice: it's affordable, and performing well on a budget card means excellent performance on the top-notch cards. Anyway, I ended up ordering the ATI 5670, and at $100 I think I'll be able to deliver quite nice bang for the buck, so to speak.

 

 

 

0 Likes

If performance isn't a must-have priority and you don't plan on ever using double-precision floats, then the 5700 series is damn decent. But if you ever want DP-FP, ATI wants you to pony up for the 5800. It might be best to wait and see what the supposed 5830 brings to the table. The 4000 series has DP-FP, but its issues with local memory might be a problem for some.
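For anyone who wants to check these two properties on whatever card they end up with: OpenCL reports them through `clGetDeviceInfo`, as `CL_DEVICE_EXTENSIONS` (a space-separated string, where `cl_khr_fp64` or AMD's `cl_amd_fp64` indicates double support) and `CL_DEVICE_LOCAL_MEM_TYPE` (`CL_LOCAL` for dedicated local memory, `CL_GLOBAL` when __local is emulated in global memory). Below is a minimal sketch of interpreting those two values. The helper name `describe_device` and the sample strings are made up for illustration and are not taken from any real card.

```python
def describe_device(extensions: str, local_mem_type: str) -> dict:
    """Summarize double-precision and local-memory support.

    `extensions` is the space-separated string a device reports for
    CL_DEVICE_EXTENSIONS; `local_mem_type` is the name of the value it
    reports for CL_DEVICE_LOCAL_MEM_TYPE ("CL_LOCAL" or "CL_GLOBAL").
    Both are passed in as plain strings here, so no OpenCL runtime is needed.
    """
    ext = set(extensions.split())
    return {
        # Either the Khronos extension or the AMD vendor extension counts.
        "fp64": "cl_khr_fp64" in ext or "cl_amd_fp64" in ext,
        # CL_LOCAL means __local is backed by dedicated on-chip memory.
        "dedicated_local_mem": local_mem_type == "CL_LOCAL",
    }


# Hypothetical device strings, for illustration only.
card_a = describe_device("cl_khr_global_int32_base_atomics cl_amd_fp64", "CL_GLOBAL")
card_b = describe_device("cl_khr_global_int32_base_atomics", "CL_LOCAL")
print(card_a)
print(card_b)
```

In a real program the two inputs would come from the OpenCL runtime rather than from hard-coded strings.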

0 Likes

Originally posted by: hazeman I think the answer depends on what you mean by stable.

If stable means working but slowly, then the 4650 is probably the best choice (or even use CPU OpenCL).

But if by stable you mean being able to use more than 30% of the GPU's power, then ATI is not an option here.

Results for various OpenCL apps show that the 58xx family usually runs at about the speed of NVIDIA 8800 GT cards (here's a comment from the Folding@home developers: http://folding.typepad.com/news/2010/01/some-more-details-on-the-gpu3-core-regarding-opencl.html ).

Besides, 58xx cards are not good value for the money right now; they are much too expensive.

So I think if you need decent OpenCL performance per dollar, you need to buy an NVIDIA card.

 

 

Wow, amazing how much nonsense can get posted.

Nowhere in your link does it talk about the 58xx being compared to the 8800gt in OpenCL. Here is their update (the day after) to the link you provided: http://folding.typepad.com/news/2010/01/important-update-on-my-post-on-opencl.html

I'm also not sure why they say that AMD's OpenCL implementation is not "fully-functioning" unless they are, of course, referring to double-precision support.

On another note, I wouldn't necessarily agree with Sandra2010 benchmarks either...

...the truth lies somewhere in between. On the few apps I have looked at (non-OpenCL) the 5870 destroys the 280GTX. We don't have a 295GTX in our lab so I can't confirm anything regarding that.

People far too often use Folding@Home as a benchmark when they shouldn't (much like people try to use Crysis as a graphics benchmark).

0 Likes

Originally posted by: ryta1203

 On the few apps I have looked at (non-OpenCL) the 5870 destroys the 280GTX. We don't have a 295GTX in our lab so I can't confirm anything regarding that.

 

Nobody questions the general performance of 58xx cards, but we are talking about OpenCL apps here. And Cypress has its share of problems there (or maybe it would be more accurate to say the OpenCL compiler has).

Here is a link to a post with an OpenCL program that runs slower on 5xxx than on an 8800 GT ( http://forum.beyond3d.com/showthread.php?t=55291&page=3 ). There are more posts showing strange __local (LDS) memory performance.

So given all those issues with OpenCL, I really think the best decision is to delay buying 5xxx cards, especially since they are overpriced now (the 4xxx family gives a much better performance/price ratio).

 

 

 

0 Likes

Hazeman,

  Again, I wouldn't use some guy's DX code, ported to CUDA and then ported to ATI, as a benchmark for comparing GPU performance; that seems rather silly.

  I'd love to comment on his kernels, but he doesn't post them, so I can't.

  It's obvious (or should be to anyone) from reading the thread however that the guy has very little knowledge of the ATI hardware or how to program for it.

0 Likes

Originally posted by: ryta1203 Hazeman,

 

  It's obvious (or should be to anyone) from reading the thread however that the guy has very little knowledge of the ATI hardware or how to program for it.

 

You are probably right. On the other hand, I haven't seen reports showing OpenCL numbers that match Cypress's performance (the best I've seen is Cypress slightly faster than a 280).

I still have my doubts about LDS performance. You can find posts here where people struggle to get it optimized (and fail).

Also, the big problem is that ATI doesn't give any info about cache size, memory access patterns, LDS access patterns (to avoid bank conflicts), or memory read/write coalescing (I had to write CAL kernels to test those 😕). So unfortunately, writing good code for ATI is guesswork.

 

0 Likes

Hazeman,

  Yes, I agree that AMD needs to come out with a solid model and not just some "optimization/performance" guideline text (like they have now). Developers are going to need more information if they are to really get close to the potential of these cards for GPGPU.

  Also, it seems that guy was using the same OpenCL code and just recompiling it. Again, I'm not familiar with the code itself, but not all GPUs are made the same, and therefore they should not all be coded for the same way.

0 Likes

Originally posted by: hazeman I still have my doubts about LDS performance. You can find posts here where people struggle to get it optimized (and fail).

Also, the big problem is that ATI doesn't give any info about cache size, memory access patterns, LDS access patterns (to avoid bank conflicts), or memory read/write coalescing (I had to write CAL kernels to test those 😕). So unfortunately, writing good code for ATI is guesswork.



Funny, most of it is already public... OK, I agree that the documentation could be better (since AMD's CPU docs are very good) and that all this info should be in a single place. From memory: L1 cache size is 8 KB per SIMD, L2 is 256 KB, and LDS is divided into 32 banks, each 32 bits wide (and it is actually able to perform 32 loads per cycle per SIMD). Have fun optimizing it now.
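To make the bank layout above concrete, here is a rough sketch that counts how many accesses collide on the same bank for a strided access pattern. It uses only the figures quoted in this post (32 banks, 32 bits wide). The function names are hypothetical, and the rule that accesses to the exact same word do not conflict is an assumption, not something confirmed for this hardware.

```python
from collections import defaultdict

NUM_BANKS = 32        # figure quoted in the post above
BANK_WIDTH_BYTES = 4  # 32 bits wide


def bank_of(byte_address: int) -> int:
    """Bank that a byte address maps to, assuming word-interleaved banks."""
    return (byte_address // BANK_WIDTH_BYTES) % NUM_BANKS


def conflict_degree(byte_addresses) -> int:
    """Largest number of distinct words that land in a single bank.

    1 means conflict-free. Accesses to the same word are counted once
    (ASSUMPTION: the hardware can broadcast a single word).
    """
    words_per_bank = defaultdict(set)
    for addr in byte_addresses:
        words_per_bank[bank_of(addr)].add(addr // BANK_WIDTH_BYTES)
    return max(len(words) for words in words_per_bank.values())


def strided_addresses(stride_words: int, num_items: int = 32):
    """Byte addresses touched when work-item i reads word i * stride_words."""
    return [i * stride_words * BANK_WIDTH_BYTES for i in range(num_items)]


for stride in (1, 2, 32, 33):
    print(stride, conflict_degree(strided_addresses(stride)))
```

Under these assumptions a stride of 32 words sends every access to bank 0, while a stride of 33 (the usual padding trick) spreads the accesses across all banks again.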

 

0 Likes

Originally posted by: eduardoschardong

Funny, most of it is already public... OK, I agree that the documentation could be better (since AMD's CPU docs are very good) and that all this info should be in a single place. From memory: L1 cache size is 8 KB per SIMD, L2 is 256 KB, and LDS is divided into 32 banks, each 32 bits wide (and it is actually able to perform 32 loads per cycle per SIMD). Have fun optimizing it now.

 

 



It's so public that we have to search in strange places for it (I found the most accurate info on Beyond3D). And by the way, you are not quite correct about the cache size: for 5xxx it's 8 KB per SIMD, but for 4xxx it's 16 KB per SIMD.

And if you want me to have fun optimizing, then why didn't you give me any info about read/write coalescing? Or maybe you think it's not important (lol). Quite a bit more is needed to write efficient kernels.

And ATI's approach to documentation is simply unprofessional.
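To show why the coalescing rules matter so much, here is a toy model that counts how many aligned memory segments a group of work-items touches for a given stride. Every parameter in it (16 work-items, 4-byte elements, 128-byte segments) is a placeholder assumption, precisely because the real values for these cards are the undocumented part. The function name `segments_touched` is hypothetical.

```python
def segments_touched(stride_elems: int,
                     num_items: int = 16,
                     elem_bytes: int = 4,
                     segment_bytes: int = 128) -> int:
    """Count the distinct aligned segments read by a group of work-items.

    Work-item i reads the element at index i * stride_elems. Fewer segments
    means the reads can be served by fewer memory transactions. All default
    sizes are ASSUMED placeholders, not documented hardware values.
    """
    addresses = [i * stride_elems * elem_bytes for i in range(num_items)]
    return len({addr // segment_bytes for addr in addresses})


for stride in (1, 2, 4, 32):
    print(stride, segments_touched(stride))
```

With these placeholder numbers, unit stride fits in a single segment, while a stride of 32 elements puts each work-item in its own segment. The correct numbers to plug in are exactly what the documentation would need to state.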

 

0 Likes

Hazeman/Ryta/Eduardo,
If you can come up with a number of questions you would like answered in our documentation, or things that need to be clarified, please post them and I'll look into getting them rectified. We are always looking to improve our documentation, and feedback on what is lacking will help us focus on the parts of the documentation that need work.

Edit:
We also added these slides recently. They are a little old, but might help in understanding the hardware.
ATI Stream Computing: ATI Intermediate Language (IL) | PPT | 05/30/2008
ATI Stream Computing: ATI Radeon™ HD 2900 Series Instruction Set Architecture | PPT | 05/30/2008
ATI Stream Computing: ATI Radeon™ HD 3800/4800 Series GPU Hardware Overview | PPT | 05/30/2008
ATI Stream Computing: ATI Radeon™ HD 2900 Series GPU Hardware Overview | PPT | 05/30/2008

These can be found at:
http://developer.amd.com/gpu/A...ages/Publications.aspx
0 Likes