
Russian
Journeyman III

What is ATI Stream Performance?

How many GFLOPS can I get?

Hi All!

 

I would like to ask people who have tried ATI Stream (with or without CAL) what performance they got (in GFLOPS, please).

I have tried CUDA (maybe it was stupid of me not to try ATI first), and the most I got (the real maximum) was 25% of the performance declared in the specs.

Now I am trying to understand whether it is possible to reach the 1000 GFLOPS declared for Rxx, or whether that figure is a fake (sorry... :-)) like CUDA's.

I have read that some people really got 97%. I believe it.
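For reference, a declared peak figure is usually derived as stream processors × clock × FLOPs per cycle, and "percent of declared performance" is the measured rate divided by that peak. The sketch below shows the arithmetic only. The 800-processor, 625 MHz configuration and the 250 GFLOPS measurement are assumed example numbers, chosen to land on 1000 GFLOPS and 25%; they are not measurements from this thread.

```python
def peak_gflops(stream_processors, clock_mhz, flops_per_cycle=2):
    """Theoretical single-precision peak in GFLOPS.

    flops_per_cycle=2 assumes each processor issues one multiply-add
    per clock and that a multiply-add is counted as 2 FLOPs.
    """
    return stream_processors * clock_mhz * flops_per_cycle / 1000.0


def efficiency(measured_gflops, peak):
    """Fraction of the declared peak that was actually reached."""
    return measured_gflops / peak


# Assumed example configuration (hypothetical numbers).
peak = peak_gflops(800, 625)
print(peak)                      # 1000.0
print(efficiency(250.0, peak))   # 0.25
```

A kernel reaches the peak only if every processor issues a multiply-add on every cycle, which is why figures such as 25% are common for real code.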

I need to know:

Speed of one shader processor,

Speed with access to the L1 cache,

Speed with access to the L2 cache,

Speed with access to global memory.
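One common way to relate these four numbers is a roofline-style estimate: the attainable rate is the smaller of the compute peak and the bandwidth of the memory level being used multiplied by the kernel's FLOPs per byte. The sketch below is only that estimate. The bandwidth values are hypothetical placeholders, not specifications of any ATI or NVIDIA part.

```python
def attainable_gflops(peak_gflops, bandwidth_gb_s, flops_per_byte):
    """Roofline-style bound: limited by compute or by data delivery."""
    return min(peak_gflops, bandwidth_gb_s * flops_per_byte)


PEAK = 1000.0  # declared peak in GFLOPS (figure quoted in the question)

# Hypothetical bandwidths in GB/s, for illustration only.
BANDWIDTH_GB_S = {
    "L1 cache": 480.0,
    "L2 cache": 384.0,
    "global memory": 64.0,
}

# A kernel doing 0.5 FLOPs per byte moved is bound by whichever
# memory level feeds it, not by the shader processors.
for level, bandwidth in BANDWIDTH_GB_S.items():
    print(level, attainable_gflops(PEAK, bandwidth, flops_per_byte=0.5))
```

Under these assumed numbers, the same kernel would be bounded at very different rates depending on where its data lives, which is why the per-level speeds matter when judging a "percent of peak" claim.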

 

If it is possible to attach a simple test application for VS2005 (or 2008), that would be simply perfect.

Thanks & regards,

Dmitry

 

PS. If someone gives me an example that reaches 95% of the declared performance, I will be the biggest fan of ATI and AMD!!!

0 Likes
11 Replies
godsic
Journeyman III

[Message posted in Cyrillic; the text was lost to an encoding error. Only "FLOAT_4" and "95%" remain legible.]

0 Likes

Originally posted by: godsic [quoted Cyrillic text unreadable]

There is some problem with the fonts; I cannot read this. (Write in Latin letters, the forum does not understand Russian :-))

0 Likes

Theoretically, a shader that adds two FLOAT_4 values and then writes the result to a FLOAT_4 buffer will give you that very performance figure; it is marketing. The cache structure is not like a CPU's: the caches are smaller and organized differently. CAL is definitely better than CUDA, but it demands more from the programmer than just knowledge of C. Experience with a graphics API, DX or OpenGL, is desirable. If you have it, everything is simple and easy, although IL gets very tiresome. It immediately brings to mind my dear old Pentagon-1024.
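To make the kernel described above concrete, here is a minimal CPU-side sketch of what it computes per element and how its FLOPs and bytes could be counted. It is plain Python, not IL or CAL code; the function names are hypothetical, and it says nothing about the rate real hardware would reach.

```python
def float4_add(a, b):
    """Component-wise add of two 4-float tuples, as one work-item would do."""
    return tuple(x + y for x, y in zip(a, b))


def run_kernel(a_buf, b_buf):
    """Emulate the kernel over two input buffers of float4 values."""
    return [float4_add(a, b) for a, b in zip(a_buf, b_buf)]


FLOPS_PER_ITEM = 4          # four additions per float4
BYTES_PER_ITEM = 3 * 4 * 4  # two float4 reads + one float4 write, 4 bytes per float


def achieved_gflops(n_items, seconds):
    """Convert an item count and a measured run time into GFLOPS."""
    return n_items * FLOPS_PER_ITEM / seconds / 1e9


out = run_kernel([(1.0, 2.0, 3.0, 4.0)], [(0.5, 0.5, 0.5, 0.5)])
print(out)  # [(1.5, 2.5, 3.5, 4.5)]
```

With a timer around the real GPU dispatch, `achieved_gflops` gives the number to compare against the declared peak.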

0 Likes

If so, that is just great!!! 🙂

I have no experience specifically with DX and OpenGL, but I have a lot of experience with various DSPs, so there should be no problems.

PS. I was very surprised when I became convinced that CUDA is a fake.

PPS. Do you happen to have a simple project I could use as an example?

0 Likes

Unfortunately, I do not have any of it with me right now. I can send something this Wednesday, along with my notes on optimization.

In the meantime you can look at /doc/Stream_programming_guide.pdf. It has a HelloCAL section with a simple application and a description of what it does and how. That is what I started with. The same place also has a PDF with the IL specification: syntax and instructions.

That one is a must-read too.

As for the API, it is so-so. Over time you will write classes with simplified calls for your specific task.

0 Likes

OK!

Thanks! 🙂

0 Likes

You're welcome. Our cause is just!

0 Likes

One more question.

Is it possible to write values from kernels directly to the frame buffer? Or how can a frame buffer be used in ATI Stream? Is it possible at all?

0 Likes

You can try it with the DX interoperability exposed in ATI Stream.

0 Likes

@godsic
Have you tried substituting your own functions into the IL generated by Brook? Is that realistic at all, or is the game not worth the candle? I really do not feel like learning CAL.
0 Likes

Of course you can.

There are two aspects here:

1. Watch which names it assigns to variables (buffers, constants, etc.). Those are exactly the names you then have to use in the CAL calls, or you can fix them by hand in the IL code itself if you do not like them.

2. Brook+ -> IL may generate non-optimal code!!! AMD decided not to bother with optimizations. (GLSL and DX shaders are optimized!)

You can use SKA (Stream Kernel Analyzer; download it here in the GPU tools section). You write Brook+ in its left pane and click in the right pane to display the IL. It also analyzes the efficiency of the code (that is the main purpose of the program). Brook+ is very fond of converting types where it is not needed.
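As an illustration of the first point, a small script can list the declared inputs, outputs and constant buffers in an IL listing so they can be matched to the names used when binding memory through CAL. This is only a sketch: the IL text is written from memory in the style of the HelloCAL sample, the mapping of resource id N to the name "iN" is an assumption, and real Brook+ output may use other declarations.

```python
import re

# Assumed IL text in the style of the HelloCAL sample (not Brook+ output).
IL_SOURCE = """\
il_ps_2_0
dcl_input_position_interp(linear_noperspective) vWinCoord0.xy__
dcl_output_generic o0
dcl_cb cb0[1]
dcl_resource_id(0)_type(2d,unnorm)_fmtx(float)_fmty(float)_fmtz(float)_fmtw(float)
sample_resource(0)_sampler(0) r0, vWinCoord0.xyxx
mul o0, r0, cb0[0]
end
"""


def cal_symbol_names(il_text):
    """Return the symbol names implied by the IL declarations, in file order."""
    names = []
    for line in il_text.splitlines():
        line = line.strip()
        m = re.match(r"dcl_resource_id\((\d+)\)", line)
        if m:
            names.append("i" + m.group(1))  # assumed: resource id N -> "iN"
            continue
        m = re.match(r"dcl_output_generic\s+(o\d+)", line)
        if m:
            names.append(m.group(1))
            continue
        m = re.match(r"dcl_cb\s+(cb\d+)\[", line)
        if m:
            names.append(m.group(1))
    return names


print(cal_symbol_names(IL_SOURCE))  # ['o0', 'cb0', 'i0']
```

Running it on the IL shown by SKA before and after hand edits is a quick way to check that the names still match what the host code binds.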

0 Likes