hphung
Journeyman III

A question of StreamWrite latency

Hello,

I have found a strange phenomenon while developing an ATI Stream application.

As one might expect, the latency of StreamWrite depends on the class of graphics card.

However, the strange thing is that, in my experiments, StreamWrite latency is longer on the higher-end card.

For example, using SDK 1.4, writing a grey-level HD (1920x1080) image takes 5.8 ms on a Radeon 4890, but only 4.8 ms on a Radeon 3450. (These results are long-term averages.)

Can anyone explain the reason behind this observation? Is it a driver issue, or is it due to architectural differences between the cards?
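
To put the two latencies on the same scale as a bandwidth test, they can be converted to an effective transfer rate. The snippet below is only a rough sketch: it assumes the grey-level image uses 1 byte per pixel (the post does not state the pixel format) and that the quoted latency covers the whole StreamWrite call.

```python
# Convert the reported StreamWrite latencies into an effective bandwidth.
# Assumption: 8-bit grey-level image, i.e. 1 byte per pixel (not stated in the post).
WIDTH, HEIGHT = 1920, 1080
BYTES_PER_PIXEL = 1
IMAGE_BYTES = WIDTH * HEIGHT * BYTES_PER_PIXEL  # 2,073,600 bytes


def effective_bandwidth_mb_s(latency_ms, n_bytes=IMAGE_BYTES):
    """Return the effective transfer rate in MB/s (1 MB = 10**6 bytes)."""
    seconds = latency_ms / 1000.0
    return n_bytes / seconds / 1e6


if __name__ == "__main__":
    # Latencies as reported in the post above.
    for card, latency_ms in (("Radeon 4890", 5.8), ("Radeon 3450", 4.8)):
        rate = effective_bandwidth_mb_s(latency_ms)
        print(f"{card}: {latency_ms} ms -> {rate:.1f} MB/s")
```

Under these assumptions both cards end up at a few hundred MB/s, which makes the numbers easier to compare with a raw PCIe bandwidth measurement.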

 

5 Replies
Raistmer
Adept II


I see higher latencies for HD4870 versus HD2600 (different hosts though).
gaurav_garg
Adept I


Usually, data transfer performance over PCIe depends heavily on the host configuration, especially the chipset. What results do you see with PCIeSpeedTest?

luxert
Journeyman III


I have the same problem with StreamWrite.

The 3850's memory is GDDR3, 256-bit, 1800 MHz, and

the 4890's is GDDR5, 256-bit, 3900 MHz.

But StreamWrite on the 3850 is faster than on the 4890!

Why is that?

gaurav_garg
Adept I


StreamRead/Write transfer data across PCIe, so the GPU's internal memory interface has nothing to do with this performance.
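
A quick back-of-the-envelope calculation illustrates the point. The sketch below takes the bus widths and effective memory data rates quoted earlier in this thread and compares the resulting on-board memory bandwidth with the theoretical per-direction rate of a PCIe x16 link. Which PCIe generation the host actually uses is not stated in the thread, so both 1.x and 2.0 are shown as assumptions.

```python
# Compare on-board memory bandwidth with the PCIe link that StreamRead/Write crosses.
# Bus widths and effective data rates are the figures quoted earlier in the thread.
# PCIe generation of the host is an assumption (not stated), so both are listed.

def memory_bandwidth_gb_s(bus_width_bits, effective_rate_mhz):
    """Theoretical memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * effective_rate_mhz * 1e6 / 1e9


# Theoretical per-direction x16 bandwidth after 8b/10b encoding.
PCIE_X16_GB_S = {"PCIe 1.x x16": 4.0, "PCIe 2.0 x16": 8.0}

CARDS = {"Radeon 3850": (256, 1800), "Radeon 4890": (256, 3900)}

if __name__ == "__main__":
    for name, (bits, mhz) in CARDS.items():
        mem = memory_bandwidth_gb_s(bits, mhz)
        print(f"{name}: memory {mem:.1f} GB/s")
        for link, pcie in PCIE_X16_GB_S.items():
            print(f"  vs {link}: {pcie:.1f} GB/s ({mem / pcie:.1f}x slower link)")
```

Even in the most favourable case the PCIe link is several times slower than either card's memory, so the memory interface is not the limiting factor for these transfers.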

luxert
Journeyman III


 

My system is:

CPU: Intel Core 2 Quad Q6600 (2.4 GHz)

RAM: DDR2, 2 GB

OS: Windows XP Pro SP3

SDK: Stream 1.4.0

 

I tried PCIeSpeedTest.

Results for the Radeon 3850 (256-bit GDDR3, 512 MB):

 

===> Testing device 0 <===
Device type: RV670
Max resource 2D width/height: 8192/8192
Total GPU memory size: 512 MB
Total CPU cached space size: 64 MB
Total CPU uncached space size: 512 MB
GPU engine clock: 669 MHz
GPU memory clock: 700 MHz
Number of timing loops: 100
[        16 bytes] CPU->GPU= 573.868 KB/sec, GPU->CPU= 595.604 KB/sec
[        32 bytes] CPU->GPU=   1.172 MB/sec, GPU->CPU=   1.180 MB/sec
[        64 bytes] CPU->GPU=   2.350 MB/sec, GPU->CPU=   2.768 MB/sec
[       128 bytes] CPU->GPU=   5.714 MB/sec, GPU->CPU=   5.680 MB/sec
[       256 bytes] CPU->GPU=  10.999 MB/sec, GPU->CPU=  10.797 MB/sec
[       512 bytes] CPU->GPU=  22.506 MB/sec, GPU->CPU=  22.664 MB/sec
[      1024 bytes] CPU->GPU=  45.854 MB/sec, GPU->CPU=  39.639 MB/sec
[      2048 bytes] CPU->GPU=  84.010 MB/sec, GPU->CPU=  88.393 MB/sec
[      4096 bytes] CPU->GPU= 179.108 MB/sec, GPU->CPU= 177.485 MB/sec
[      8192 bytes] CPU->GPU= 346.526 MB/sec, GPU->CPU= 357.558 MB/sec
[     16384 bytes] CPU->GPU= 656.935 MB/sec, GPU->CPU= 668.644 MB/sec
[     32768 bytes] CPU->GPU=   1.411 GB/sec, GPU->CPU=   1.405 GB/sec
[     65536 bytes] CPU->GPU=   2.302 GB/sec, GPU->CPU=   1.825 GB/sec
[    131072 bytes] CPU->GPU=   2.454 GB/sec, GPU->CPU=   1.915 GB/sec
[    262144 bytes] CPU->GPU=   2.534 GB/sec, GPU->CPU=   1.964 GB/sec
[    524288 bytes] CPU->GPU=   2.578 GB/sec, GPU->CPU=   1.986 GB/sec
[   1048576 bytes] CPU->GPU=   2.599 GB/sec, GPU->CPU=   1.998 GB/sec
[   2097152 bytes] CPU->GPU=   2.614 GB/sec, GPU->CPU=   2.005 GB/sec
[   4194304 bytes] CPU->GPU=   2.621 GB/sec, GPU->CPU=   2.008 GB/sec
[   8388608 bytes] CPU->GPU=   2.624 GB/sec, GPU->CPU=   2.011 GB/sec
[  16777216 bytes] CPU->GPU=   2.627 GB/sec, GPU->CPU=   2.011 GB/sec
[  33554432 bytes] CPU->GPU=   2.624 GB/sec, GPU->CPU=   2.021 GB/sec
[  67108864 bytes] CPU->GPU=   2.626 GB/sec, GPU->CPU=   2.022 GB/sec
[ 134217728 bytes] CPU->GPU=   2.627 GB/sec, GPU->CPU=   2.022 GB/sec
[ 268435456 bytes] CPU->GPU=   2.628 GB/sec, GPU->CPU=   2.023 GB/sec
calResAllocLocal2D() returned an error when trying to allocate 536870912 bytes!
calResAllocRemote2D() returned an error when trying to allocate 536870912 bytes
(uncached)!
Peak CPU->GPU Bandwidth =   2.628 GB/sec [data size = 268435456 bytes]
Peak GPU->CPU Bandwidth =   2.023 GB/sec [data size = 268435456 bytes]

 

Results for the Radeon 4890 (256-bit GDDR5, 1 GB):

 

===> Testing device 0 <===
Device type: RV770
Max resource 2D width/height: 8192/8192
Total GPU memory size: 1024 MB
Total CPU cached space size: 64 MB
Total CPU uncached space size: 128 MB
GPU engine clock: 900 MHz
GPU memory clock: 975 MHz
Number of timing loops: 100
[        16 bytes] CPU->GPU= 733.050 KB/sec, GPU->CPU= 560.169 KB/sec
[        32 bytes] CPU->GPU= 952.406 KB/sec, GPU->CPU= 804.848 KB/sec
[        64 bytes] CPU->GPU=   1.472 MB/sec, GPU->CPU=   1.416 MB/sec
[       128 bytes] CPU->GPU=   2.617 MB/sec, GPU->CPU=   2.312 MB/sec
[       256 bytes] CPU->GPU=  12.373 MB/sec, GPU->CPU=  11.966 MB/sec
[       512 bytes] CPU->GPU=  24.790 MB/sec, GPU->CPU=  31.049 MB/sec
[      1024 bytes] CPU->GPU=  60.860 MB/sec, GPU->CPU=  54.729 MB/sec
[      2048 bytes] CPU->GPU= 114.138 MB/sec, GPU->CPU= 100.031 MB/sec
[      4096 bytes] CPU->GPU= 245.733 MB/sec, GPU->CPU= 258.360 MB/sec
[      8192 bytes] CPU->GPU= 526.369 MB/sec, GPU->CPU= 536.562 MB/sec
[     16384 bytes] CPU->GPU= 943.138 MB/sec, GPU->CPU= 734.267 MB/sec
[     32768 bytes] CPU->GPU=   1.667 GB/sec, GPU->CPU= 776.676 MB/sec
[     65536 bytes] CPU->GPU=   2.219 GB/sec, GPU->CPU= 791.132 MB/sec
[    131072 bytes] CPU->GPU=   2.475 GB/sec, GPU->CPU= 801.243 MB/sec
[    262144 bytes] CPU->GPU=   2.547 GB/sec, GPU->CPU= 805.337 MB/sec
[    524288 bytes] CPU->GPU=   2.576 GB/sec, GPU->CPU= 806.916 MB/sec
[   1048576 bytes] CPU->GPU=   2.608 GB/sec, GPU->CPU= 808.295 MB/sec
[   2097152 bytes] CPU->GPU=   2.619 GB/sec, GPU->CPU= 808.892 MB/sec
[   4194304 bytes] CPU->GPU=   2.623 GB/sec, GPU->CPU= 809.165 MB/sec
[   8388608 bytes] CPU->GPU=   2.626 GB/sec, GPU->CPU= 809.305 MB/sec
[  16777216 bytes] CPU->GPU=   2.626 GB/sec, GPU->CPU= 809.318 MB/sec
[  33554432 bytes] CPU->GPU=   2.626 GB/sec, GPU->CPU= 809.408 MB/sec
[  67108864 bytes] CPU->GPU=   2.628 GB/sec, GPU->CPU= 809.432 MB/sec
[ 134217728 bytes] CPU->GPU=   2.629 GB/sec, GPU->CPU= 809.446 MB/sec
[ 134217728 bytes] CPU->GPU=   1.314 GB/sec, GPU->CPU= 404.726 MB/sec
Peak CPU->GPU Bandwidth =   2.629 GB/sec [data size = 134217728 bytes]
Peak GPU->CPU Bandwidth = 809.446 MB/sec [data size = 134217728 bytes]

 

That result is strange.

The 4890's GPU->CPU speed is much slower than the 3850's.

Please help me.
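
The peak figures from the two PCIeSpeedTest runs above can be compared directly. The following is a rough sketch: it assumes the tool reports decimal units (1 GB = 1000 MB), which the output does not state, and it reuses the 1920x1080, 1 byte per pixel image from the first post purely as an example payload.

```python
# Compare the peak PCIeSpeedTest figures reported above for the two cards.
# Assumption: the tool's GB/MB are decimal (1 GB = 1000 MB); the log does not say.
PEAKS_MB_S = {
    "Radeon 3850 (RV670)": {"cpu_to_gpu": 2628.0, "gpu_to_cpu": 2023.0},
    "Radeon 4890 (RV770)": {"cpu_to_gpu": 2629.0, "gpu_to_cpu": 809.446},
}

# Example payload: 1920x1080 image at 1 byte per pixel (an assumption).
IMAGE_BYTES = 1920 * 1080


def transfer_ms(n_bytes, bandwidth_mb_s):
    """Ideal transfer time in milliseconds at the given bandwidth."""
    return n_bytes / (bandwidth_mb_s * 1e6) * 1000.0


def ratio(direction):
    """How many times faster the 3850 is than the 4890 in one direction."""
    slow = PEAKS_MB_S["Radeon 4890 (RV770)"][direction]
    fast = PEAKS_MB_S["Radeon 3850 (RV670)"][direction]
    return fast / slow


if __name__ == "__main__":
    for direction in ("cpu_to_gpu", "gpu_to_cpu"):
        print(f"{direction}: 3850 is {ratio(direction):.2f}x the 4890")
        for card, peaks in PEAKS_MB_S.items():
            ms = transfer_ms(IMAGE_BYTES, peaks[direction])
            print(f"  {card}: ~{ms:.2f} ms for one HD grey image")
```

On these numbers the CPU->GPU direction (the one StreamWrite uses) is essentially identical on both cards, while GPU->CPU readback on the 4890 is roughly 2.5 times slower. That suggests the readback gap and the StreamWrite latency difference may be separate effects.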
