eduardoschardong
Journeyman III

6870's wavefronts

Micah, the new HD58XX's?

 

Well... about the new HD68XXs, what improvements do they bring for compute?

MicahVillmow
Staff

Sorry, typo on my part, it should be HD68XX. The HD68XX has 12/14 SIMDs and has improvements which lower the cost of thread scheduling. This means that flow control clauses don't require as many cycles.
gat3way
Journeyman III

As far as OpenCL is concerned, do we have the same memory limits as on the 5xxx? E.g. LDS, __constant, etc.


What about double precision?

MicahVillmow
Staff

The HD68XX cards do not have double precision and the hardware memory limits have not changed.
nou
Exemplar

Originally posted by: MicahVillmow Sorry, typo on my part, it should be HD68XX. The HD68XX has 12/14 SIMDs and has improvements which lower the cost of thread scheduling. This means that flow control clauses don't require as many cycles.


Great. Can we expect more GPGPU optimization on Cayman than on Barts?

empty_knapsack
Adept II

It turns out that it has been possible to compile IL code to the new 6XXX ISA since at least Catalyst 10.6. New targets, 12 to 19, were added to the calclCompile() function. While 12-14 and 17-19 produce exactly the same code as for Cypress/Juniper (only the header differs, by 1-4 bytes), and one of these probably matches the Barts ISA, 15 and 16 are a totally different story. For example, some code compiled for 5XXX starts as:

2 z: ADD_INT ____, R2.y, R0.w
  t: MULLO_UINT T0.y, R1.z, R3.x
3 z: MOV R0.z, KC0[0].z
  w: ADD_INT T1.w, R0.x, PV2.z
  t: MOV R0.w, KC0[0].w
4 t: MULLO_UINT T0.w, T0.y, R3.y
5 t: MULLO_UINT ____, R1.y, R3.x
6 y: ADD_INT ____, T0.w, PS5
7 w: ADD_INT ____, R1.x, PV6.y
8 z: LSHL ____, PV7.w, (0x00000006, 8.407790786e-45f).x
9 y: ADD_INT T0.y, T1.w, PV8.z

And for target == 15 it became:

2 x: MULLO_UINT ____, R1.z, R2.x
  y: MULLO_UINT ____, R1.z, R2.x
  z: MULLO_UINT ____, R1.z, R2.x
  w: MULLO_UINT ____, R1.z, R2.x
3 x: MULLO_UINT ____, PV2.y, R2.y
  y: MULLO_UINT ____, PV2.y, R2.y
  z: MULLO_UINT ____, PV2.y, R2.y
  w: MULLO_UINT T0.w, PV2.y, R2.y
4 x: MULLO_UINT ____, R1.y, R2.x
  y: MULLO_UINT ____, R1.y, R2.x
  z: MULLO_UINT ____, R1.y, R2.x
  w: MULLO_UINT ____, R1.y, R2.x
5 y: ADD_INT ____, T0.w, PV4.z
  z: ADD_INT ____, R3.y, R0.w
6 x: ADD_INT T0.x, R0.x, PV5.z
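
For anyone who wants to compare the two dumps mechanically rather than by eye, here is a rough Python sketch that tallies which lane (x, y, z, w, t) each opcode is issued on. The excerpts are copied from the two listings above; the helper names (lane_usage, lanes_for) and the regex are my own illustration and only assume the "group lane: OPCODE ..." layout shown here, not any official disassembly format.

```python
import re
from collections import Counter

# Excerpts copied from the two listings above (not the full dumps).
CYPRESS = """\
2 z: ADD_INT ____, R2.y, R0.w
t: MULLO_UINT T0.y, R1.z, R3.x
3 z: MOV R0.z, KC0[0].z
w: ADD_INT T1.w, R0.x, PV2.z
t: MOV R0.w, KC0[0].w
4 t: MULLO_UINT T0.w, T0.y, R3.y
"""

TARGET_15 = """\
2 x: MULLO_UINT ____, R1.z, R2.x
y: MULLO_UINT ____, R1.z, R2.x
z: MULLO_UINT ____, R1.z, R2.x
w: MULLO_UINT ____, R1.z, R2.x
5 y: ADD_INT ____, T0.w, PV4.z
z: ADD_INT ____, R3.y, R0.w
"""

# Optional instruction-group number, then the lane letter, then the opcode.
LINE = re.compile(r"^\s*(?:\d+\s+)?([xyzwt]):\s+(\w+)")


def lane_usage(listing):
    """Count how often each (lane, opcode) pair appears in a listing."""
    usage = Counter()
    for line in listing.splitlines():
        match = LINE.match(line)
        if match:
            usage[match.groups()] += 1
    return usage


def lanes_for(usage, opcode):
    """Return the sorted lanes on which the given opcode was issued."""
    return sorted(lane for (lane, op) in usage if op == opcode)


if __name__ == "__main__":
    for name, listing in (("5XXX", CYPRESS), ("target 15", TARGET_15)):
        usage = lane_usage(listing)
        print(name, "MULLO_UINT lanes:", lanes_for(usage, "MULLO_UINT"))
```

On these excerpts the 5XXX code issues MULLO_UINT only on the t lane, while the target 15 code issues it on x, y, z and w and never touches t.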

There are 32-bit multiplications in each of the XYZW units and no references to the T unit anymore. I guess that's the Cayman we're looking for. Though if it contains 16 thread processors per SIMD (as in current GPUs) with 4 stream cores each (vs. the current 5), the value of 1760 SP (speculated, of course) for the 5950 looks weird.
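
To spell out why 1760 looks odd, here is the arithmetic as a tiny Python sketch. It assumes, as in the post, 16 thread processors per SIMD; the function name simd_count is just for illustration, and 1760 is the rumoured figure, not a confirmed spec.

```python
THREAD_PROCESSORS_PER_SIMD = 16  # assumption from the post: same as current GPUs


def simd_count(total_sp, cores_per_thread_processor):
    """SIMDs implied by a stream-processor total (fractional if it does not divide)."""
    return total_sp / (THREAD_PROCESSORS_PER_SIMD * cores_per_thread_processor)


if __name__ == "__main__":
    for width in (5, 4):
        print(f"{width} stream cores each: {simd_count(1760, width)} SIMDs")
```

With 5 stream cores per thread processor, 1760 SP works out to a whole number of SIMDs (22), but with 4 it gives 27.5, which cannot be a real configuration. So either the rumoured total or the assumed layout would have to be off.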

Gipsel
Adept I

Originally posted by: empty_knapsack It turns out that it's possible to compile IL code to new 6XXX ISA at least from Catalyst 10.6.


Yes, that was discussed over at Beyond3D starting here.

empty_knapsack
Adept II

The funniest thing is that this 4D VLIW compilation has been available since Catalyst 10.4 (the same time ATI broke support for the 2nd core of the 5970), but nobody discovered it until this October, AFAIK.

Gipsel
Adept I

Originally posted by: empty_knapsack The funniest thing is that this 4D VLIW compilation has been available since Catalyst 10.4 (the same time ATI broke support for the 2nd core of the 5970), but nobody discovered it until this October, AFAIK.


Personally, I first saw references to the Northern Islands codename(s), and to support for the t lane being dropped, in Catalyst 9.8, i.e. right at the Cypress launch (they may have been in there even slightly longer; I was too lazy to check). There was an error message saying that issuing instructions to the t lane is scheduled for removal in Northern Islands. But I have not tried whether the compilation actually works (I doubt it a bit, as several NI-specific instructions were added only later on). I saved that for the launch of the HD6800 line.

bayoumi
Journeyman III

Is AMD abandoning double precision in its future cards, or is this 6XXX card line a special case?
