genaganna
Journeyman III

Increase GPR usage with new SDK and Driver?

Originally posted by: Curious cat
Originally posted by: genaganna Are you facing any problem with cl_amd_fp64 extension?

Yes, this:

OpenCL Compile Error: clBuildProgram failed (CL_BUILD_PROGRAM_FAILURE).

Line 10: error: can't enable all OpenCL extensions or unrecognized OpenCL extension
#pragma OPENCL EXTENSION cl_amd_fp64 : enable
^

Could you please run the MatrixMulDouble sample that ships with the SDK and check whether it runs?

Curiouscat
Journeyman III

Originally posted by: genaganna Could you please run MatrixMulDouble sample coming from SDK and see whether it is running or not?

Yes, it runs with "--device cpu" on the command line (am on a laptop right now, no AMD graphics). So maybe it's just the SKA 1.6 that's borked.

When I try to target x86 Assembly with the SKA, I get "OpenCL Compile Error: X86 asm output is not currently supported." It does work without the cl_amd_fp64 pragma (but only produces stats for GPUs, and no x86 assembly).

HarryH
Journeyman III

Originally posted by: genaganna
Originally posted by: Curious cat
Originally posted by: Jawed Does

#pragma OPENCL EXTENSION cl_amd_fp64 : enable

work for you?

Are you facing any problem with cl_amd_fp64 extension?

http://www.khronos.org/registry/cl/sdk/1.0/docs/man/xhtml/

http://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/

Weird thing: the PDF specs say to use enable / disable, while the online man pages say to use require instead of enable.

Curiouscat
Journeyman III

MatrixMulDouble_Kernels.cl uses "enable", but oddly, the SKA does not complain about "require". Now I'm totally confused.

Judging by the stats though, "require" is simply ignored.
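For anyone hitting the same build failure: one way to avoid hard-coding the pragma is to check the device's extension string on the host before building. The sketch below assumes that string has already been fetched with clGetDeviceInfo(..., CL_DEVICE_EXTENSIONS, ...), which returns a space-separated list. The helper names has_extension and fp64_pragma_for are made up for illustration and are not part of any SDK. It won't fix SKA itself; it only picks a pragma line to prepend to the kernel source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if `name` appears as a whole space-separated token in `list`.
 * Hypothetical helper, not an OpenCL API call. */
static int has_extension(const char *list, const char *name)
{
    size_t len = strlen(name);
    const char *p = list;

    assert(list != NULL && len > 0);
    while ((p = strstr(p, name)) != NULL) {
        int starts_token = (p == list) || (p[-1] == ' ');
        int ends_token = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token)
            return 1;
        p++;  /* partial match only; keep searching after this position */
    }
    return 0;
}

/* Picks the pragma line to prepend to a kernel that uses double.
 * Prefers the Khronos extension, falls back to the AMD one,
 * and returns NULL if the device advertises neither. */
const char *fp64_pragma_for(const char *extensions)
{
    if (has_extension(extensions, "cl_khr_fp64"))
        return "#pragma OPENCL EXTENSION cl_khr_fp64 : enable\n";
    if (has_extension(extensions, "cl_amd_fp64"))
        return "#pragma OPENCL EXTENSION cl_amd_fp64 : enable\n";
    return NULL;
}
```

A NULL result would be the cue to skip the double-precision kernel (or fall back to float) instead of letting clBuildProgram fail.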

Curiouscat
Journeyman III

I've played some more with the SKA. An example of perplexing behaviour:

Start with three kernels, call them kernel_A, kernel_B and kernel_C, which all take the same arguments and perform similar computations. Individually, they use no scratch registers and have similar throughputs; call those thru_A, thru_B and thru_C (MThreads/s).

Now combine them into a single kernel which takes the same arguments, by simply turning their bodies into blocks of the new kernel. Since there are no shared variables between the blocks, I would expect the compiler to treat each block as it treated the original kernel body. I would still expect to see no scratch register usage and throughput given by 1/(1/thru_A + 1/thru_B + 1/thru_C).

Instead, I now get plenty of scratch register usage and significantly lower throughput than expected.
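To make the expected figure concrete: if the merged kernel just runs the three bodies back to back, the per-thread times (1/throughput) add, so the throughputs combine harmonically. A minimal sketch in C; the function name is hypothetical and the numbers in the note below are illustrative, not measurements from SKA.

```c
#include <assert.h>
#include <stddef.h>

/* Expected throughput of a kernel that runs n bodies back to back:
 * per-thread times add, so the result is 1 / sum(1 / thru[i]).
 * Units follow the inputs (e.g. MThreads/s). */
double expected_merged_throughput(const double *thru, size_t n)
{
    double time_per_thread = 0.0;

    assert(thru != NULL && n > 0);
    for (size_t i = 0; i < n; ++i) {
        assert(thru[i] > 0.0);  /* a throughput must be positive */
        time_per_thread += 1.0 / thru[i];
    }
    return 1.0 / time_per_thread;
}
```

For example, three kernels at 300 MThreads/s each would give an expected merged throughput of about 100 MThreads/s; anything well below that suggests the merge itself is costing something, such as the scratch register usage.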

ryta1203
Journeyman III

Originally posted by: Curious cat I've played some more with the SKA. An example of perplexing behaviour:

Start with three kernels, call them kernel_A, kernel_B and kernel_C, which all take the same arguments and perform similar computations. Individually, they use no scratch registers and have similar throughputs; call those thru_A, thru_B and thru_C (MThreads/s).

Now combine them into a single kernel which takes the same arguments, by simply turning their bodies into blocks of the new kernel. Since there are no shared variables between the blocks, I would expect the compiler to treat each block as it treated the original kernel body. I would still expect to see no scratch register usage and throughput given by 1/(1/thru_A + 1/thru_B + 1/thru_C).

Instead, I now get plenty of scratch register usage and significantly lower throughput than expected.

Have you looked at the ISA and played with moving instructions around?

It turns out that simply cascading kernels is not the best way to get good results. I won't go into much detail, but it's not too hard to get the merged kernel down to the same register usage as max(kernA, kernB, kernC); you will need to look at the code, though, and possibly move it around.

ryta1203
Journeyman III

Originally posted by: Jawed Are you using the update version of 10.7?

update driver

Jawed,

If you are talking to me then yes, per my original post.

Curiouscat
Journeyman III

Originally posted by: ryta1203 Have you looked at the ISA and played with moving instructions around?

No. The point being that if the compiler were behaving reasonably, it would reuse all registers employed in block A when doing block B, and again reuse all registers employed in block B when doing block C. Instead, it's spilling registers. If I were a compiler developer, I would want to understand why; it may well be the same problem causing the increased register use in 2.2 vs 2.1.
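The shape of the merge being discussed can be sketched in plain C. The real kernels were never posted in this thread, so the three bodies below are invented stand-ins, and the name merged_kernel and its parameters are hypothetical; in OpenCL C this would be a __kernel function with gid taken from get_global_id(0).

```c
#include <stddef.h>

/* Three independent bodies merged into one function as scoped blocks.
 * The arithmetic is a placeholder, not the poster's actual kernels. */
void merged_kernel(const float *in, float *out_a, float *out_b,
                   float *out_c, size_t gid)
{
    {   /* block A: t0 and t1 go out of scope at the closing brace */
        float t0 = in[gid] * 2.0f;
        float t1 = t0 + 1.0f;
        out_a[gid] = t1;
    }
    {   /* block B: shares no variables with block A */
        float t0 = in[gid] - 3.0f;
        float t1 = t0 * t0;
        out_b[gid] = t1;
    }
    {   /* block C: shares no variables with blocks A or B */
        float t0 = in[gid] + 0.5f;
        float t1 = t0 * 4.0f;
        out_c[gid] = t1;
    }
}
```

At the source level no temporary outlives its block, which is why peak register demand of roughly the largest single block seems a reasonable expectation. Braces do not constrain instruction scheduling, though: a compiler may hoist or interleave independent work from different blocks, and that can lengthen live ranges.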

genaganna
Journeyman III

Originally posted by: Curious cat
Originally posted by: ryta1203 Have you looked at the ISA and played with moving instructions around?

No. The point being that if the compiler were behaving reasonably, it would reuse all registers employed in block A when doing block B, and again reuse all registers employed in block B when doing block C. Instead, it's spilling registers. If I were a compiler developer, I would want to understand why; it may well be the same problem causing the increased register use in 2.2 vs 2.1.

Curious cat,

Could you please post your three kernels here? That would help us see what is going wrong.

Jawed
Adept II

There are two releases of Catalyst 10.7, so I can't tell which you are using.

For this reason SKA cannot be relied upon, because when 10.7 is selected internally for compilations, you don't know which release of 10.7 is being used.

(For the record: I've got no experience of any of this, as I haven't installed SDK 2.2, nor Catalyst 10.7, nor SKA 1.6).
