
ribalda
Journeyman III

Drop in fglrx OpenCL performance: 14.12 vs 15.5

Hello. The latest Linux driver shows much worse performance for float16 and double16. Is this expected?

Measured with clpeak:

driver_version | 1642.5 (sse2) -> 1702.3 (sse2)
float16        | 9.96633 -> 1.91836 Mflops
double16       | 2.37198 -> 0.640011 Mflops
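To put the drop in perspective, here is a minimal sketch that turns the figures above into a slowdown factor and a percentage drop. The `slowdown` helper and the `results` dictionary are illustrative names, not part of clpeak; the numbers are copied from the post as reported.

```python
# Throughput readings copied from the post (14.12 -> 15.5), units as reported.
results = {
    "float16": (9.96633, 1.91836),
    "double16": (2.37198, 0.640011),
}


def slowdown(before, after):
    """Return (ratio, percent_drop) between two throughput readings."""
    ratio = before / after
    percent_drop = (1.0 - after / before) * 100.0
    return ratio, percent_drop


for name, (old, new) in results.items():
    ratio, drop = slowdown(old, new)
    print(f"{name}: {ratio:.2f}x slower ({drop:.1f}% drop)")
```

By this arithmetic, float16 is roughly five times slower and double16 a little under four times slower on the newer driver.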

clinfo differences:

SVM capabilities:
- Coarse grain buffer: Yes
- Fine grain buffer: Yes
- Fine grain system: Yes
- Atomics: Yes
+ Coarse grain buffer: No
+ Fine grain buffer: No
+ Fine grain system: No
+ Atomics: No
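For reference, the four Yes/No lines correspond to bits in the `CL_DEVICE_SVM_CAPABILITIES` bitfield defined by OpenCL 2.0. The sketch below decodes such a bitfield in plain Python. `decode_svm_caps` is a hypothetical helper, and the two sample values (all four bits set, and zero) are inferred from the Yes/No lists above rather than read from the driver.

```python
# Bit values of cl_device_svm_capabilities from the OpenCL 2.0 header (cl.h).
CL_DEVICE_SVM_COARSE_GRAIN_BUFFER = 1 << 0
CL_DEVICE_SVM_FINE_GRAIN_BUFFER = 1 << 1
CL_DEVICE_SVM_FINE_GRAIN_SYSTEM = 1 << 2
CL_DEVICE_SVM_ATOMICS = 1 << 3

SVM_FLAGS = [
    ("Coarse grain buffer", CL_DEVICE_SVM_COARSE_GRAIN_BUFFER),
    ("Fine grain buffer", CL_DEVICE_SVM_FINE_GRAIN_BUFFER),
    ("Fine grain system", CL_DEVICE_SVM_FINE_GRAIN_SYSTEM),
    ("Atomics", CL_DEVICE_SVM_ATOMICS),
]


def decode_svm_caps(bits):
    """Map each SVM capability name to True/False for a given bitfield."""
    return {name: bool(bits & flag) for name, flag in SVM_FLAGS}


# Assumed sample values: 0xF matches the four "Yes" lines, 0 the four "No" lines.
for label, bits in (("all four set", 0xF), ("none set", 0x0)):
    print(label)
    for name, supported in decode_svm_caps(bits).items():
        print(f"  {name}: {'Yes' if supported else 'No'}")
```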

Any pointers will be appreciated. Thanks!

0 Likes
11 Replies
dipak
Big Boss

Could you please provide a reproducible test case? Please also mention the setup details.

0 Likes

Hi

Just run clpeak on 14.12 and on 15.5

krrishnarraj/clpeak · GitHub

There are two issues

1) SVM capabilities have changed.

2) On the CPU, the performance for double16 and float16 has dropped. I guess you could replicate it with any GPU.

Thanks!

root@qt5022:~# uname -a

Linux qt5022 4.0.0 #1 SMP Fri Jun 5 15:50:44 CEST 2015 x86_64 GNU/Linux

root@qt5022:~# lspci -d 1002:9806 -vvv

00:01.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Wrestler [Radeon HD 6320] (prog-if 00 [VGA controller])

  Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Wrestler [Radeon HD 6320]

  Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+

  Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-

  Latency: 0, Cache Line Size: 32 bytes

  Interrupt: pin A routed to IRQ 45

  Region 0: Memory at a0000000 (32-bit, prefetchable) [size=256M]

  Region 1: I/O ports at 2000 [size=256]

  Region 2: Memory at d0200000 (32-bit, non-prefetchable) [size=256K]

  Expansion ROM at <unassigned> [disabled]

  Capabilities: [50] Power Management version 3

  Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)

  Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-

  Capabilities: [58] Express (v2) Root Complex Integrated Endpoint, MSI 00

  DevCap: MaxPayload 128 bytes, PhantFunc 0

  ExtTag+ RBE+

  DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-

  RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+

  MaxPayload 128 bytes, MaxReadReq 128 bytes

  DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-

  DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported

  DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled

  Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+

  Address: 00000000fee0300c  Data: 4143

  Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>

  Kernel driver in use: fglrx_pci

  Kernel modules: fglrx

0 Likes

Thanks Ricardo. We'll check and get back to you.

0 Likes

Hi Ricardo,

I did find much lower GFLOPS for double16, but the GFLOPS for float16 were almost the same (please see the attached files). I'll report this to the driver team.

FYI: Currently, on the AMD platform, OpenCL 2.0 features such as SVM and device-side enqueue are not supported on CPUs. So I guess the difference in reported SVM capabilities has no effect on performance.

Regards,

0 Likes

Hello

FYI: Just tried with 15.7 and I still get the error.

Regards!

0 Likes
ribalda
Journeyman III

Hello

With 15.9, there is exactly the same error. So do you have all you need to replicate the error on your side? Can I help you somehow? Is somebody looking into this?

Regards!

0 Likes

Yes, the issue has already been reported to the engineering team and they are working on it. As soon as I get any update, I'll share it with you. Please be patient.

Regards,

0 Likes

What could be a reasonable timeframe for fixing this bug?

Thanks

0 Likes

I checked, and the issue is still open. Sorry, I can't comment on any timeline at this moment.

Regards,

0 Likes

Update:

The dev team has identified the likely reason for the above performance impact: most probably, some optimizations were disabled for CPU devices, so the drop is expected. However, they can't provide a timeframe for a fix right now.

Regards,

0 Likes

Hi again dipak

Any news to share? At least to celebrate the 9-month anniversary of the bug?

BTW, it is also failing on 15.12

0 Likes