OpenCL

bubu
Adept II

Bug in CPU implementation

With ratGPU this is the render I get using the AMD CPU implementation:

ao1.jpg

and this is the correct one, which I get using the GPU and also the Intel CPU implementation.

ao2.jpg

So it seems your compiler is messing up some ray signs, probably due to some kind of optimization applied in the drivers?

11 Replies
himanshu_gautam
Grandmaster

Re: Bug in CPU implementation

I have no idea. But since you are using OpenCL, can you try "-cl-opt-disable" and see whether some optimization is messing it up?

bubu
Adept II

Re: Bug in CPU implementation

Apparently, what messes up the result is the -cl-fast-relaxed-math flag.

If I use -cl-opt-disable or an empty options string, then it works ok.

I think it's a bug in your OpenCL CPU implementation because:

1. It works ok with the Radeon GPU implementation.

2. It works ok with other implementations, like Intel's or NVIDIA's.

Btw, I'm using Windows 7 x64 and Catalyst 13.9/13.11 beta, a Radeon 7790 and ratGPU 0.6.0.
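Since -cl-fast-relaxed-math is an umbrella option (per the OpenCL compiler-option list it sets -cl-finite-math-only and -cl-unsafe-math-optimizations, and the latter includes -cl-mad-enable and -cl-no-signed-zeros), trying the parts one at a time could narrow this down further. A minimal sketch of that loop in plain C follows. The callback type render_ok_fn and the stub are hypothetical names made up for illustration; the real check would build the kernels with the given options string and compare the render against a known-good image.

```c
#include <stddef.h>
#include <string.h>

/* Options implied by -cl-fast-relaxed-math, narrowest first,
   with the umbrella flag itself last. */
static const char *const relaxed_math_parts[] = {
    "-cl-mad-enable",
    "-cl-no-signed-zeros",
    "-cl-finite-math-only",
    "-cl-unsafe-math-optimizations",
    "-cl-fast-relaxed-math",
};

/* Hypothetical callback: build with `options`, render, and return
   nonzero if the image matches the reference. */
typedef int (*render_ok_fn)(const char *options);

/* Returns the first option whose render is wrong, or NULL if all pass. */
const char *first_bad_option(render_ok_fn render_ok)
{
    size_t n = sizeof relaxed_math_parts / sizeof relaxed_math_parts[0];
    for (size_t i = 0; i < n; ++i) {
        if (!render_ok(relaxed_math_parts[i]))
            return relaxed_math_parts[i];
    }
    return NULL;
}

/* Stand-in used only to exercise the loop. It is NOT a measurement:
   it simply pretends that any option enabling MAD breaks the render. */
static int stub_render_ok(const char *options)
{
    return strcmp(options, "-cl-mad-enable") != 0 &&
           strcmp(options, "-cl-unsafe-math-optimizations") != 0 &&
           strcmp(options, "-cl-fast-relaxed-math") != 0;
}
```

Calling first_bad_option(stub_render_ok) walks the list in order and stops at the first failing entry; with a real callback the returned string would say which of the implied options is enough to trigger the wrong render.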

himanshu_gautam
Grandmaster

Re: Bug in CPU implementation

What a runtime does for "-cl-fast-relaxed-math" can be vendor dependent.

Anyway, I will go check...

Thanks for testing and posting the result here,

Best,

Bruhaspati

himanshu_gautam
Grandmaster

Re: Bug in CPU implementation

I hear that "-cl-fast-relaxed-math" is supposed to be more accurate on the CPU than on the GPU.

Can you confirm that what you are seeing is just the opposite?

+

Any repro-case would be helpful.

I visited the ratGPU page.

My proxy blocked the download, but I guess you are supplying only binaries.

Any chance you can give us a small repro-case?

+

What speed gain do you get from the -cl-fast-relaxed-math option on each platform?

That's probably an indication of the degree of accuracy lost.

Best,

Bruhaspati

bubu
Adept II

Re: Bug in CPU implementation

Debugging it, I can clearly see that the sign of some operations is inverted. Things that should be X are -X or 0.0f, causing the bug.

The fast relaxed math improves performance by around 10%, because I'm doing lots of MAD operations as well as 1/sqrt and 1/x operations.

Repro case: download ratGPU 0.6.0 for Windows and test it yourself with Catalyst 13.9/13.11 beta with only the AMD CPU OpenCL device enabled.
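For what it's worth, here is one way a MAD can change a sign, sketched in plain C. This is only an illustration of the general effect, not a diagnosis of what the AMD CPU compiler does with my kernel. It assumes the compiler contracts a*b - c into a single operation with one rounding at the end; the function names are made up for the example, and the fused version is emulated with double arithmetic (the product of two floats is exact in a double).

```c
/* a*b rounded to float, the way a plain multiply stores it. */
float rounded_product(float a, float b)
{
    volatile float p = a * b;   /* volatile forces the rounding to float */
    return p;
}

/* a*b - c with the product rounded to float before subtracting. */
float mul_sub_separate(float a, float b, float c)
{
    volatile float prod = a * b;
    return prod - c;
}

/* a*b - c with the exact product and a single rounding at the end,
   emulating a fused multiply-add. */
float mul_sub_fused(float a, float b, float c)
{
    return (float)((double)a * (double)b - (double)c);
}
```

With a = 1.0f + 5.0f / 16384.0f and c = rounded_product(a, a), mul_sub_separate(a, a, c) is exactly 0.0f, while mul_sub_fused(a, a, c) is a tiny negative number (the rounding error of the product). Any test on the sign of such a value would go a different way depending on which form the compiler emits.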

himanshu_gautam
Grandmaster

Re: Bug in CPU implementation

ratGPU is available as deb package or EXE installer.
Does it install the sources as well?

+

I need a compact test-case. If you think the signs are inverted, can you create a small test-case that shows the problem?

I am planning to develop a repro-case template with the necessary host-code to launch a kernel and get output.

Maybe you can then just plug in your kernel fragment, replace some host stubs, and provide us with a repro-case.

Do you think that would help?

For now, I have attached a test repro-case that I wrote to track down a problem in another thread.

Maybe this can give you a head start in writing your own simple repro-case.
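The checking half of such a repro-case can be quite small. Below is a minimal sketch, separate from the attached template, that compares an output buffer from a known-good build against one from the suspect build and counts the two symptoms reported in this thread: values whose sign flipped and values that became 0.0f. The names sign_report and compare_signs are hypothetical, and how the two buffers are produced (for example, two builds of the same kernel with different options) is left to the host code.

```c
#include <assert.h>
#include <stddef.h>

/* Counts of the two symptoms of interest. */
typedef struct {
    size_t flipped;  /* reference is X, output has the opposite sign */
    size_t zeroed;   /* reference is nonzero, output is 0.0f */
} sign_report;

/* Compares n floats from a reference run against a suspect run. */
sign_report compare_signs(const float *ref, const float *out, size_t n)
{
    sign_report r = {0, 0};
    assert(ref != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i) {
        if ((ref[i] > 0.0f && out[i] < 0.0f) ||
            (ref[i] < 0.0f && out[i] > 0.0f)) {
            ++r.flipped;
        } else if (ref[i] != 0.0f && out[i] == 0.0f) {
            ++r.zeroed;
        }
    }
    return r;
}
```

A repro-case could then print the two counts for each build configuration; anything nonzero points at the elements worth inspecting, which is easier to act on than a wrong image.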

Thanks,

Best,

Bruhaspati

bubu
Adept II

Re: Bug in CPU implementation

Thanks for the small code template. I'll try to locate which part is causing the problem, but it's going to take me some time because the kernel is quite complex.

khan24
Journeyman III

Re: Bug in CPU implementation

I had no idea, but I have now learned enough from this forum. Special thanks to devgurus.amd.com.

pinform
Staff

Re: Bug in CPU implementation

Hi bubu

I'm reviving this thread. Do you still see this issue?

--Prasad
