sdar reduced the samples and changed the tile settings to increase the render speed. The accepted practice is to render with the existing settings, so as to get a fair comparison between hardware.
I did not change the settings in any way, so my results can be directly compared against any Nvidia card that has run the same test without modified settings (e.g. a GTX 570).
That holds for the number of samples tested.
For example, rendering Mike Pan's BMW model at 2048 samples takes 34 minutes 02 seconds on a 7950 Boost; my old Core 2 Quad using OpenCL takes 31 minutes 23 seconds, and an Nvidia card takes around 5 to 9 minutes.
I've seen some tests where an i7 2600K was ~37% faster than an AMD 7950 rendering in Cycles using OpenCL. That's not a good result for a 7950.
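To put the timings quoted above side by side, here is a small Python sketch that converts them to seconds and computes the ratios. The numbers are the ones from this post; the function and variable names are only for illustration.

```python
# Render times quoted above for Mike Pan's BMW scene at 2048 samples.
def to_seconds(minutes, seconds=0):
    """Convert a minutes/seconds pair to total seconds."""
    return minutes * 60 + seconds

hd7950_opencl = to_seconds(34, 2)      # Radeon HD 7950 Boost, OpenCL
core2quad_opencl = to_seconds(31, 23)  # Core 2 Quad, OpenCL on the CPU
nvidia_fast = to_seconds(5)            # lower end of the quoted Nvidia range
nvidia_slow = to_seconds(9)            # upper end of the quoted Nvidia range

# How many times longer the 7950 takes than each alternative.
vs_cpu = hd7950_opencl / core2quad_opencl
vs_nvidia_best = hd7950_opencl / nvidia_fast
vs_nvidia_worst = hd7950_opencl / nvidia_slow

print(f"7950 vs Core 2 Quad: {vs_cpu:.2f}x the render time")
print(f"7950 vs Nvidia: {vs_nvidia_worst:.1f}x to {vs_nvidia_best:.1f}x the render time")
```

So on these numbers the 7950 needs about 1.08 times as long as the old CPU, and roughly 3.8 to 6.8 times as long as the Nvidia cards.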
Thank you himanshu for all your work! We've made big steps, but it's still not working 100%. The Cycles devs have disabled many features on AMD hardware to ensure the kernel compiles on AMD cards (they can be enabled in \Path\to\Blender\2.69\scripts\addons\cycles\kernel\kernel_types.h). So, more or less, we're working with a crippled kernel. These are the current default features, which work on our AMD hardware (even if it's slower than CPU rendering, it compiles and runs fine):
Also defined as __KERNEL_SHADING__ in short.
However, all the other kernels (Nvidia OpenCL, Intel OpenCL, Nvidia CUDA) are running with the full feature-set of Cycles without any trouble. These features are:
Also defined as __KERNEL_ADV_SHADING__
These features are all modular, so one can enable and disable them one by one. This is what I've done and what I want to share with you. Enabling __KERNEL_SHADING__ alone works fine with AMD hardware. If you additionally enable __AO__, things still work fine, and the same holds if you also enable __ANISOTROPIC__. But as soon as you enable __TRANSPARENT_SHADOWS__ too, Blender gives you the "Insufficient Private Resources" error, which is not a Blender error but comes from the AMD compiler running out of memory. With __TRANSPARENT_SHADOWS__ enabled together with __KERNEL_SHADING__, but without __AO__ and __ANISOTROPIC__, the kernel compiles and runs fine. So there is clearly a limitation in the compiler; otherwise Intel's and Nvidia's OpenCL compilers would run into similar problems. Right now, the compiler runs out of memory with 3 additional features enabled, and in total there are 10. Even if the Cycles devs do their best and optimize the code further, it still won't compile because of the "Insufficient Private Resources" error.
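For anyone who wants to fill in the gaps, the combinations described above can be written down as a small test matrix. This is only a sketch in Python: the outcomes are the ones reported in this post, the helper names are made up for illustration, and every combination not mentioned here is marked "untested" rather than guessed.

```python
from itertools import combinations

# The three optional features discussed above, on top of __KERNEL_SHADING__.
OPTIONAL = ("__AO__", "__ANISOTROPIC__", "__TRANSPARENT_SHADOWS__")

# Outcomes reported in this post, keyed by the set of extra features enabled.
# True = compiles and renders, False = "Insufficient Private Resources".
REPORTED = {
    frozenset(): True,
    frozenset({"__AO__"}): True,
    frozenset({"__AO__", "__ANISOTROPIC__"}): True,
    frozenset({"__TRANSPARENT_SHADOWS__"}): True,
    frozenset(OPTIONAL): False,
}

def all_combinations(features):
    """Yield every subset of the given features, smallest first."""
    for size in range(len(features) + 1):
        for combo in combinations(features, size):
            yield frozenset(combo)

def status(combo):
    """Return 'ok', 'fails' or 'untested' for one feature combination."""
    result = REPORTED.get(combo)
    if result is None:
        return "untested"
    return "ok" if result else "fails"

for combo in all_combinations(OPTIONAL):
    label = " + ".join(sorted(combo)) or "(none)"
    print(f"{label:60s} {status(combo)}")
```

Three of the eight combinations come out as "untested"; those would be the next ones to try when narrowing down where the compiler gives up.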
Again, these are very big steps in the right direction, and we thank you for that and for your support, himanshu! However, there is still work to be done by AMD, as this cannot be seen as fixed yet.
//Edit: A brief overview of my system: Intel Xeon E3-1245v2 Ivy-Bridge, 16GB DDR3-1600, Radeon HD7970GHz, Windows 7 x64 Professional with latest updates, Blender 2.69 RC3 and Catalyst 13.11 Beta.
I need to know how to enable these features while working with Blender.
Do I have to compile the Blender source code?
Or is it something that I can configure while working with Blender? That would be easier, because I could then easily give them a repro case.
If I have an easy repro case, I can tell the AMD engineers: please test with these features enabled.
And it will be a good test-case for them.
Will enabling these features increase the speed of Blender on AMD cards?
Or will they just enable some additional features that make the render more beautiful?
They are working on the Blender issue without making sure they are well connected to the developers! How do they expect to resolve the bug as soon as possible? To get the best performance in Battlefield and Tomb Raider, they make sure to work with the game devs and make sure AMD-specific features like TressFX and Mantle are well implemented in the render engine (Frostbite). They can't say that they don't know the procedure for solving a software issue! Blender Cycles is not the only software that runs into this faulty compiler! There are many other programs that run without OpenCL acceleration because of bad support!
====>> Since the beginning I have thought that they are waiting for HSA GPUs and APUs to solve the bug once and for all! That is a good investment, but please play fair with us.
People will never throw away their Radeon cards, even if HSA works fine!
=====>> About optimisation: there is a well-optimised piece of software called SLG 4.0 (part of LuxRender) that promotes the Radeon 7XXX series through the LuxMark 2.0 benchmark. AMD should try to help them, to have proof of the power of Radeon in render engine software! The render engine is fast enough, but the devs encounter many compiler bugs and they are not supported. LuxMark gives Radeon a big impact in the CG world, so why not give them good support?
It's simple to enable or disable features of Cycles. Just open the file "C:\Program Files\Blender Foundation\Blender\2.69\scripts\addons\cycles\kernel\kernel_types.h" (Notepad will do), and under "#ifdef __KERNEL_OPENCL_AMD__" comment or uncomment the relevant lines. For example, to disable a feature, instead of
#define __EMISSION__, place
//#define __EMISSION__
And so on.
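If you switch features on and off a lot, the same edit can be scripted. The following is a rough Python sketch, not an official tool: set_feature is a made-up helper, SAMPLE is a hypothetical stand-in for the relevant part of kernel_types.h (the real file has more defines), and it assumes there is no nested #ifdef inside the AMD block. Better to try it on a copy of the file first.

```python
import re

def set_feature(header_text, name, enabled):
    """Comment or uncomment the '#define <name>' line inside the AMD block."""
    pattern = re.compile(
        r"^(\s*)(?://\s*)?(#\s*define\s+" + re.escape(name) + r"\b.*)$"
    )
    out = []
    in_amd_block = False
    for line in header_text.splitlines():
        stripped = line.strip()
        if re.match(r"#\s*ifdef\s+__KERNEL_OPENCL_AMD__", stripped):
            in_amd_block = True
        elif in_amd_block and re.match(r"#\s*endif", stripped):
            # Assumption: the first #endif closes the AMD block (no nesting).
            in_amd_block = False
        elif in_amd_block:
            match = pattern.match(line)
            if match:
                prefix = "" if enabled else "//"
                line = match.group(1) + prefix + match.group(2)
        out.append(line)
    return "\n".join(out) + "\n"

# Hypothetical stand-in for the relevant part of kernel_types.h.
SAMPLE = """\
#ifdef __KERNEL_OPENCL_AMD__
#define __EMISSION__
//#define __AO__
#endif
"""

patched = set_feature(SAMPLE, "__EMISSION__", enabled=False)
patched = set_feature(patched, "__AO__", enabled=True)
print(patched)
```

To use it on the real file you would read kernel_types.h into a string, call set_feature, and write the result back; editing under "Program Files" may need administrator rights.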
To your first question:
No, you don't need to build Blender from source. It's the definition file of the OpenCL kernel, which is built every time we try to render something. You might need to restart Blender after changing the file, but there are no difficult tasks involved. It's really trivial and easy to do. I think it's also possible to build the OpenCL kernel without running Blender at all, but that's another question. I'll show you how to enable the advanced kernel features in the latest Blender 2.69 RC3:
First of all, access your default Blender directory; then change directory to the cycles kernel directory and open up kernel_types.h. If you installed it under the default location, it can be found under:
C:\Program Files\Blender Foundation\Blender\2.69\scripts\addons\cycles\kernel\kernel_types.h
Now, look for the following block:
As you can see, some features are commented out. By removing the double slashes, you enable a feature. After modifying and saving kernel_types.h, configure Blender to use the GPU: from a command line, run "cd C:\Program Files\Blender Foundation\Blender\", "set CYCLES_OPENCL_TEST=all" and then "blender.exe"; then open User Preferences in Blender and set the compute device to "Tahiti" (for the HD7970). Again, I've successfully tested uncommenting __AO__ and __ANISOTROPIC__ at the same time, and it worked. But as soon as I additionally uncommented __TRANSPARENT_SHADOWS__, I got the "Insufficient Private Resources" error when compiling the kernel (which happens before the actual rendering). When I disable __AO__ and __ANISOTROPIC__ again and leave __TRANSPARENT_SHADOWS__ enabled, it runs fine again. Enabling features is an easy, fast and trivial task. Hopefully you can reproduce it and forward it to the devs, so they can investigate it further.
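The command-line part of the steps above can also be wrapped in a short Python script, which might be handy for a repeatable repro case. This is just a sketch: it assumes the default install path mentioned above, build_launch and launch_blender are illustrative names, and choosing "Tahiti" as the compute device still has to be done by hand inside Blender.

```python
import os
import subprocess

# Default install location mentioned above; adjust if Blender lives elsewhere.
BLENDER_DIR = r"C:\Program Files\Blender Foundation\Blender"

def build_launch(blender_dir=BLENDER_DIR, base_env=None):
    """Return (command, working_dir, env) for starting Blender with OpenCL enabled."""
    # Copy the environment so the caller's own environment is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    env["CYCLES_OPENCL_TEST"] = "all"  # same as: set CYCLES_OPENCL_TEST=all
    command = [os.path.join(blender_dir, "blender.exe")]
    return command, blender_dir, env

def launch_blender(blender_dir=BLENDER_DIR):
    """Start Blender; the compute device still has to be picked in the UI."""
    command, cwd, env = build_launch(blender_dir)
    return subprocess.Popen(command, cwd=cwd, env=env)

# Usage (not run automatically):
# launch_blender()
```

Setting the variable only in the child process environment keeps it from leaking into other programs started from the same shell.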
To your 2nd question: I don't know what all of the advanced kernel features do in detail. But afaik, they GREATLY improve the final render (and not the speed). E.g. the __HAIR__ feature enables us to calculate / render hair particles on the GPU. When we do production-ready renders, there are tens to hundreds of thousands of hair particles. Rendering them on the GPU significantly improves rendering times, and productivity in general. __TRANSPARENT_SHADOWS__ will help us calculate very complex and good-looking shadows on the GPU. __PASSES__, for example, allows us to render the scene in different layers on the fly and gives us a very advanced tool for compositing. These features are essential for every Blender user. Some of this information might be wrong, but this is what I know.
Again, thank you for your great support!
I already found the solution. But on BlenderArtists we are discussing Cycles vs LuxRender; apparently LuxRender works better with each Catalyst release. What we are trying to find out is whether LuxRender could get better support. In my view it would be better to drop support for Cycles and give priority to LuxRender. The important thing now would be for LuxRender to run in real time, as it used to ((SLG live!)). Material handling could also be a bit more flexible. In terms of GPU, I would also like to use the PC's onboard graphics, but in general it runs very well, so let's hope they keep it up. We are also wondering about Mantle in terms of creating a render engine, and whether that is possible.
Guys, please stop spamming this thread with irrelevant LuxRender things. This is about Cycles and big kernels in general! Dropping Cycles Support in favor of LuxRender? Are you serious? We finally got AMD's attention on Cycles and their OpenCL compiler, don't try to destroy what we have achieved! As soon as they've fixed / improved their OpenCL compiler, THEN we can talk about speed ups. But not now. If you want AMD to pay more attention to LuxRender, please open up a new thread instead of hijacking this one.