2 Replies Latest reply on Jun 29, 2011 1:36 PM by Lonesled

    GPU PerfStudio And APP Profiler problems

    n3XusSLo

      Hello,

       

      1st problem, regarding AMD APP Profiler:

      When I compile a compute shader from HLSL code using D3DX11CompileFromFileA and run under APP Profiler, it crashes when I call ID3D11Device::CreateComputeShader.

      But if I precompile the shader beforehand, store it as a binary, and then load that binary using D3DX11CompileFromFileA, it works just fine.

      This happens only for compute shaders; all other shader types work fine.

       

       

       

      2nd problem, regarding GPU PerfStudio:

      If I compile any kind of shader from HLSL code with D3DX11CompileFromFileA, it crashes; but if I precompile all my shaders beforehand and load the binaries with D3DX11CompileFromFileA, it works.

       

       

       

      3rd problem:

      When debugging a compute shader, GPU PerfStudio gives me this error:

       

      Error: ShaderDebugger:  CSShaderDebuggerDestBufferDX11::Create() failed. CreateBuffer returned 80070057.
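For reference, the number in that message is a Windows HRESULT: 80070057 is E_INVALIDARG, meaning that CreateBuffer rejected one of its arguments. The sketch below only illustrates how the value splits into fields; the struct and function names are hypothetical, and the bit layout follows the HRESULT_FACILITY and HRESULT_CODE macros from winerror.h.

```cpp
#include <cstdint>

// Fields of a Windows HRESULT, split the same way the winerror.h macros do.
struct HResultParts {
    bool failed;             // bit 31 set means failure
    std::uint32_t facility;  // bits 16..28 (HRESULT_FACILITY)
    std::uint32_t code;      // bits 0..15  (HRESULT_CODE)
};

// Hypothetical helper, not part of any SDK.
constexpr HResultParts decode_hresult(std::uint32_t hr) {
    return HResultParts{(hr >> 31) != 0, (hr >> 16) & 0x1FFFu, hr & 0xFFFFu};
}

// The value from the error message above.
constexpr HResultParts kParts = decode_hresult(0x80070057u);

static_assert(kParts.failed, "severity bit is set");
static_assert(kParts.facility == 7, "facility 7 is FACILITY_WIN32");
static_assert(kParts.code == 87, "code 87 is ERROR_INVALID_PARAMETER");
```

Facility 7 with code 87 is the Win32 "invalid parameter" error wrapped as an HRESULT, so the failure is about the arguments of the buffer that the shader debugger tries to create rather than, say, running out of memory.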

      I tried different shader compile flags, but they don't seem to have any effect. The problem occurs with only one compute shader; I can debug the others properly. Here is the shader code:

       

       



       

      #include "nxe_res//shaders//CBPresets.hlsl"

      Texture2D inTex0:register (t0);
      RWTexture2D outTex0:register(u0);

      groupshared float gs0[X_THREADS_CSK_SumHorizontal][Y_THREADS_CSK_SumVertical];

      [numthreads(X_THREADS_CSK_SumHorizontal, Y_THREADS_CSK_SumVertical, 1)]
      void CSK_SumHorizontal(uint3 tid : SV_DispatchThreadID, uint3 gtid : SV_GroupThreadID, uint3 gid : SV_GroupID)
      {
          int2 numGid = g_cbCSInt4.xy; // ConstantBuffer
          int2 numTid = g_cbCSInt4.zw; // ConstantBuffer

          // Load everything into GS
          gs0[tid.x][tid.y] = inTex0[uint2(gtid.x, gtid.y)];

          GroupMemoryBarrierWithGroupSync();

          // 1st x thread computes and writes result
          if (tid.x == 0)
          {
              float sum = 0.0f;
              for (unsigned int i = 0; i <= tid.x; ++i)
              {
                  sum += gs0[i][tid.y];
              }

              outTex0[gid.xy] = 1.0f;
          }
      }
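A note on the indexing in the shader above, which may or may not be related to the debugger error: in D3D11 compute shaders, SV_DispatchThreadID equals SV_GroupID * numthreads + SV_GroupThreadID per component. The shader indexes the groupshared array with the dispatch-thread ID and the texture with the group-thread ID. The CPU-side sketch below assumes a hypothetical 16x16 group size (the real X_THREADS_CSK_SumHorizontal and Y_THREADS_CSK_SumVertical values live in CBPresets.hlsl, which is not shown) and checks which indices fit a groupshared array of that size.

```cpp
#include <cstdint>

// Hypothetical group size; the real values are defined in CBPresets.hlsl.
constexpr std::uint32_t kGroupX = 16;
constexpr std::uint32_t kGroupY = 16;

struct UInt2 { std::uint32_t x, y; };

// SV_DispatchThreadID = SV_GroupID * numthreads + SV_GroupThreadID (per component).
constexpr UInt2 dispatch_thread_id(UInt2 gid, UInt2 gtid) {
    return UInt2{gid.x * kGroupX + gtid.x, gid.y * kGroupY + gtid.y};
}

// True when an index pair fits a groupshared array declared as [kGroupX][kGroupY].
constexpr bool fits_groupshared(UInt2 idx) {
    return idx.x < kGroupX && idx.y < kGroupY;
}

// In group (0,0) the dispatch-thread ID equals the group-thread ID, so it fits.
static_assert(fits_groupshared(dispatch_thread_id(UInt2{0, 0}, UInt2{15, 15})),
              "group (0,0) stays in range");
// In group (1,0) the dispatch-thread x runs from 16 to 31 and no longer fits.
static_assert(!fits_groupshared(dispatch_thread_id(UInt2{1, 0}, UInt2{0, 0})),
              "group (1,0) is out of range");
```

Under these assumptions, only the first group stays inside the array when it is indexed by the dispatch-thread ID, whereas the group-thread ID is always in range for any group.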

       

       

       

       

       

       



       

       

      Does anyone have any idea what may be causing any of these problems?