ryta1203
Journeyman III

Problems with gather

Not sure why this is not compiling (I get "expect gather...blah blah blah"). Any help would be great, from anybody:


kernel void foo(float4 before_in[][], float4 out before_out[][])
{
    float4 temp[100];
    int x = 0;
    if (indexof(before_out) < 3)
    {
        // transfer to temp
        for (x = 0; x < 100; x++)
        {
            temp[x] = before_in[indexof(before_out)];
        }
        // do some work on temp (reads/writes) here

        // transfer to before_out
        for (x = 0; x < 100; x++)
        {
            before_out[indexof(before_out)] = temp[x];
        }
    }
}
0 Likes
23 Replies
Remotion
Journeyman III

Hi,

I think that for now gather is allowed only on 1D streams.

kernel void foo(float4 before_in[][], float4 out before_out[])

Access to a 2D stream must look like this:

float2 id = float2(x,y);

stream2d[id] = 0.0;

Also, it is not indexed with [][] as in C++.

 

 

0 Likes

Remotion,

Thanks. Unfortunately, this did not solve my gather problem with the float4 temp array. For instance, I am still getting compiler errors for

....... = temp[x];

due to the temp[x].

The error is "Semantic Check found 2 errors", both for the ..... = temp[x] lines.
0 Likes
eduardoschardong
Journeyman III

Originally posted by: ryta1203

Not sure why this is not compiling (I get "expect gather...blah blah blah"). Any help would be great, from anybody:


The problem is the scatter, which seems not to be supported yet. If you change the out from [][] to <>, you won't get this error (but you need to check other parts of the code for other errors).

kernel void foo(float4 before_in[][], float4 out before_out<>)

0 Likes

Originally posted by: eduardoschardong

Originally posted by: ryta1203

Not sure why this is not compiling (I get "expect gather...blah blah blah"). Any help would be great, from anybody:

The problem is the scatter, which seems not to be supported yet. If you change the out from [][] to <>, you won't get this error (but you need to check other parts of the code for other errors).

kernel void foo(float4 before_in[][], float4 out before_out<>)


This is not the problem. Also, if I change [][] to <> then the code has all kinds of problems since you can't do random access on a stream AND because of that, it really wouldn't be what I want anyways.

If I change "temp[100]" to just "temp" and "temp[x]" to just "temp", the problem goes away, but obviously this is not what I want, logically speaking.

Why would the local array be causing these problems? Are local arrays not supported? Seems odd if they are not.

My kernel now looks like:

kernel void foo(float4 before_in[][], float4 out before_out[][])
{
    float4 temp[100];
    float2 pos;
    int x = 0;
    if (indexof(before_out) < 3)
    {
        pos.x = indexof(before_out);
        // transfer to temp
        for (x = 0; x < 100; x++)
        {
            pos.y = x;
            t_pos = x;
            temp[t_pos] = before_in[pos];
        }
        // do some work on temp (reads/writes) here

        // transfer to before_out
        for (x = 0; x < 100; x++)
        {
            pos.y = x;
            t_pos = x;
            before_out[pos] = temp[t_pos];
        }
    }
}


Also, I thought scatter was supported, with the current massive limitation being that you have to scatter out 128 bits at a time (float4 or double2), which is what I am doing here.
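As a quick check of the 128-bit figure mentioned above, here is a minimal sketch in Python (used only as a stand-in, since Brook+ itself is not run here). It assumes float4 is four 32-bit floats and double2 is two 64-bit doubles; the constant names are hypothetical.

```python
import struct

# Standard-size little-endian layouts: 'f' is a 32-bit float, 'd' is a 64-bit double.
FLOAT4_BYTES = struct.calcsize("<4f")    # 4 components x 4 bytes each
DOUBLE2_BYTES = struct.calcsize("<2d")   # 2 components x 8 bytes each

# Both element types come out at the same width in bits.
print(FLOAT4_BYTES * 8, DOUBLE2_BYTES * 8)
```

Both values print as 128, which matches the element sizes named in the post.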
0 Likes

It looks like the problem is your temporary array; Brook+ probably just does not support such things yet.

float4 temp[100];

 

0 Likes

Originally posted by: Remotion

It looks like the problem is your temporary array; Brook+ probably just does not support such things yet.

float4 temp[100];


Yes, it seems that this might be the case; however, it would be great to have someone from AMD verify that.

Although, unless this is a hardware limitation, it doesn't make any sense. You could create multiple single variables, so why not be able to create an array of them?
0 Likes

Ryta, this is correct: there are no temporary arrays in Brook+ yet. However, they are available via CAL/IL and CAL/AMDHLSL. There is a CAL sample called scratch_buffer_IL that shows how to use a temp array.
0 Likes

By the way, where can I find AMDHLSL documentation, and how do I use it with CAL?

Remotion

0 Likes

I'm not 100% sure if it is in the current SDK, but if it is, there should be a sample located in the samples\languages\hlsl10 directory.

This should show how it is used within CAL.
0 Likes

Micah,

Are temp arrays going to be available in the near future? This would be a very nice addition, considering CAL is much more time consuming than Brook+ and the setup overhead is quite enormous compared to some other GPGPU alternatives, such as CUDA. For example, it would take me no time at all to code and run that kernel in CUDA. I'm not trying to plug CUDA, but to be competitive with Nvidia don't you think this SDK should, at the very least, have very simple functionality like local C arrays?
0 Likes

I tried something as simple as:

kernel void foo( float4 out b4_in[][], float4 out b4_out[][] )
{
   float2 pos = float2( 1.0f, 1.0f );
   float4 tmp;

   tmp = b4_in[pos];
   b4_out[pos] = tmp;
}

This resulted in a compilation error because of the indexed assignment to b4_out: b4_out[pos] = tmp.

Are assignments to scatter streams allowed in this fashion?  I.e., indexed assignments?

---jski

0 Likes

0 Likes

That's what happens in the wee hours of the night when you're experimenting, but it didn't clear up the compilation bug! I still get the same compile-time error. And if I comment out the assignment b4_out[pos] = tmp, it compiles just fine.

---jski

0 Likes

Micah, it doesn't do what I want to do because I need a bidirectional array in the kernel, which is not yet supported. However, you are correct that 1D scatter streams are supported; I should have caught that in the release notes earlier.

Apparently, according to Micah, 2D scatter assignments may not be working either. I think I actually might have seen this in the release notes.

Yes, this is included in the Mar-08 release notes:

Scatter
-------

Scatter to 1-dimensional targets is supported. The syntax is similar to gather
operations, in that the stream is bound using square brackets instead of angle
brackets and elements are accessed in an array-like fashion.

So it seems scatter is only supported on 1D streams, while gather is supported on multi-dimensional ones. How does this work with address translation and stream size limitations? Is support for 2D+ scatter streams planned? What about local arrays?
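For readers unsure of the distinction the release notes draw, here is a minimal CPU-side sketch in Python (a stand-in, not Brook+ code). The function names `gather` and `scatter` and the plain-list representation of streams are assumptions made only for illustration.

```python
def gather(src, indices):
    # Gather: each output position i READS from a computed address,
    # i.e. out[i] = src[indices[i]].
    return [src[j] for j in indices]


def scatter(values, indices, size, fill=0):
    # Scatter: each input i WRITES to a computed address,
    # i.e. out[indices[i]] = values[i].
    out = [fill] * size
    for v, j in zip(values, indices):
        out[j] = v
    return out


src = [10, 20, 30, 40]
indices = [3, 0, 2, 1]
print(gather(src, indices))      # [40, 10, 30, 20]
print(scatter(src, indices, 4))  # [20, 40, 30, 10]
```

In the gather case the write location is fixed and the read location is computed; in the scatter case it is the other way around, which is the operation the release notes restrict to 1-dimensional targets.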

0 Likes

ryta, I'm not sure why yours is not compiling, but assignment between float4s works fine. If you check the scatter example in the Brook+ SDK, the following example does exactly what you want to do, plus a little bit more. The thing that I think is different is that in the scatter example, the scatter stream is a 1D stream, whereas you have it specified as a 2D stream.

kernel void scatter(float4 a[][], float4 b<>, float width, out float4 c[])
{
   // Get the position in the stream of the current thread
   float idx = (indexof(c)).x;
   float2 apos = {idx % width, floor(idx / width) };

   // Write out to the scatter buffer
   c[idx] = a[apos] + b;
}
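
The index arithmetic in the kernel above can be emulated on the CPU. This is a minimal sketch in Python, not Brook+ code. It assumes a row-major 2D input where apos.x selects the column and apos.y the row, and it uses plain scalars in place of float4 values. The name `scatter_reference` is hypothetical.

```python
def scatter_reference(a, b, width):
    """CPU emulation of the kernel's index math: c[idx] = a[apos] + b[idx]."""
    c = [None] * len(b)
    for idx in range(len(b)):       # one iteration per kernel invocation
        x = idx % width             # apos.x: column within the 2D input
        y = idx // width            # apos.y: floor(idx / width), the row
        c[idx] = a[y][x] + b[idx]   # read from 2D position, write to 1D position idx
    return c


a = [[0, 1, 2], [10, 11, 12]]       # 2 rows x 3 columns
b = [100] * 6
print(scatter_reference(a, b, width=3))  # [100, 101, 102, 110, 111, 112]
```

Each 1D output index is unfolded into a (column, row) pair, so the 1D scatter target can still cover every element of a 2D input.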


0 Likes

Micah,

I added your code (listed below) to an existing project, simple_matmult, just to see if it compiled.

kernel void scatter(float4 a[][], float4 b<>, float width, out float4 c[])
{
   // Get the position in the stream of the current thread
   float idx = (indexof(c)).x;
   float2 apos = {idx % width, floor(idx / width) };

   // Write out to the scatter buffer
   c[idx] = a[apos] + b;
}

And got:

WARNING: ASSERT(GetResultSymbol().IsValid() + mDataTypeValue.IsValid() >= 1) failed
While processing <buffer>:66
In compiler at ResolveSymbols()[astdelayedlookup.cpp:139]
  *mName = c
Message: unknown symbol

ERROR: ASSERT(errorCount==0) failed
While processing <buffer>:115
In compiler at CompileShaderToStream()[astroot.cpp:157]
  errorCount = 1
Message: Unknown Symbols exist
Aborting...
Problem with compiling built_d/simple_matmult_simple_matmult.hlsl
mkdir -p built_d... 

---jski

0 Likes


Ok, this is going to sound a little silly, but I tinkered with the scatter example, and it seems that if you have multiple scatter kernels in the same file, the scatter target parameter has to be of the same name in all kernels (i.e. all have to be called "c[]"). Looks like a brcc bug. -- marcr
0 Likes

I can't compile any scatter kernel; even the scatter.br sample fails.
The error:
Argument to indexof not a stream

And when I comment out every indexof:
Output is not a stream: out float4 c[].
0 Likes

Originally posted by: eduardoschardong

I can't compile any scatter kernel; even the scatter.br sample fails.

The error:

Argument to indexof not a stream

And when I comment out every indexof:

Output is not a stream: out float4 c[].


Can you post your code? That seems odd; I don't really have a problem running their scatter sample. What hardware are you using? As far as I am aware, R600 does not support scatter; you must have HD38xx+ (or, of course, 9170 or 9250).
0 Likes

Originally posted by: ryta1203

Originally posted by: eduardoschardong
Can you post your code? That seems odd; I don't really have a problem running their scatter sample. What hardware are you using? As far as I am aware, R600 does not support scatter; you must have HD38xx+ (or, of course, 9170 or 9250).

The code is just the scatter.br...
I know I can't run it on the older GPU, but I expected it to compile and run in CPU mode, right?
0 Likes

I would think it should run either way (CPU or GPU). Can you post the code just in case? Is it the scatter sample unmodified? Maybe your environment is not set up properly; are you having problems running any other samples?
0 Likes

Originally posted by: ryta1203

I would think it should run either way (CPU or GPU). Can you post the code just in case? Is it the scatter sample unmodified? Maybe your environment is not set up properly; are you having problems running any other samples?


I checked the environment variables and found old values from the alphas in the user variables (the new ones were already in the system variables). I deleted them and now it works fine. Thank you for the help.
0 Likes

Are int kernel parameters not yet supported? Is this a planned implementation? I have problems with my kernels every time I try to use an int as a kernel parameter, even if I don't use the parameter at all in the kernel. Any ideas?

I don't see why this is a problem since all non-stream inputs are considered constant anyways.
0 Likes