6 Replies Latest reply on Mar 7, 2011 5:15 AM by nareshsankapelly

    AMD APP FFT - generated code

    FrodoTheGiant

      Hi,

      I'd like to use AMD's APP FFT without all the overhead of the library (lib/dll). So I thought I would simply take the generated OpenCL code from the demo program, which can be extracted to a file.

      Question: What is the purpose of the cb_t structure, and in particular of the cb[1].u field in this line?

      uint last_row = 128 * cb[1].u;

      How should I initialize the cb structure? The "f" and "i" members do not seem to be used at all.

      ...
      typedef union { float f; uint u; int i; } cb_t;

      __attribute__((reqd_work_group_size(64,1,1)))
      __kernel void fft_fwd(
          __constant cb_t *cb __attribute__((max_constant_size(32))),
          __global float2 *gcomplx_in,
          __global float2 *gcomplx_out)
      {
          __local float lds[1024];
          __global float2 * gpc;
          uint tbase;
          uint i0, j0;
          uint gid = get_global_id(2);
          gid = (gid * get_global_size(1)) + get_global_id(1);
          gid = (gid * get_global_size(0)) + get_global_id(0);
          uint me = gid & 15U;        // work item within vector
          uint row_index = gid >> 4;  // vector index
          uint last_row = 128 * cb[1].u;
          uint offset = min (row_index, last_row-1);
          offset = ((offset >> 7) << 15) + ((offset & 127) << 0);
          gcomplx_in = gcomplx_in + offset;
          offset = ((row_index >> 7) << 15) + ((row_index & 127) << 8);
          gcomplx_out = gcomplx_out + offset;
          ...
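One way to see what cb[1].u does in the excerpt is to replay the index arithmetic on the host. The following is a minimal sketch in plain C, not AMD's host code. The helper names init_cb, input_offset and output_offset are hypothetical. It assumes that the 32 in max_constant_size(32) is a byte count (so eight 4-byte slots), and that cb[1].u counts blocks of 128 rows, which is only inferred from the clamp against last_row. What the slots other than cb[1] hold is not visible in the excerpt, so the sketch just zeroes them.

```c
#include <assert.h>
#include <stdint.h>

/* Same layout as the kernel's cb_t: one 32-bit slot readable as float, uint or int. */
typedef union { float f; uint32_t u; int32_t i; } cb_t;

/* Hypothetical helper: fill the buffer that would be passed as the kernel's cb argument.
   Assumption: 32 bytes = 8 slots; only cb[1].u is read in the excerpt. */
static void init_cb(cb_t cb[8], uint32_t blocks)
{
    for (int k = 0; k < 8; ++k)
        cb[k].u = 0;          /* unknown slots are zeroed (assumption) */
    cb[1].u = blocks;         /* assumed meaning: number of 128-row blocks */
}

/* Replays the kernel's input offset computation for a flattened global id. */
static uint32_t input_offset(const cb_t *cb, uint32_t gid)
{
    uint32_t row_index = gid >> 4;            /* 16 work items per row */
    uint32_t last_row  = 128 * cb[1].u;
    assert(last_row > 0);                     /* last_row - 1 would wrap for 0 */
    uint32_t offset = row_index < last_row - 1 ? row_index : last_row - 1;
    return ((offset >> 7) << 15) + ((offset & 127) << 0);
}

/* Replays the kernel's output offset computation (not clamped in the excerpt). */
static uint32_t output_offset(uint32_t gid)
{
    uint32_t row_index = gid >> 4;
    return ((row_index >> 7) << 15) + ((row_index & 127) << 8);
}
```

With cb[1].u set to 1, a work item with gid 2048 (row 128) is clamped to row 127 and reads from input offset 127, while its output offset is still 32768. With cb[1].u set to 2, the same work item reads from input offset 32768. This suggests the field only bounds which input rows are read, but that reading is a guess from the excerpt, not something the generated code documents.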