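The expression '((2*255+1)/1)*8 + (2*255+1)%1 + 7' evaluates to 4095, so your result of 4606 is incorrect. Step by step: 2*255+1 is 511, 511/1 is 511, and 511%1 is 0, which gives 511*8 + 0 + 7 = 4095. You would only get 4606 if the remainder term were 511 instead of 0.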
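If it helps to check the arithmetic, here is a minimal C sketch that evaluates the same expression with integer arithmetic. The function name eval_expression is made up for illustration and is not part of any library.

```c
/* Hypothetical helper (not a library function): evaluates
 * ((2*255+1)/1)*8 + (2*255+1)%1 + 7 one term at a time. */
static int eval_expression(void)
{
    int n = 2 * 255 + 1;      /* 511 */
    int quotient = n / 1;     /* integer division: 511 */
    int remainder = n % 1;    /* any integer modulo 1 is 0 */

    return quotient * 8 + remainder + 7;   /* 4088 + 0 + 7 */
}
```

Calling eval_expression() returns 4095. The remainder term contributes nothing because every integer is divisible by 1.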
The ability to dump the kernels is provided only for special-case debugging. The kernels are not meant to be consumed directly by users. If you are trying to compute an FFT with our libraries and need support, please use the library API functions to compute the transforms. The documentation explains how to use the library.
Yes, you are correct.