In the "Stream_Computing_User_Guide.pdf" document, I found the following description:
Wavefronts are composed of quads, which are groups of 2x2 threads in the domain. Quads are processed together. If there are non-active threads within a quad, the thread processors that would have been mapped to those threads are idle.
So a 1D domain, such as a 1xn or nx1 configuration, never gets full use of the device. Is my understanding correct?
As far as I know, that's correct. You should try to use a 2D domain.
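For a rough sense of the cost, here is a small Python sketch of the arithmetic. It assumes quads are aligned 2x2 tiles laid over the domain starting at the origin, which is my reading of the quoted passage and not something the guide spells out. The function name quad_utilization is made up for illustration, and the result is only the fraction of quad slots holding an active thread, not a measured performance figure.

```python
def quad_utilization(width, height):
    """Fraction of quad slots occupied by active threads.

    Assumption (not confirmed by the guide): quads are aligned 2x2 tiles
    covering the domain from the origin, so a partially covered tile
    still occupies all four thread processors.
    """
    if width <= 0 or height <= 0:
        raise ValueError("domain dimensions must be positive")
    quads_x = (width + 1) // 2   # tiles needed along x, rounded up
    quads_y = (height + 1) // 2  # tiles needed along y, rounded up
    slots = quads_x * quads_y * 4
    return (width * height) / slots


if __name__ == "__main__":
    print(quad_utilization(1, 64))   # 1D domain: half of each quad is idle
    print(quad_utilization(64, 64))  # 2D domain with even sides: no idle slots
```

Under that assumption a 1xn or nx1 domain fills at most half of each quad, while a 2D domain with even dimensions fills every slot.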