Block arrangement in Compute Shader

Has anyone managed to use anything other than the naive approach (a 64x1 block size) in a compute shader?

I can't get any other arrangement working 100%... I'm still a little confused about the terminology (blocks, groups, rows). Can someone explain how these relate?
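To make the terminology concrete, here is a minimal CPU-side sketch in Python of how thread groups are usually indexed. It assumes the Direct3D/OpenGL convention that a thread's global ID is its group ID times the group size plus its local ID within the group ("block" being the CUDA word for what those APIs call a thread group or work group). The helpers `groups_needed`, `thread_ids`, and `covered` are hypothetical names, not part of any graphics API, and the 64x64 grid is just an example size.

```python
def groups_needed(width, height, group_size):
    """Number of groups to dispatch in x and y so every cell is covered."""
    gsx, gsy = group_size
    # Integer ceiling division: round up so partial groups are still launched.
    return ((width + gsx - 1) // gsx, (height + gsy - 1) // gsy)


def thread_ids(group_size, group_count):
    """Yield (group_id, local_id, global_id) for every thread of a 2D dispatch."""
    gsx, gsy = group_size
    gcx, gcy = group_count
    for gy in range(gcy):              # group ID, y
        for gx in range(gcx):          # group ID, x
            for ly in range(gsy):      # local ID inside the group, y
                for lx in range(gsx):  # local ID inside the group, x
                    # Assumed convention: global = group * group_size + local
                    global_id = (gx * gsx + lx, gy * gsy + ly)
                    yield (gx, gy), (lx, ly), global_id


def covered(width, height, group_size):
    """Set of grid cells touched, with the bounds check a shader would need."""
    count = groups_needed(width, height, group_size)
    return {
        g
        for _, _, g in thread_ids(group_size, count)
        if g[0] < width and g[1] < height  # skip threads past the edge
    }
```

In this sketch, a 64x1 group over a 64x64 grid needs 1x64 groups, while an 8x8 group needs 8x8 groups. Both visit the same cells, so the dispatch counts have to change together with the group shape.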