It turns out that the problem was in my kernel.
I was passing four different OpenCL images as kernel arguments and choosing one of them at runtime based on the global work-item ID.
The chosen image was stored in a variable of image type declared inside the kernel, and the AMD compiler did not accept that.
The solution was to call the image functions on the original kernel arguments directly instead of copying one of them into a local variable.
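
For anyone hitting the same thing, here is a minimal sketch of the working pattern. It is not my actual kernel: the kernel name `pick_image`, the output buffer `out`, the sampler settings, and the selection rule `x & 3` are all made up for illustration. As far as I can tell, the compiler's behaviour matches the OpenCL C restriction that image types may only be used as function arguments and not to declare variables, so something like `image2d_t img = img0;` inside the kernel is not expected to compile.

```c
// Hypothetical sampler: unnormalized coordinates, nearest-neighbour reads.
__constant sampler_t smp = CLK_NORMALIZED_COORDS_FALSE |
                           CLK_ADDRESS_CLAMP_TO_EDGE |
                           CLK_FILTER_NEAREST;

// Sketch only: names and the selection rule are assumptions, not the original code.
__kernel void pick_image(__read_only image2d_t img0,
                         __read_only image2d_t img1,
                         __read_only image2d_t img2,
                         __read_only image2d_t img3,
                         __global float4 *out)
{
    const int x = (int)get_global_id(0);
    const int y = (int)get_global_id(1);
    const int width = (int)get_global_size(0);
    const int2 coord = (int2)(x, y);

    float4 px;

    // Branch on the work-item ID and read from the kernel argument itself.
    // No image2d_t variable is declared anywhere in the kernel body.
    switch (x & 3) {            // hypothetical rule for choosing the image
    case 0:  px = read_imagef(img0, smp, coord); break;
    case 1:  px = read_imagef(img1, smp, coord); break;
    case 2:  px = read_imagef(img2, smp, coord); break;
    default: px = read_imagef(img3, smp, coord); break;
    }

    out[y * width + x] = px;
}
```

The point is that the branch selects which read to perform, not which image handle to keep. If the per-image code is long, moving it into a helper function that takes the `image2d_t` as a parameter should also work, since passing an image as a function argument is allowed.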