I don't know exactly what you want to do here, but if you just want each work-item to access one element, without using local memory in any particular way, then what you want is a 2D global NDRange. What you seem to have done is create a 1D range of 200 work-items in total, all in a single work-group of size 200. What you should do instead is create a 200x200 2D global NDRange. The local size is then a tuning choice made for performance. Since 200 isn't a power of two, the result will probably be sub-optimal, but you could use a local size of, say, 10x10, which divides 200 evenly in both dimensions.
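So in the kernel you'd have something like:
uint gidx = get_global_id(0);     // column (x) of this work-item
uint gidy = get_global_id(1);     // row (y) of this work-item
uint range = get_global_size(0);  // row width: global size in dimension 0
uint index = gidy * range + gidx; // row-major flat index into the buffer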
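If it helps to convince yourself that this indexing touches every element exactly once, here is a rough CPU-side sketch in plain C that emulates the 200x200 NDRange with two loops, one iteration per work-item. The names `flat_index`, `count_covered`, `work_group_count` and the size constants are made up for illustration and are not part of OpenCL. On a real host, the same sizes would be passed as the global and local work sizes to `clEnqueueNDRangeKernel` with a work dimension of 2.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes matching the 200x200 global / 10x10 local example. */
enum { GLOBAL_X = 200, GLOBAL_Y = 200, LOCAL_X = 10, LOCAL_Y = 10 };

/* Same arithmetic as the kernel: row-major flat index of work-item
   (gidx, gidy), where range stands in for get_global_size(0). */
size_t flat_index(size_t gidx, size_t gidy, size_t range)
{
    assert(gidx < range); /* gidx stands in for get_global_id(0) */
    return gidy * range + gidx;
}

/* Emulate the 2D NDRange on the CPU, one loop iteration per work-item.
   Returns the number of elements touched, or 0 if any element is hit twice. */
size_t count_covered(void)
{
    static unsigned char seen[GLOBAL_X * GLOBAL_Y];
    size_t covered = 0;

    for (size_t i = 0; i < sizeof seen; ++i)
        seen[i] = 0;

    for (size_t gidy = 0; gidy < GLOBAL_Y; ++gidy) {
        for (size_t gidx = 0; gidx < GLOBAL_X; ++gidx) {
            size_t index = flat_index(gidx, gidy, GLOBAL_X);
            if (seen[index])
                return 0; /* two work-items mapped to the same element */
            seen[index] = 1;
            ++covered;
        }
    }
    return covered;
}

/* Number of work-groups implied by these global and local sizes. */
size_t work_group_count(void)
{
    return (size_t)(GLOBAL_X / LOCAL_X) * (size_t)(GLOBAL_Y / LOCAL_Y);
}
```

Calling `count_covered()` should give 40000 (200 * 200) and `work_group_count()` should give 400 (20 * 20), so with a 10x10 local size you get 400 work-groups of 100 work-items each instead of one work-group of 200.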