Yes, that's roughly right. If you don't specify the work-group size yourself (by passing NULL as the local work size), you don't know what the runtime will select, so you may not know how much local memory to allocate for each work-group.
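To make the local memory point concrete, here is a minimal sketch in C. It assumes a kernel whose scratch buffer holds one element per work-item; `local_scratch_bytes` is a hypothetical helper, not part of OpenCL, and the argument index and `kernel` handle in the comment are only illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes of __local scratch one work-group needs when every work-item
 * owns `bytes_per_item` bytes. Hypothetical helper for illustration. */
size_t local_scratch_bytes(size_t local_size, size_t bytes_per_item)
{
    assert(local_size > 0);
    return local_size * bytes_per_item;
}

/* Assumed host-side use, where argument 1 of the kernel is declared
 * as `__local float *scratch` (an assumption for this sketch):
 *
 *   size_t local_size = 64;
 *   clSetKernelArg(kernel, 1,
 *                  local_scratch_bytes(local_size, sizeof(float)),
 *                  NULL);   // NULL value = allocate local memory
 */
```

The size can only be computed on the host if `local_size` is known there, which is why leaving the choice to the runtime gets awkward.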
On the other hand, in my experience 64 is almost always the right answer on AMD hardware. When the work-group is the same size as the hardware thread (a wavefront, 64 work-items on AMD GPUs), you remove synchronization overhead within the group and generally gain performance.
For all practical purposes, OpenCL programmers always specify the work-group size when enqueuing a kernel, just as you specify the block size in CUDA.
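In practice, picking the size yourself usually looks something like the sketch below. In OpenCL 1.x the global size has to be a multiple of the local size, so you round it up first. `round_up_global` is a hypothetical helper, and `queue`, `kernel` and `n` in the comment are assumed to exist already.

```c
#include <assert.h>
#include <stddef.h>

/* Round the number of work-items up to the next multiple of the
 * work-group size. Hypothetical helper for illustration. */
size_t round_up_global(size_t n_items, size_t local_size)
{
    assert(local_size > 0);
    size_t groups = (n_items + local_size - 1) / local_size;
    return groups * local_size;
}

/* Assumed host-side use for a 1-D kernel over n elements:
 *
 *   size_t local  = 64;                       // one AMD wavefront
 *   size_t global = round_up_global(n, local);
 *   clEnqueueNDRangeKernel(queue, kernel, 1, NULL,
 *                          &global, &local, 0, NULL, NULL);
 *
 * The padding work-items must be masked off in the kernel, e.g.
 *   if (get_global_id(0) >= n) return;
 */
```

For example, 1000 elements with a work-group size of 64 gives a global size of 1024, so the last 24 work-items do nothing.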