As we know, clFinish() blocks until all previously queued commands in a command queue have completed, and clWaitForEvents() blocks until the commands identified by the given event objects have completed. That means either clFinish() or clWaitForEvents() can be used to synchronize on the host.
So why do we have two separate functions?
How does clFinish() differ from clWaitForEvents()? Does clFinish() do anything more than clWaitForEvents()?
To use clWaitForEvents you need the event objects for every command you want to wait on. So you can't use clWaitForEvents to synchronize with the host inside a function that doesn't have access to the events of all previously enqueued memory transfers and kernel launches. In that case you can still use clFinish, which only needs the command queue, to wait until all previously enqueued commands have finished.
Besides that, it can be more convenient to just use clFinish if you don't want to work out explicitly which events you need to wait for in order to achieve full synchronization.
clFinish will wait for the *last* command in the queue to finish, and therefore for everything before it. clWaitForEvents can wait on one in the middle: say you've enqueued 20 tasks and wait on the 10th. The device then doesn't sit idle waiting for you to give it more work, because the remaining tasks are already queued up behind the one you waited on. You can also wait for events from multiple queues at once, although clearly enqueuing the last task in one queue so that it depends on tasks in the other queues would achieve something similar.
I tend to teach people to get into the habit of always using events. If you want to use a low-level API like OpenCL, it's worth using it properly and setting up task graphs as early as possible.
If you are not thoroughly marking tasks with events and exposing those, then clFinish is a convenient shortcut.