It says that AMD has added a new asynchronous copy function, but that the user can't set cl_command_queue_properties to CL_QUEUE_PROFILING_ENABLE. If that is the case, how can I profile the events and get their costs?