I'm using OpenMP to parallelize my program. Now I want to exploit the vector units of my CPU (SSE instructions) in certain spots within the OpenMP parallel region. I know I can do this with compiler intrinsics, but I want to keep my code as portable as possible, so my idea is to use OpenCL to vectorize the code. Of course, OpenCL should not create additional threads within the OpenMP parallel region. Is that possible?
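
To make the situation concrete, here is a minimal sketch of the kind of code I mean. The function name `scale_add_rows` and the computation itself are hypothetical, chosen only for illustration. The outer loop is distributed over OpenMP threads, and the inner loop is the spot I would like to run with SSE on the calling thread, without any extra threads being spawned.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel (illustration only): y = a * x + y, row by row,
 * for two row-major arrays of size rows * cols. */
void scale_add_rows(float a, const float *x, float *y,
                    size_t rows, size_t cols)
{
    assert(x != NULL && y != NULL);

    /* Outer loop: the rows are divided among the OpenMP threads.
     * Without OpenMP support the pragma is ignored and the loop runs serially. */
    #pragma omp parallel for
    for (long r = 0; r < (long)rows; ++r) {
        const float *xr = x + (size_t)r * cols;
        float *yr = y + (size_t)r * cols;

        /* Inner loop: this is the part that should be vectorized (SSE)
         * inside the current thread, with no additional threads created. */
        for (size_t c = 0; c < cols; ++c) {
            yr[c] = a * xr[c] + yr[c];
        }
    }
}
```

The question is whether the inner loop can be handed to OpenCL in a portable way while the thread count stays under OpenMP's control.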