As far as I know, the HIP compiler generates native GCN ISA that executes directly on AMD GPUs, so there should not be any performance penalty for using HIP.
Here is a great discussion which explains it in more detail: OpenCL support · Issue #90 · ROCm-Developer-Tools/HIP · GitHub
Are any of the other languages more or less performant?
By default, the hipcc compiler invokes the hcc compiler (Home · RadeonOpenCompute/hcc Wiki · GitHub). The HC C++ API, which is the default C++ compute API for the hcc compiler, may be an alternative, though it lacks CUDA portability.
By the way, the ROCm GitHub site is the best place to report issues or post questions related to ROCm. For any question related to HIP, please post here: Issues · ROCm-Developer-Tools/HIP · GitHub