From a search of the ROCm GitHub org, __ldcg is only implemented for half and half2, and even there it doesn't do anything special with the pointer. It's probably there just so that a certain piece of hipified code compiles.
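If you only need hipified code to compile for other types, a pass-through wrapper in the same spirit is enough. This is a minimal sketch, not ROCm's code: the name ldcg_compat and the COMPAT_FN macro are made up for illustration, and it applies no cache hint at all.

```cpp
// Hypothetical compatibility shim: same shape as a cache-hinted load,
// but it is just a plain dereference with no cache policy applied.
#if defined(__HIPCC__) || defined(__CUDACC__)
#define COMPAT_FN __host__ __device__ inline
#else
#define COMPAT_FN inline  // lets the sketch build with a plain host compiler
#endif

template <typename T>
COMPAT_FN T ldcg_compat(const T* ptr) {
    return *ptr;  // ordinary load; the compiler picks the default caching
}
```

Calling ldcg_compat(&x) returns the same value as *(&x), so it only helps with portability, not with performance.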
Apart from gfx12, I don't see any clang builtins for AMDGPU that look obviously related to loading data or cache policy, though there are some for cache invalidation.
If you check the code for composable_kernel, they use LLVM's buffer load and store intrinsics, which expose the GLC/DLC/SLC bits that control caching as described in your GPU's ISA docs. The intrinsics themselves are defined in IntrinsicsAMDGPU.td on LLVM's GitHub.
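To give a rough idea of how those bits get passed, here is a small sketch of building the cache-policy immediate that the raw buffer load/store intrinsics take as their last operand. The bit positions (GLC = bit 0, SLC = bit 1, DLC = bit 2 on gfx10+) are my reading of the comments in IntrinsicsAMDGPU.td for pre-gfx12 targets, so treat them as an assumption and check your LLVM version and ISA docs. The names CachePolicyBit and make_cache_policy are made up for illustration.

```cpp
#include <cstdint>

// Assumed layout of the cache-policy immediate (pre-gfx12 targets).
enum CachePolicyBit : std::int32_t {
    kGLC = 1 << 0,  // globally coherent
    kSLC = 1 << 1,  // system level coherent / streaming
    kDLC = 1 << 2,  // device level coherent, gfx10+ only
};

// Hypothetical helper: pack the requested bits into one immediate.
constexpr std::int32_t make_cache_policy(bool glc, bool slc, bool dlc = false) {
    return (glc ? kGLC : 0) | (slc ? kSLC : 0) | (dlc ? kDLC : 0);
}

static_assert(make_cache_policy(false, false) == 0, "default policy is 0");
static_assert(make_cache_policy(true, true) == 3, "GLC and SLC set");

// On the device side, composable_kernel-style code binds the intrinsic by
// name, roughly like this (check the CK source for the exact declaration):
//   __device__ int32_t raw_buffer_load_i32(int32x4_t rsrc, int32_t voffset,
//                                          int32_t soffset, int32_t cache_policy)
//       __asm("llvm.amdgcn.raw.buffer.load.i32");
// The value from make_cache_policy would go in the last argument, and it
// has to be a compile-time constant.
```

Since the operand is an immediate, keeping the helper constexpr means the policy can be chosen through a template parameter or constant expression rather than at run time.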