Currently CodeXL supports only two levels of profiling: 1) API timeline trace and 2) kernel-level performance counters. Line-by-line profiling is not supported, but in most cases these two methods provide much of the information needed to analyze a performance bottleneck.
In static analyzer mode, CodeXL lets you navigate through the ISA code to see the estimated cost of each instruction in clock cycles. This is also a good way to analyze the kernel code in detail, though it requires some knowledge of the ISA.
Thanks.