I am creating slides for a university course and would like to compare instruction latencies on CPUs and GPUs. The OpenCL Optimization Guide has only a very short table of VLIW instruction latencies. Is there anywhere I could find a comprehensive table of instruction latencies for VLIW4, VLIW5, GCN 1.0, etc. on various hardware? The same goes for Bulldozer-derived CPUs. Intel has very nice documentation on instruction latencies for their hardware, but I have failed to find the counterpart from AMD.
Does anyone have a clue?