I know that VGPRs are used by vector instructions and SGPRs are used by scalar instructions. Also, a VGPR holds one value per work-item, while an SGPR holds a single value shared by the whole wavefront.
I want to know exactly how VGPRs and SGPRs differ (e.g. register size, how each is processed, and why the registers are split into two types), and how they compare to the registers in NVIDIA's CUDA.
I would be grateful if someone could answer this question.