With AMD's Naples server dies allegedly sporting 128 PCIe 3.0 lanes, it occurred to our group to revisit the question of the maximum number of GPUs one can leverage in such a system without having to jump through flaming hoops. Eight-channel DDR4 sounds like a solid foundation for decent main memory bandwidth. Depending on the use case and on P2P transfers between the GPUs, one might be content with
- x16 / GPU = 8 GPUs
- x8 / GPU = 16 GPUs
- x4 / GPU = 32 GPUs
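The lane budget above is simple integer division; a throwaway sketch (the helper name is made up for illustration, and it assumes all 128 lanes are actually available to GPUs, which real boards will not honor exactly):

```python
# How many GPUs fit in a 128-lane PCIe budget at a given link width.
# Ignores lanes consumed by NICs, NVMe, BMC, chipset, etc.
TOTAL_LANES = 128

def gpus_at_width(lanes_per_gpu, total_lanes=TOTAL_LANES):
    """Number of GPUs that fit when each gets lanes_per_gpu lanes."""
    return total_lanes // lanes_per_gpu

for width in (16, 8, 4):
    print(f"x{width}: {gpus_at_width(width)} GPUs")
# → x16: 8 GPUs, x8: 16 GPUs, x4: 32 GPUs
```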
Some configurations will require extenders such as these Magma extenders. Now, I recall that shoving that many GPUs into a single system is no small feat, because the BIOS wants to allocate MMIO address space (BARs) for every PCIe device within a 32-bit window, at several hundred MB per device. With Naples around the corner and it having such a ridiculous number of PCIe lanes:
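The 32-bit concern is easy to put into numbers. Assuming an illustrative 256 MiB framebuffer BAR per GPU (an assumption; actual BAR layouts vary by card and VBIOS, and other devices compete for the same window), the demand quickly exceeds the 4 GiB that a 32-bit address space can map at all:

```python
# Rough MMIO pressure estimate: aggregate GPU BAR demand vs. the 4 GiB
# reachable with 32-bit addressing. 256 MiB per GPU is illustrative only;
# real cards expose several BARs and the BIOS reserves space for other devices.
MIB = 1 << 20
GIB = 1 << 30
BAR_PER_GPU = 256 * MIB
ADDR_SPACE_32BIT = 4 * GIB

for gpus in (8, 16, 32):
    demand = gpus * BAR_PER_GPU
    fits = demand < ADDR_SPACE_32BIT  # ignores everything else in that window
    print(f"{gpus} GPUs: {demand // GIB} GiB of BARs, fits below 4 GiB: {fits}")
```

Even 16 GPUs at this BAR size saturate the 32-bit window on their own, which is why 64-bit BAR placement ("above-4G decoding") in the firmware matters for these configurations.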
- Does any part of the AMDGPU-PRO stack impose a limit on the maximum number of GPUs one can put in a system?
- Will Naples help with the BIOS issues, or is that strictly a matter for the motherboard vendor (extended memory decoding and such)?
- Is there sample code in the ROCm repos demonstrating P2P transfers (through any of the supported APIs)?