This is a hard-core engineering question:
Most dual-socket EPYC server vendors configure the inter-CPU Infinity Fabric (xGMI) link with ~48 lanes' worth of SerDes (Supermicro, for example). Some run only 48 lanes between sockets so that 80 lanes from each CPU remain free, enough for 10 x16 PCIe slots. A few additional lanes go to the BMC and other low-speed on-board I/O (Ethernet, VGA, IPMI, ...).
In some cases they allow one slot's 16 lanes to be repurposed as four x4 NVMe connections. So it is either 9 GPU slots plus 4 NVMe drives, or 10 GPU slots and no NVMe.
I was told there is a way to configure the Infinity Fabric to allow both 10 GPUs and 4 NVMe. The only way I can imagine this working is if the fabric can be dropped to 32 lanes for CPU <-> CPU communication, with the freed lanes internally diverted, on a separate set of traces, to four x4 NVMe ports. If the same physical traces were shared between xGMI/NUMA traffic and NVMe, that would create signal-integrity problems unless there were external switchable bus isolation, which looks unlikely.
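To make the trade-off concrete, here is a minimal lane-budget sanity check. The numbers are assumptions for illustration, not vendor data: 128 multiplexed SerDes lanes per EPYC socket, 16 lanes consumed per xGMI link, x16 per GPU slot, x4 per NVMe drive.

```python
# Lane-budget sanity check for a hypothetical dual-socket EPYC board.
# Assumed (not vendor-confirmed): 128 SerDes lanes per socket, each xGMI
# link takes 16 lanes, each GPU slot is x16, each NVMe drive is x4.
LANES_PER_SOCKET = 128
SOCKETS = 2

def free_pcie_lanes(xgmi_links):
    """Lanes left for PCIe after reserving xGMI lanes on both sockets."""
    xgmi_lanes_per_socket = xgmi_links * 16
    return SOCKETS * (LANES_PER_SOCKET - xgmi_lanes_per_socket)

def fits(xgmi_links, gpu_slots, nvme_drives):
    """Does the requested slot/drive mix fit in the remaining lane budget?"""
    needed = gpu_slots * 16 + nvme_drives * 4
    return needed <= free_pcie_lanes(xgmi_links)

print(fits(3, 10, 0))  # 160 <= 160 -> True:  the common 10-GPU config
print(fits(3, 9, 4))   # 160 <= 160 -> True:  9 GPUs + 4 NVMe
print(fits(3, 10, 4))  # 176 <= 160 -> False: why vendors offer only 10+0 or 9+4
print(fits(2, 10, 4))  # 176 <= 192 -> True:  only if xGMI can drop to 32 lanes
```

The arithmetic shows why 10+4 is impossible at 48 xGMI lanes and only becomes possible if the fabric really can be narrowed to 32 lanes, which is exactly the open question.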
Does anyone here, or at AMD if listening, have specific guidance on this? I want to buy some servers, but I am having trouble getting a definitive answer. There are no servers with an extra slot for an SSD controller to provide the 4 NVMe drives alongside 10 GPUs; the CPU fabric would have to do that controller's job. Almost all VAR motherboards do NOT do this, and the VARs' marketing literature mentions only 10+0 or 9+4 as possibilities, never 10+4. Only one vendor claims it can be done.
Should I buy from them or run for an exit?