Hi,
We have a PCIe device that works correctly in a Rome-based server, but not in otherwise similar Milan-based systems.
Each system uses a Supermicro H12DSG-O-CPU motherboard with 1 TB of memory.
The working system has an AMD EPYC 7302 (16-Core) Processor. The two systems in which the card does not work have AMD EPYC Milan processors; one is a 7413 (24-Core), the other a 7313 (16-Core).
The PCIe card in question is a BittWare IA-840F FPGA-based accelerator. The problem has been observed with two different cards of this type.
The OS is Linux. The same result has been observed with CentOS 7.9, Ubuntu 18.04.4, and Ubuntu 20.04.4. The CentOS kernel is version 3.10.0-1160.59.1.el7.x86_64, while Ubuntu uses 5.4.0-107-generic.
The specific failure mode is that the device responds ONLY to configuration cycles, not to normal memory-mapped accesses to its BARs. The OS enumerates and configures the device as expected, but all non-config accesses time out.
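For anyone trying to reproduce this, one quick check is whether memory decoding is actually enabled on the card. The following is only a minimal sketch: it decodes the Command register (offset 0x04 of the standard PCI configuration header) from the raw bytes that Linux exposes in sysfs. The function names and the device address in the comment are placeholders for illustration, not values from our systems.

```python
import struct
from pathlib import Path


def decode_command(config: bytes) -> dict:
    """Decode the PCI Command register from raw configuration-space bytes."""
    if len(config) < 6:
        raise ValueError("need at least the first 6 bytes of config space")
    # The Command register is a 16-bit little-endian field at offset 0x04.
    (command,) = struct.unpack_from("<H", config, 0x04)
    return {
        "io_space": bool(command & 0x1),      # bit 0: I/O Space Enable
        "memory_space": bool(command & 0x2),  # bit 1: Memory Space Enable
        "bus_master": bool(command & 0x4),    # bit 2: Bus Master Enable
    }


def read_config(bdf: str) -> bytes:
    """Read config space for a device such as '0000:41:00.0' (placeholder)."""
    return Path(f"/sys/bus/pci/devices/{bdf}/config").read_bytes()


# Example (hypothetical address; substitute the one reported by lspci):
# print(decode_command(read_config("0000:41:00.0")))
```

If memory_space comes back False, BAR accesses would not be expected to be claimed by the device at all, which would be a different problem from accesses being issued and then timing out.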
We have tried different permutations of OS, card, and host, and found that the problem follows the system rather than the software or the card. Are there any known differences between the PCIe implementations in Rome and Milan that could explain this?