Your observation is correct. By default, memset uses only one CPU worker, which is why you are seeing reduced bandwidth.
To get a result closer to the PCIe bandwidth, use more than one CPU worker. You can use at most as many CPU workers as there are CPU cores in your system.
The command to be used:
"bufferbandwidth -if 6 -nwk <number of CPU workers>"
Please try this and let us know whether you get the expected PCIe bandwidth for memset.
I've tried your command, but, unfortunately ... no significant change: memset ~ 6GBPS.
P. S. Actually, I thought, Haswell i7 is fast enough to utilize full PCIE bandwidth with just one thread, no?
My apologies for the delay.
@no significant change: memset ~ 6GBPS.
Can you let me know what number you passed as the argument for nwk in the command? (In theory it can be at most 6 or 8, depending on the Haswell i7 model used.)
@I thought, Haswell i7 is fast enough to utilize full PCIE bandwidth with just one thread !!
To utilize the full PCIe bandwidth, you must use all available CPU cores, so that all threads transfer data simultaneously. With only one thread, the data has to be transferred sequentially, which results in lower bandwidth.
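
To illustrate what "more than one CPU worker" means for a memset, here is a minimal sketch that splits one fill across several threads. It is not the code of the bufferbandwidth sample: `parallel_memset` is a hypothetical helper, and plain host memory stands in for whatever buffer the sample actually writes to. Whether this raises throughput on a given system is something you would have to measure.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <thread>
#include <vector>

// Hypothetical helper (assumption, not part of the bufferbandwidth sample):
// fills `size` bytes at `dst` with `value`, split across `workers` threads.
void parallel_memset(void* dst, int value, std::size_t size, unsigned workers) {
    assert(dst != nullptr || size == 0);
    if (workers == 0) workers = 1;  // always use at least one worker

    unsigned char* base = static_cast<unsigned char*>(dst);
    // Bytes per worker, rounded up so the whole range is covered.
    const std::size_t chunk = (size + workers - 1) / workers;

    std::vector<std::thread> pool;
    for (unsigned i = 0; i < workers; ++i) {
        const std::size_t begin = std::min(size, static_cast<std::size_t>(i) * chunk);
        const std::size_t end = std::min(size, begin + chunk);
        if (begin == end) break;  // nothing left for the remaining workers
        // Each worker fills its own disjoint slice, so no locking is needed.
        pool.emplace_back([=] { std::memset(base + begin, value, end - begin); });
    }
    for (std::thread& t : pool) t.join();
}
```

A call such as `parallel_memset(ptr, 0, bytes, std::thread::hardware_concurrency())` would use one worker per hardware thread reported by the system, which roughly corresponds to passing the core count to -nwk.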