In recent Intel ISA documents the lfence instruction has been defined as serializing the instruction stream (preventing out-of-order execution across it). In particular, the description of the instruction includes this line:
Specifically, LFENCE does not execute until all prior instructions have completed locally, and no later instruction begins execution until LFENCE completes.
Note that this applies to all instructions, not just memory load instructions, making lfence more than just a memory ordering fence.
Although this now appears in the ISA documentation, it isn't clear if it is "architectural", i.e., to be obeyed by all x86 implementations, or if it is Intel specific. In particular, do AMD processors also treat lfence as serializing the instruction stream?