Hi Michael,
First, I assume that the write in the first step is a normal cacheable write. If it is, then it is not weakly ordered, but do read on! If it is a non-cacheable write, then yes, you will have to use an SFENCE instruction or some other synchronization scheme.
To be safe, you could add my steps 1.1 and 2.1 below. This will ensure that if the flag has been written, then the variable has also been written, again assuming all are cacheable writes.
For example:
1. CPU A writes some global variable (and the write happens to stay in the write buffer for a long time).
1.1 CPU A writes a global flag.
2. CPU A sends an IPI to CPU B.
2.1 CPU B reads the global flag to verify it is set. If it is not, insert more code here [keep reading the flag, allow for a failure, etc.]
3. CPU B's IPI ISR reads the global variable.
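The steps above could be sketched roughly as follows. This is only a user-space illustration, assuming C11 atomics and POSIX threads stand in for the two CPUs. An IPI cannot be sent from user space, so the reader thread simply polls the flag as in step 2.1. The names shared_value, data_ready, and run_demo are made up for this example. The release/acquire pair mainly keeps the compiler from reordering the two writes; on x86, cacheable stores are already not reordered with older stores.

```c
#include <pthread.h>
#include <stdatomic.h>

static int shared_value;        /* step 1: the global variable (plain data) */
static atomic_int data_ready;   /* step 1.1: the global flag */

/* Stands in for CPU A: write the variable first, then the flag. */
static void *writer_cpu_a(void *arg)
{
    (void)arg;
    shared_value = 42;                                            /* step 1   */
    atomic_store_explicit(&data_ready, 1, memory_order_release);  /* step 1.1 */
    /* Step 2 (sending the IPI) would go here in real kernel code. */
    return NULL;
}

/* Stands in for CPU B's IPI ISR: check the flag, then read the variable. */
static void *reader_cpu_b(void *arg)
{
    int *out = arg;
    /* Step 2.1: keep reading the flag until it is set. */
    while (!atomic_load_explicit(&data_ready, memory_order_acquire))
        ;
    *out = shared_value;                                          /* step 3   */
    return NULL;
}

/* Runs both threads once; returns the value the reader saw, or -1 on error. */
int run_demo(void)
{
    pthread_t a, b;
    int seen = -1;

    shared_value = 0;
    atomic_store(&data_ready, 0);

    if (pthread_create(&b, NULL, reader_cpu_b, &seen) != 0)
        return -1;
    if (pthread_create(&a, NULL, writer_cpu_a, NULL) != 0) {
        /* Release the reader so it can exit, then report the failure. */
        atomic_store_explicit(&data_ready, 1, memory_order_release);
        pthread_join(b, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return seen;
}
```

Because the reader only touches shared_value after it has seen the flag set, it reads the value written in step 1 and never the stale one. A real ISR would bound the polling loop or report a failure instead of spinning forever.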
Regards,
Randy [AMD]