What for? Is it really necessary? In any case, I don't think you can influence such a thing, at least if you're working under Windows. But if you're working in your own OS, you just need to make sure that the number of CALLs equals the number of RETs.
OK, then there must exist some sequence of instructions that resets those CALL/RET counters, but I think it will be extremely hard to find one. So experiment.
Indeed, we (I) don't know exactly how this optimisation operates. I've given the question some more thought, and ISTM the implementation has to be such that my original question makes little sense and the 'balance' has to be automatic: i.e., the processor must keep track of the conventional stack versus the internal call stack, and invalidate the call stack if it is out of sync when executing a RET instruction. It must also take care of memory modification (self-modifying code) IF the internal mechanism keeps track of the instructions at the return locations, whether as raw bytes or predecoded (as, I think, is done in a certain competitor's processors).
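To make the idea concrete, here is a toy software model of the behaviour guessed at above. It is only a sketch under stated assumptions, not a description of what AMD's hardware actually does: the class name `ReturnStackModel`, the default depth of 12 entries, and the "flush everything on a mismatch" policy are all hypothetical choices made for illustration.

```python
class ReturnStackModel:
    """Toy model of an internal call stack (hypothetical, not AMD's design)."""

    def __init__(self, depth=12):
        # depth is an arbitrary assumption for this sketch.
        self.depth = depth
        self.entries = []   # predicted return addresses, newest last
        self.hits = 0
        self.misses = 0

    def call(self, return_address):
        """A CALL records where the matching RET is expected to land."""
        if len(self.entries) == self.depth:
            # Buffer full: the oldest prediction is silently lost.
            self.entries.pop(0)
        self.entries.append(return_address)

    def ret(self, actual_address):
        """A RET compares the prediction with the address really popped
        from the conventional (memory) stack."""
        predicted = self.entries.pop() if self.entries else None
        if predicted == actual_address:
            self.hits += 1
            return True
        # Out of sync: invalidate the whole internal stack, so the
        # "balance" repairs itself without any explicit reset.
        self.entries.clear()
        self.misses += 1
        return False


model = ReturnStackModel()
model.call(0x1000)
model.call(0x2000)
model.ret(0x2000)   # prediction matches the memory stack
model.ret(0x1234)   # program altered its stack: mismatch, model is flushed
```

In this model an unbalanced CALL/RET sequence only costs a missed prediction, never correctness, which is why an explicit "reset" would be unnecessary.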
Whatever the details, it now seems likely my initial concern about this feature was moot if not totally absurd ;=)
Thank you for the open-minded discussion. Kind regards,
Well, I think it must do more or less as I've suggested in order for the optimisation to preserve correctness while sparing unneeded accesses to main memory (or even cache). And like you say, it means there is no need, and probably no way either, to "reset the balance" as originally worded: it must be auto-adjusting, else it would be quite useless. Anyway, it would certainly be interesting to see an in-depth analysis of this and other internal gears. There once was a book written by insiders, "the anatomy of a microprocessor" (title possibly not exact, this is from my tired memory), that went into the details of the workings of the K6-2 (and 3). Unfortunately I only briefly had access to that excellent book, years ago. It sure would be great to have an updated book describing the 'anatomy' of Athlons and successors; was such a book ever published, or is one in press?
Originally posted by: indi123 Is it really necessary?
AMD doesn't have any books planned that would expose the inner workings of the latest AMD processors. The next best thing to a book like that is these articles:
There are also courses that you can take from MindShare that will give you good details of the architecture.
Thank you, Stroia... but those references, unlike the book I alluded to, are about architecture rather than micro-architecture which is quite another affair...
BTW I found the book's reference in my old notes: Bruce Shriver and Bennett Smith, "The Anatomy of a High-Performance Microprocessor - A Systems Perspective". Highly recommended. It was written based on the K6-3D microprocessor (aka K6-II & K6-3); we'd need an update based on Athlon 32/64.
While I'll concede, with regret, that AMD is not obliged to disclose the microarchitectural details of its new designs, OTOH it must (or should) publish programming (architectural) information allowing us programmers to make use of AMD's extended debugging features; otherwise the features are quite useless and unduly take up silicon real estate. I've opened another thread to ask about debg_ctrl_MSR_2, please refer to that thread. Would you not provide an application note on the use of these features?
Originally posted by: Czerno Thank you, Stroia... but those references, unlike the book I alluded to, are about architecture rather than micro-architecture which is quite another affair...
Did you check the MindShare course? I think that is the closest thing we have to what you are looking for.
I will pass on the request for an application note on these features, but please be aware that we are very resource constrained, as you can imagine. I cannot promise that we will be able to accommodate your request.
Please keep it up with you.
Thnx
BBB