Czerno
Journeyman III

how do you properly reset the balance of call/ret stack ?

call return stack balancing

Hi! I hope this is the right forum for my question; if not, my apologies, and feel free to move it to the right one. For several generations, as far as I know, AMD x86 processors have maintained an internal cache of return addresses and code, or maybe two such caches, one for "near" and one for "far" calls. My question: since the cache(s) can get out of sync (when call/ret do not come in balanced pairs), what is the recommended, most economical way to reset the balance, assuming nothing about the present state? I'd like to hear both of a solution that works across all x86 implementations (including the competitors') and of an optimal solution specific to AMD processors, if one exists, perhaps using specific MSRs.
10 Replies
avk
Adept III

What for? Is it really necessary? I think that in any case you won't be able to influence such a thing, at least if you're working in Windows. But if you're working in your own OS, then you just need to make sure that the number of CALLs equals the number of RETs.

Czerno
Journeyman III

> What for? Is it really necessary?

Why, of course it is necessary, for performance reasons.

> I think that in any case you won't be able to influence such a thing

Did you think deeply? Can't we reset the balance whenever we need to, simply by making nested call/ret pairs at a depth sufficient to fill the internal stack? But this does not look like an efficient process, does it? What does AMD say? There has to be a good way to reset (empty) the call stack.

> But if you're working in your own OS, then you just need to make sure that the number of CALLs equals the number of RETs.

Ridiculous: whether it is your own OS or not, unless you control ALL programs allowed to run, you can't ensure the stack will stay balanced! Even if all programs respected the rule of nesting calls/rets one for one, what happens when a program crashes for some reason or is interrupted in the middle of such a sequence? Furthermore, programming sequences that do not balance rets to calls are sometimes useful or even unavoidable. I'm sure most processors are, at any moment, in an unbalanced state in this respect UNLESS special programming measures are taken. It is about those programming measures that I am enquiring ;=)
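
For illustration, the nested call/ret idea mentioned in this post can be sketched with a toy software model. This is only a sketch under stated assumptions: the `ReturnStack` class, its circular-buffer layout, the saturating valid-entry counter and the depth of 4 are all hypothetical and are not taken from any AMD documentation; a real return-address predictor may behave differently.

```python
class ReturnStack:
    """Toy fixed-depth return-address predictor (hypothetical model)."""

    def __init__(self, depth=4):       # depth is an assumed value
        self.depth = depth
        self.slots = [None] * depth    # circular buffer of predicted addresses
        self.top = 0                   # index of the next free slot
        self.count = 0                 # valid entries, saturates at depth

    def on_call(self, return_addr):
        # A CALL records its return address, overwriting the oldest slot
        # once the buffer has wrapped around.
        self.slots[self.top] = return_addr
        self.top = (self.top + 1) % self.depth
        self.count = min(self.count + 1, self.depth)

    def on_ret(self):
        # A RET consumes the newest entry; None means "no prediction".
        if self.count == 0:
            return None
        self.top = (self.top - 1) % self.depth
        self.count -= 1
        return self.slots[self.top]


def flush_by_nesting(ras):
    """Nest `depth` calls, then return from all of them."""
    for i in range(ras.depth):
        ras.on_call(("flush", i))
    for _ in range(ras.depth):
        ras.on_ret()


ras = ReturnStack(depth=4)
ras.on_call(0x1000)            # two CALLs whose RETs never execute,
ras.on_call(0x2000)            # leaving stale entries behind
stale_before = ras.count
flush_by_nesting(ras)
stale_after = ras.count
print(stale_before, stale_after)   # prints: 2 0
```

In this model the flush costs `depth` calls plus `depth` returns every time it is used, which is the inefficiency objected to above.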
avk
Adept III

OK, then there must exist some sequence of instructions that resets those CALL/RET counters, but I think it will be extremely hard to find one. So do experiments.

Czerno
Journeyman III

Indeed, we (I) don't know exactly how this optimisation operates. I've given the question some more thought, and it seems to me the implementation has to be such that my original question makes little sense: the 'balance' has to be automatic, i.e., the processor must keep track of the conventional stack versus the internal call stack, and invalidate the call stack if it is out of sync when executing a RET instruction. It must also take care of memory modification (self-modifying code) if the internal mechanism keeps track of the instructions at the return locations, whether as raw bytes or predecoded (as is done, I think, in a certain competitor's processors).
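
The hypothesis in this post can be illustrated with a small toy model. It is a sketch of the poster's guess only, not documented AMD behaviour: the class `CheckedReturnPredictor`, its depth of 8, and the "clear everything on a mismatch" policy are assumptions made for illustration. What it shows is that the address actually used by RET always comes from the memory stack, so an unbalanced sequence costs a misprediction but never a wrong return.

```python
class CheckedReturnPredictor:
    """Toy predictor that is verified against the real stack on every RET."""

    def __init__(self, depth=8):           # depth is an assumed value
        self.depth = depth
        self.entries = []                  # newest prediction last
        self.mispredicts = 0

    def call(self, mem_stack, return_addr):
        mem_stack.append(return_addr)      # architectural push to memory
        self.entries.append(return_addr)   # predictor records the same address
        if len(self.entries) > self.depth:
            self.entries.pop(0)            # oldest prediction falls off

    def ret(self, mem_stack):
        actual = mem_stack.pop()           # the address RET really uses
        predicted = self.entries.pop() if self.entries else None
        if predicted != actual:            # an empty predictor counts as a miss here
            self.mispredicts += 1
            self.entries.clear()           # hypothesised invalidation on mismatch
        return actual


mem = []
p = CheckedReturnPredictor()
p.call(mem, 0x100)       # outer call
p.call(mem, 0x200)       # inner call ...
mem.pop()                # ... whose return address is discarded (call/pop idiom)
first = p.ret(mem)       # predictor says 0x200, memory says 0x100
p.call(mem, 0x300)
second = p.ret(mem)      # balanced again: the prediction matches
print(hex(first), hex(second), p.mispredicts)   # prints: 0x100 0x300 1
```

Under these assumptions the predictor resynchronises by itself after a single miss, which is why no explicit "reset" sequence would be needed.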

Whatever the details, it now seems likely my initial concern about this feature was moot if not totally absurd ;=) 

Thank you for the open-minded discussion. Kind regards,

indi123
Journeyman III

Is it really necessary? I think that in any case you won't be able to influence such a thing, at least if you're working in Windows.

Czerno
Journeyman III

Originally posted by: indi123
> Is it really necessary?

Well, I think it must do more or less as I've suggested in order for the optimisation to preserve correctness while sparing unneeded accesses to main memory (or even cache). And like you say, it means there is no need, and probably no way either, to "reset the balance" as originally worded: it must be self-adjusting, or else it would be quite useless. Anyway, it would certainly be interesting to see an in-depth analysis of this and other internal gears. There once was a book written by insiders, "the anatomy of a microprocessor" (title possibly not exact, this is from my tired memory), that went into the details of the workings of the K6-2 (and 3). Unfortunately I only briefly had access to that excellent book, years ago. It sure would be great to have an updated book describing the 'anatomy' of Athlons and their successors; was such a book ever published, or is one in press?
stroia
Staff

AMD doesn't have any books planned that would expose the inner workings of the latest AMD processors. The next best thing to a book like that is these articles:

There are also courses that you can take from Mindshare that will give you good detail on the architecture.

tabu5
Journeyman III

Please keep it up with you.

Thnx

BBB

Czerno
Journeyman III

Thank you, Stroia... but those references, unlike the book I alluded to, are about architecture rather than micro-architecture which is quite another affair...

BTW I found the book's reference in my old notes: Bruce Shriver and Bennett Smith, "The Anatomy of a High-Performance Microprocessor - A Systems Perspective". Highly recommended. It was written based on the K6-3D microprocessor (aka K6-II & K6-3); we'd need an update based on Athlon 32/64.

While I'll concede, with regret, that AMD is not obliged to disclose the microarchitectural details of its new designs, OTOH it must (or should) publish the programming (architectural) information that allows us programmers to make use of AMD's extended debugging features; otherwise the features are quite useless and unduly take up silicon real estate. I've opened another thread to ask about the debg_ctrl_MSR_2, please refer to that thread: would you not provide an application note on the use of these features?
