A team of grad students at the University of Wisconsin (see the MIAOW project on GitHub) figured out:
1) Instruction Memory is programmed via ports 0x50001000 and 0x50001004.
2) SGPR Wavefront 1 is programmed via ports 0x50002000 through 0x50002014.
My understanding is that the Command Processor's various copy, write and load functions do roughly what the two port operations above do.
Mantle does not allow the programmer to control the placement of objects in the cache hierarchy. OpenCL and OpenGL sit at a higher level still, so they likely do not allow it either. However, the lower-level GPU documentation clearly indicates that placement is controllable.
It thus appears that my question comes down to the opcodes of AMD's Command Processor microcode, which appear to be exclusive to AMD, and to AMD's documentation of that microcode.
For the R9 SI series, some of the PM4 Type 3 IT_OPCODE values do not seem to be available anywhere. The R6xx/R7xx PDF had some of the codes, and I was able to find more on GitHub that were more specific to SI and Evergreen. However, a number of codes are nowhere to be found. The following list shows the codes that are still missing. You can see from the list that these are critical to communicating with the Command Processor on the GPU. The missing codes are likely to fall in the range between 0x10 and 0x9F. I saw an OpenGL flag, OGL_PM4_CAPTURE_ENABLE, and thought that might be a way to discover the codes, but I found no documentation on that either.
// Draw/Dispatch Packets
// State Management Packets
// Command Predication Packets
// Synchronization Packets
// Misc Packets
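
For anyone trying to recover opcodes from a captured command stream, here is a minimal sketch of how a PM4 Type 3 header can be packed and unpacked. The bit layout (packet type in bits 31:30, DWORD count minus one in bits 29:16, IT_OPCODE in bits 15:8) is taken from the PACKET3() macro in the open-source Linux radeon driver headers, not from AMD microcode documentation, so treat it as an assumption for SI. The function names are my own, and the two opcode values in the usage lines (0x10 for NOP, 0x69 for SET_CONTEXT_REG) are the ones listed in those same driver headers.

```python
# Sketch only: layout assumed from the Linux radeon driver's PACKET3() macro.
PACKET_TYPE0 = 0  # register writes
PACKET_TYPE2 = 2  # single-DWORD filler
PACKET_TYPE3 = 3  # command packets carrying an IT_OPCODE


def pm4_type3_header(opcode, body_dwords):
    """Pack a Type 3 header for a packet with `body_dwords` DWORDs after the header."""
    if not 0 <= opcode <= 0xFF:
        raise ValueError("IT_OPCODE must fit in 8 bits")
    if not 1 <= body_dwords <= 0x4000:
        raise ValueError("body must be 1..16384 DWORDs")
    count = body_dwords - 1  # the count field stores body length minus one
    return (PACKET_TYPE3 << 30) | (count << 16) | (opcode << 8)


def decode_pm4_header(header):
    """Split a header DWORD into its type, count and opcode fields."""
    return {
        "type": (header >> 30) & 0x3,
        "count": (header >> 16) & 0x3FFF,
        "opcode": (header >> 8) & 0xFF,  # only meaningful for Type 3
    }


def scan_type3_opcodes(dwords):
    """Walk a list of 32-bit DWORDs and collect every Type 3 IT_OPCODE seen."""
    opcodes = []
    i = 0
    while i < len(dwords):
        fields = decode_pm4_header(dwords[i])
        if fields["type"] == PACKET_TYPE2:
            i += 1  # filler packets have no body
            continue
        if fields["type"] not in (PACKET_TYPE0, PACKET_TYPE3):
            raise ValueError("unhandled packet type at DWORD %d" % i)
        if fields["type"] == PACKET_TYPE3:
            opcodes.append(fields["opcode"])
        i += 1 + fields["count"] + 1  # header plus (count + 1) body DWORDs
    return opcodes


if __name__ == "__main__":
    nop = pm4_type3_header(0x10, 1)
    print(hex(nop))  # 0xc0001000
    stream = [nop, 0x0, pm4_type3_header(0x69, 2), 0x0, 0x0]
    print([hex(op) for op in scan_type3_opcodes(stream)])
```

If a PM4 capture can be obtained at all, running something like scan_type3_opcodes over it would at least show which opcode values the driver actually emits, even without names for them.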