I would have a highly theoretical question: how big of a challange would it be to enhance GCN architecture to be x86 compatible?
I'm thinking of the following... all vendors are aiming the Fusion of CPUs and GPUs, just in slightly different manners. Intel had the idea of using x86 cores for doing graphics, and tried to create the smallest possible core to create a 10-12 core CPU (at first) that could dynamically dedicate each cores as x86 or as GPU cores. We all know how that story goes, many failures until Phi has been released as the first viable product, but actually they have not implemented a matching SW renderer for any of the graphics APIs, so it is practically uncapable of modern graphics (sadly). How challanging would it be to tackle the problem from the other side? How self-defeating would it be to enhance the GCN architecture to semi-efficiently do x86 instructions?
Naturally most of the surrounding would have to be altered, the instruction feeder, the branching unit and the GCN cores cores themselves have to recieve additional wiring to be able to do x86 instructions collectively. I know a lot less about CPU architectures (although I do read the articles about new architectures when they appear), so I do not know how complex things can get, but I'd figure there's a reason why the cores are that much larger and why it takes roughly 100X more power to do one instruction on a CPU than on the GPU. Could it be that similarly as to how 64-bit instructions are done via joining 2 GCN cores, maybe the x86 instructions could be done joining 2 or more of them. Naturally the SSE and AVX instructions would not be that hard to implement, since AFAIK they always issue the same instructions down each lane, and since there are 16 scalar cores coupled together, it cannot be that hard to do vector operations together, or perhaps the 4 seperate bundles of 16 cores could serve as 4 lanes of an SSE operation. Naturally the cores could not operate in GPU and CPU mode at the same time (similar to how Wavefront switching is done), so the CPU mode would be exclusive to an entire CU at a time, and it would be a very special mode of the instruction feeder also, and in this sole case the GCN cores would do a very limited form of branching, where they do different parts of an operation. Although many things are different, the 64k LDS could serve as L1 cache to the cores, saving die space, etc. I have the feeling many things could be reused that is already working. The biggest challange I feel would be coming up with the collective behaviour that result in x86 operations in the fastest way possible, branching, peeking ahead instructions, etc... all the things that make a CPU latency optimized.
I understand that the resulting GPU would be a gigantic overkill for graphics in terms of knowledge, but it would be a very dumb CPU. However, it could be one of the most advanced GPGPU graphics architecture, and it could be a 16-32 core CPU in a notebook even, with moderate single core performance, and humongous parallel compute power, not to mention it's fully dynamic nature in terms of CPU-GPU relative performance.
I know that AMD would never comment on ongoing research, even if there was such a project as the next step of Fusion, but forumers and AMD empoyees alike could comment on how big nonsense are the things I just said? Is the entire idea self-defeating, or is just all too hard to do, or what are your opinions?