My Brook source code contains a single vector add:
y = x + uint4(1u, 2u, 3u, 4u);
Both x and y are uint4.
I compiled the code with brcc. Part of the generated IL code is:
The generated IL code looks quite inefficient. From my basic understanding of IL, it could be optimized like this:
Is there any way to write the .br source code so that the brcc compiler generates the compact version of the IL code above?