61 Replies Latest reply on Jan 18, 2012 4:26 AM by diapolo

    Can I use BFI_INT directly from IL ?

    frankas

      I need to use the BFI_INT instruction directly for some optimization. According to the Evergreen docs it performs:

      dst = (src1 & src0) | (src2 & ~src0)

      But the closest I can find in the IL docs is UBIT_INSERT.

      Alas, ubit_insert doesn't allow me to specify the bitmask src0 directly; it only takes a width and an offset. So if I want a mask such as 0x5555aaaa, ubit_insert is no use.

      Have I overlooked something in the docs? Or is there an undocumented feature I can use to utilize this assembly level operation directly ?
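For reference, here is a small Python model of the operation being asked about, and of why a width/offset field cannot stand in for an arbitrary mask. It is only a sketch on 32-bit integers: the names bfi_int and is_contiguous_field are made up for illustration, and the only thing assumed about ubit_insert is what the post says, namely that its mask is described by a width and an offset (one contiguous run of 1 bits).

```python
MASK32 = 0xFFFFFFFF

def bfi_int(src0, src1, src2):
    """Model of BFI_INT as quoted from the Evergreen docs:
    dst = (src1 & src0) | (src2 & ~src0), on 32-bit values."""
    return ((src1 & src0) | (src2 & ~src0)) & MASK32

def is_contiguous_field(mask):
    """True if mask is a single run of 1 bits, i.e. it can be written as
    ((1 << width) - 1) << offset for some width and offset."""
    if mask == 0:
        return False
    # Strip trailing zeros, then check that what is left is all ones.
    while mask & 1 == 0:
        mask >>= 1
    return mask & (mask + 1) == 0

# With an arbitrary mask, bits come from src1 where the mask is 1, else from src2.
print(hex(bfi_int(0x5555AAAA, 0xFFFFFFFF, 0x00000000)))  # 0x5555aaaa
print(is_contiguous_field(0x00FFFF00))  # True: width 16, offset 8
print(is_contiguous_field(0x5555AAAA))  # False: no width/offset pair gives this mask
```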

       

        • Can I use BFI_INT directly from IL ?
          empty_knapsack

          I was also curious about BFI_INT (as it can fully replace several operations in one round of MD5), but it looks like it isn't exposed at the IL level at all.

            • Can I use BFI_INT directly from IL ?
              frankas

              Same story here, BFI_INT will also speed up SHA-1

              http://en.wikipedia.org/wiki/SHA-1

               

              BFI_INT is equivalent to the (bit)vector_select function in this optimization:

              (0 ≤ i ≤ 19): f = vec_sel(d, c, b)

              But I also found another optimization where it is useful

              (40 ≤ i ≤ 59): f = vec_sel((c and d), (c or d), b)

              This brings this round function down to 3 cycles, which is faster than any of the alternatives listed on the Wikipedia page.
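The two select-based round functions can be checked against the textbook SHA-1 definitions with a few lines of Python. This is a sketch, not anyone's production code: vec_sel here is a plain model that takes the mask as its last argument (as in the 0 ≤ i ≤ 19 formula), and the other function names are invented for the check. In BFI_INT operand order, where the mask comes first, the majority function is BFI_INT(b, c or d, c and d).

```python
import itertools

MASK32 = 0xFFFFFFFF

def vec_sel(a, b, mask):
    """Bitwise select: take bits from b where mask is 1, from a where it is 0."""
    return ((b & mask) | (a & ~mask)) & MASK32

def ch_reference(b, c, d):
    """SHA-1 f for rounds 0..19 (choose)."""
    return ((b & c) | (~b & d)) & MASK32

def maj_reference(b, c, d):
    """SHA-1 f for rounds 40..59 (majority)."""
    return (b & c) | (b & d) | (c & d)

def ch_select(b, c, d):
    # One select: b chooses between c and d.
    return vec_sel(d, c, b)

def maj_select(b, c, d):
    # One AND, one OR, one select: b chooses between (c | d) and (c & d).
    return vec_sel(c & d, c | d, b)

# Bitwise functions act on each bit position independently, so the eight
# combinations of all-zeros / all-ones words cover every case.
for b, c, d in itertools.product((0, MASK32), repeat=3):
    assert ch_select(b, c, d) == ch_reference(b, c, d)
    assert maj_select(b, c, d) == maj_reference(b, c, d)
print("select forms match the reference round functions")
```

The majority form costs three operations in total, which is where the "3 cycles" figure comes from.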

               

                • Can I use BFI_INT directly from IL ?
                  mrbpix

                  The IL instruction ubit_insert is translated to a BFM_INT+BFI_INT pair, but there is no instruction to expose the powerful functionality of BFI_INT alone. However there is a way to use it: after compiling the IL code with calclCompile() it is possible to dynamically patch the binary CAL object in memory by scanning its opcodes to replace some of them with BFI_INT. My open source whitepixel v2 project does this, feel free to read the code to see it in action:

                   

                  http://blog.zorinaq.com/?e=43
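The general shape of such a patcher (scan the compiled code as fixed-size instruction words, match an opcode field, rewrite it) might look like the Python sketch below. To be clear about what is assumed: this is not the whitepixel code, the 64-bit word size is an assumption, and all three constants are placeholders chosen for illustration that should not be relied on as Evergreen encodings. The real field layout and opcode values have to come from the ISA documentation.

```python
import struct

# PLACEHOLDER values for illustration only; do not treat them as real
# Evergreen encodings.
OPCODE_MASK = 0x0003F00000000000      # hypothetical: bits holding the ALU opcode
OPCODE_STANDIN = 0x0001A00000000000   # hypothetical: instruction emitted as a stand-in
OPCODE_BFI_INT = 0x0000C00000000000   # hypothetical: opcode value for BFI_INT

def patch_code(code):
    """Rewrite every 64-bit little-endian word whose opcode field equals the
    stand-in opcode so that it carries the BFI_INT opcode instead.
    Returns (patched_bytes, number_of_words_patched)."""
    if len(code) % 8:
        raise ValueError("code length must be a multiple of 8 bytes")
    words = list(struct.unpack("<%dQ" % (len(code) // 8), code))
    patched = 0
    for i, word in enumerate(words):
        if word & OPCODE_MASK == OPCODE_STANDIN:
            # Keep operands and flags, swap only the opcode field.
            words[i] = (word & ~OPCODE_MASK) | OPCODE_BFI_INT
            patched += 1
    return struct.pack("<%dQ" % len(words), *words), patched

# Two fake instruction words: one stand-in, one unrelated opcode.
sample = struct.pack("<2Q", OPCODE_STANDIN | 0x1234, 0x0000500000000000 | 0x1234)
patched_code, count = patch_code(sample)
print(count)  # 1
```

The idea is that the kernel is written with a stand-in instruction that takes the same three operands, so only the opcode bits need to change after compilation.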

                   

                  empty_knapsack: actually BFI_INT is useful for two rounds in MD5 (F() and G()) ;-)
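To make the F() and G() remark concrete, here is a quick Python check that both MD5 round functions are a single bitwise select. The bitselect helper below follows the OpenCL bitselect(a, b, c) convention (bits of b where c is 1, bits of a elsewhere); the md5_f and md5_g names are just labels for the standard definitions.

```python
import itertools

MASK32 = 0xFFFFFFFF

def bitselect(a, b, mask):
    """OpenCL-style bitselect: bits of b where mask is 1, bits of a elsewhere."""
    return ((b & mask) | (a & ~mask)) & MASK32

def md5_f(b, c, d):
    """MD5 round 1 function: (b AND c) OR (NOT b AND d)."""
    return ((b & c) | (~b & d)) & MASK32

def md5_g(b, c, d):
    """MD5 round 2 function: (b AND d) OR (c AND NOT d)."""
    return ((b & d) | (c & ~d)) & MASK32

# All-zeros / all-ones words cover every per-bit input combination.
for b, c, d in itertools.product((0, MASK32), repeat=3):
    assert bitselect(d, c, b) == md5_f(b, c, d)  # F: b selects c over d
    assert bitselect(c, b, d) == md5_g(b, c, d)  # G: d selects b over c
print("F and G each reduce to one select")
```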

                    • Can I use BFI_INT directly from IL ?
                      empty_knapsack

                      Yeah, I was curious enough to do it today too. The speed-up is a bit lower than expected (~16%), though, as the compiler cannot pack BFI_INT efficiently in a 5x way. For my kernel, utilization is somewhere around 91-92%. Maybe 6x-8x packing can produce even better results.

                        • Can I use BFI_INT directly from IL ?
                          frankas

                          Mrbpix: Thanks, your hack is pretty awesome.

                          However, at this point I think it would be appropriate for ATI to comment on what appears to be an oversight in the IL language, and on how they intend to remedy the situation.

                          Can we expect BFI_INT support in future SDK versions, or do we have to go down the path Mrbpix describes in order to eke out the last 15% of performance?

                           

                            • Can I use BFI_INT directly from IL ?
                              empty_knapsack

                              The funny thing is that, because IL is missing an instruction for BFI_INT, even OpenCL's bitselect() can't be translated directly into BFI_INT.

                               

                              Anyway, this ~15% only applies to single MD5, where the F & G rounds take 32/45 of all calculations. SHA1 requires a lot more instructions, so removing 20*(3-1)=40 of them (or another 20 from rounds 40-59) with BFI doesn't produce a huge boost -- I got about a 3% speed-up from using BFI_INT.

                               

                              And of course ATI should add this function to IL but, as usually happens, if they answer at all it will 99% be "this feature will be added in the next release", "we have a 3-month development cycle", etc. So I don't expect this to happen anytime soon.

                               

                              Btw, 5970+AMD Stream 2.3+Catalyst 10.12 == still broken...

                        • umar
                          airsidelimo

                          Those are the ones that I am confident will make it in 2.6. I got pulled onto some more pressing matters so I can't add any more patterns at this time.
                          Another pattern that I was working on is (C ^ (A & (B ^ C))).
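The XOR pattern quoted here computes exactly the same function as the select that BFI_INT performs with A as the mask, which is presumably why it is a candidate for pattern matching. A short Python check on random 32-bit inputs (function names invented for this sketch):

```python
import random

MASK32 = 0xFFFFFFFF

def xor_form(a, b, c):
    """The pattern from the post: C ^ (A & (B ^ C))."""
    return (c ^ (a & (b ^ c))) & MASK32

def select_form(a, b, c):
    """What a bitwise select computes with A as the mask: (B & A) | (C & ~A)."""
    return ((b & a) | (c & ~a)) & MASK32

rng = random.Random(0)  # fixed seed so the check is repeatable
for _ in range(1000):
    a, b, c = (rng.getrandbits(32) for _ in range(3))
    assert xor_form(a, b, c) == select_form(a, b, c)
print("C ^ (A & (B ^ C)) matches the select form on 1000 random inputs")
```

Per bit: where A is 1 the result is C ^ B ^ C = B, and where A is 0 it is C.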


                            • umar
                              gat3way

                              I reiterate that OpenCL (as of 11.9 and SDK 2.5) does generate BFI_INT code when certain patterns occur in the kernel (and yes, that's not mapped to bitselect()). I have my own routine for binary patching BYTEALIGN with BFI_INT; I tried disabling it and replacing amd_bytealign() with the bitwise expression, and noticed BFI_INTs in the resulting ISA dump. This might have been some mistake, or the backend might be able to generate BFI only in certain cases, e.g. when one of the arguments is a constant (which occurs in the first few steps of MD5, where A/B/C/D or some of them still equal H0...H4, which are constants). Anyway, I am still using my binary patching codepath.
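Checking an ISA dump for this is easy to script. The Python sketch below counts whole-word occurrences of given mnemonics in a disassembly listing; the count_mnemonics helper is hypothetical, and the sample listing is invented for illustration (real dumps will be formatted differently).

```python
import re
from collections import Counter

def count_mnemonics(isa_text, names=("BFI_INT", "BFM_INT")):
    """Count whole-word occurrences of each mnemonic in a disassembly listing."""
    counts = Counter()
    for name in names:
        pattern = r"\b" + re.escape(name) + r"\b"
        counts[name] = len(re.findall(pattern, isa_text))
    return counts

# Made-up listing, only to exercise the function.
sample = """
  4  x: BFI_INT     R1.x,  R0.x,  R2.x,  R3.x
     y: ADD_INT     R1.y,  R0.y,  R2.y
  5  x: BFI_INT     R2.x,  R1.x,  R0.x,  R3.x
"""
counts = count_mnemonics(sample)
print(counts["BFI_INT"], counts["BFM_INT"])  # 2 0
```

Running it on dumps built with and without the binary patch makes it clear whether the compiler emitted any BFI_INT on its own.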

                              BTW, BFI_INT might give you a bit more with MD5 when you do partial reversals to step 43 instead of complete to step 45 _AND_ when you do the (b^c) trick in round3.

                               

                              With SHA1, it's also a bit more than 3% if you do an early check in step 76 rather than comparing values after step 80, and also precalculate some of the w[X] values in host code rather than doing those calculations after step 16.

                               

                                • umar
                                  corry

                                   

                                  Originally posted by: gat3way I reiterate that OpenCL (as of 11.9 and SDK 2.5) does generate BFI_INT code when certain patterns occur in the kernel (and yes, that's not mapped to bitselect()).

                                  What platform are you using?  I checked, rechecked, and rechecked again the Vista and Linux packages, and it is nowhere to be found, so unless they started packing their DLLs for some reason, it simply is not there.  Since OpenCL generates IL code, it can't even be that it's just skipping a step to ISA.

                                  Only thing I can think of is you must be using XP, and/or the 32 bit only package (not sure if they'd be different).  Well, that or you're the only one with access to the "real" 11.9 driver package...

                                  Honestly Micah, you mention checking the builds, and you have 2 people on here saying it's missing.  Please, go double check all the parameters?  You will find something is missing for sure, it won't be a waste of your time, and you'll make a lot of us happy here!  I've specifically confirmed that the Vista x64 package does not contain it in 11.9 or 11.10, and that the 32-bit binary distributed with the Vista x64 version does not contain it.  Same goes for Linux: in 11.9 (there's only 1 package) the x64 binaries show no BFI, nor do the 32-bit versions.  As I said, I bet there is something like -DWITH_BFI not being set... build it and check the command line output, check the spelling, check the spelling in the code.  I don't like my time being wasted, so I'm not going to waste anyone else's.

                                  I don't know of anything else to say on this; 2 independent people confirmed it's not in there.  I reported the versions/platforms checked.  I can't make this any easier for you...

                                   

                                    • Can I use BFI_INT directly from IL ?
                                      corry

                                      OK, now I'm totally calling BS.  I just grabbed the XP and XP64 builds and did the same, so I've checked every version of 11.10 for Windows, and 11.9 for Linux.  It's not there... just to prove my point, in the code window you'll see the strings output from the XP64 aticaldd64.dll.

                                      I don't know what version you have, gat3way, but it's not the one released to the masses... I don't doubt your eyesight, and I see no reason that you would get on here and lie, which leaves only the aforementioned explanation, that you have some non-public version of the driver...

                                       Edit:  Apparently the strings dump is too large for the forums, so I'll just include the relevant sections, surrounded by some of the trash so you can see the output is from strings....

                                       

                                      Another edit: interestingly enough, there is an IL_OP_BFI, but no corresponding static text with which to parse it.  So it looks like it could generate it, if it knew what it was looking for...

                                      USE_VTX_POINT_SIZE = %u USE_VTX_EDGE_FLAG = %u USE_VTX_RENDER_TARGET_INDX = %u USE_VTX_VIEWPORT_INDX = %u USE_VTX_KILL_FLAG = %u VS_OUT_MISC_VEC_ENA = %u VS_OUT_MISC_SIDE_BUS_ENA = %u VS_OUT_CCDIST0_VEC_ENA = %u VS_OUT_CCDIST1_VEC_ENA = %u ; ----------------- GS Data ------------------------ VGT_GS_OUT_PRIM_TYPE = 0x%08X ; OUTPRIM_TYPE = %u ; MergeFetchFlags = 0x%08X ; ----------------- PS Data ------------------------ ; SPI_PS_IN_CONTROL_0 = 0x%08X SPI0:NUM_INTERP = %u SPI0:POSITION_ENA = %u SPI0:POSITION_CENTROID = %u SPI0:POSITION_ADDR = %u SPI0:PARAM_GEN = %u SPI0:PARAM_GEN_ADDR = %u SPI0:BARYC_SAMPLE_CNTL = %u SPI0:PERSP_GRADIENT_ENA = %u SPI0:LINEAR_GRADIENT_ENA = %u SPI0:POSITION_SAMPLE = %u SPI0:BARYC_SAMPLE_ENA = %u ; SPI_PS_IN_CONTROL_1 = 0x%08X SPI1:GEN_INDEX_PIX = %u SPI1:FIXED_PT_POSITION_ENA = %u SPI1:FIXED_PT_POSITION_ADDR = %u SPI1:FRONT_FACE_ENA = %u SPI1:FRONT_FACE_ADDR = %u SPI1:FRONT_FACE_CHAN = %u SPI1:FOG_ADDR = %u SPI1:GEN_INDEX_PIX_ADDR = %u ; SPI_INPUT_Z SPI:PROVIDE_Z_TO_SPI = %u ; CB_SHADER_MASK = 0x%08X CB:OUTPUT0_ENABLE = %u CB:OUTPUT1_ENABLE = %u CB:OUTPUT2_ENABLE = %u CB:OUTPUT3_ENABLE = %u CB:OUTPUT4_ENABLE = %u CB:OUTPUT5_ENABLE = %u CB:OUTPUT6_ENABLE = %u CB:OUTPUT7_ENABLE = %u CB_SHADER_CONTROL:bitmap = %u%u%u%u%u%u%u%u ; DB_SHADER_CONTROL = 0x%08X DB:Z_EXPORT_ENABLE = %u DB:STENCIL_REF_EXPORT_ENABLE = %u DB:MASK_EXPORT_ENABLE = %u DB:ALPHA_TO_MASK_DISABLE = %u DB:Z_ORDER = %u DB:KILL_ENABLE = %u ; SQ_PGM_EXPORTS_PS SQ_PGM_EXPORTS_PS:PS_EXPORT_MODE = 0x%08X ; (%u color + Z ; bHasFogMerge = 0x%08X ; ----------------- CS Data ------------------------ ; NumSharedGprUser = %d ; NumSharedGprTotal = %d ; CS Setup Mode = Slow (i.e setup R0.xyzw) ; CS Setup Mode = Fast (i.e setup R0.x) ; NumThreadPerGroup = %d ; TotalNumThreadGroup = %d ; NumWavefrontPerSIMD = %d ; IsMaxNumWavePerSIMD = true ; IsMaxNumWavePerSIMD = false ; SetBufferForNumGroup = true ; SetBufferForNumGroup = false 
R6PLUS_ALU_MOVA_DST_CLAUSE_GLOBAL_B3 R6PLUS_ALU_MOVA_DST_CLAUSE_GLOBAL_B2 R6PLUS_ALU_MOVA_DST_CLAUSE_GLOBAL_B1 R6PLUS_ALU_MOVA_DST_CLAUSE_GLOBAL_B0 R6PLUS_ALU_MOVA_DST_CF_IDX1 R6PLUS_ALU_MOVA_DST_CF_IDX0 R6PLUS_ALU_MOVA_DST_CF_PC R6PLUS_ALU_MOVA_DST_AR_X R6PLUS_EARLY_Z_THEN_RE_Z R6PLUS_RE_Z R6PLUS_EARLY_Z_THEN_LATE_Z R6PLUS_LATE_Z R6PLUS_UNDEF R6PLUS_CENTROIDS_AND_CENTERS R6PLUS_CENTERS_ONLY R6PLUS_CENTROIDS_ONLY R6PLUS_TRISTRIP R6PLUS_LINESTRIP R6PLUS_POINTLIST R6PLUS_SPRITE_EN R6PLUS_GS_SCENARIO_C R6PLUS_GS_SCENARIO_G R6PLUS_GS_SCENARIO_B R6PLUS_GS_SCENARIO_A R6PLUS_GS_OFF R6PLUS_TVX_FMT_RESERVED_63 R6PLUS_TVX_FMT_CTX1 R6PLUS_TVX_FMT_APC7 R6PLUS_TVX_FMT_APC6 R6PLUS_TVX_FMT_APC5 R6PLUS_TVX_FMT_APC4 R6PLUS_TVX_FMT_APC3 R6PLUS_TVX_FMT_APC2 R6PLUS_TVX_FMT_APC1 R6PLUS_TVX_FMT_APC0 R6PLUS_TVX_FMT_BC5 R6PLUS_TVX_FMT_BC4 R6PLUS_TVX_FMT_BC3 R6PLUS_TVX_FMT_BC2 R6PLUS_TVX_FMT_BC1 R6PLUS_TVX_FMT_32_32_32_FLOAT R6PLUS_TVX_FMT_32_32_32 R6PLUS_TVX_FMT_16_16_16_FLOAT R6PLUS_TVX_FMT_16_16_16 R6PLUS_TVX_FMT_8_8_8 R6PLUS_TVX_FMT_5_9_9_9_SHAREDEXP R6PLUS_TVX_FMT_32_AS_8_8 R6PLUS_TVX_FMT_32_AS_8 R6PLUS_TVX_FMT_BG_RG R6PLUS_TVX_FMT_GB_GR R6PLUS_TVX_FMT_1_REVERSED R6PLUS_TVX_FMT_1 R6PLUS_TVX_FMT_RESERVED_36 R6PLUS_TVX_FMT_32_32_32_32_FLOAT R6PLUS_TVX_FMT_32_32_32_32 R6PLUS_TVX_FMT_RESERVED_33 R6PLUS_TVX_FMT_16_16_16_16_FLOAT R6PLUS_TVX_FMT_16_16_16_16 R6PLUS_TVX_FMT_32_32_FLOAT R6PLUS_TVX_FMT_32_32 R6PLUS_TVX_FMT_X24_8_32_FLOAT R6PLUS_TVX_FMT_10_10_10_2 R6PLUS_TVX_FMT_8_8_8_8 R6PLUS_TVX_FMT_2_10_10_10 R6PLUS_TVX_FMT_11_11_10_FLOAT R6PLUS_TVX_FMT_11_11_10 R6PLUS_TVX_FMT_10_11_11_FLOAT R6PLUS_TVX_FMT_10_11_11 R6PLUS_TVX_FMT_24_8_FLOAT R6PLUS_TVX_FMT_24_8 R6PLUS_TVX_FMT_8_24_FLOAT R6PLUS_TVX_FMT_8_24 R6PLUS_TVX_FMT_16_16_FLOAT R6PLUS_TVX_FMT_16_16 R6PLUS_TVX_FMT_32_FLOAT R6PLUS_TVX_FMT_32 R6PLUS_TVX_FMT_5_5_5_1 R6PLUS_TVX_FMT_4_4_4_4 R6PLUS_TVX_FMT_1_5_5_5 R6PLUS_TVX_FMT_6_5_5 R6PLUS_TVX_FMT_5_6_5 R6PLUS_TVX_FMT_8_8 R6PLUS_TVX_FMT_16_FLOAT R6PLUS_TVX_FMT_16 R6PLUS_TVX_FMT_RESERVED_4 
R6PLUS_TVX_FMT_3_3_2 R6PLUS_TVX_FMT_4_4 R6PLUS_TVX_FMT_8 R6PLUS_TVX_FMT_INVALID R6PLUS_TVX_DstSel_Mask R6PLUS_TVX_DstSel_RESERVED_6 R6PLUS_TVX_DstSel_1f R6PLUS_TVX_DstSel_0f R6PLUS_TVX_DstSel_W R6PLUS_TVX_DstSel_Z R6PLUS_TVX_DstSel_Y R6PLUS_TVX_DstSel_X R6PLUS_SRF_MODE_NO_ZERO R6PLUS_SRF_MODE_ZERO_CLAMP_MINUS_ONE R6PLUS_NUM_FORMAT_SCALED R6PLUS_NUM_FORMAT_INT R6PLUS_NUM_FORMAT_NORM R6PLUS_TVX_EndianSwap_RESERVED_3 R6PLUS_TVX_EndianSwap_8in32 R6PLUS_TVX_EndianSwap_8in16 R6PLUS_TVX_EndianSwap_None R6PLUS_DSR_MUX_DWORD_SELECT R6PLUS_DSR_MUX_FFT_PERMUTE R6PLUS_DSR_MUX_NONE ATOMIC_ORDERED_ALLOC_RET USHORT_READ_RET SHORT_READ_RET UBYTE_READ_RET BYTE_READ_RET READWRITE_RET READ2_RET READ_REL_RET READ_RET CMP_XCHG_SPF_RET CMP_XCHG_RET XCHG2_RET XCHG_REL_RET XCHG_RET MSKOR_RET XOR_RET OR_RET AND_RET MAX_UINT_RET MIN_UINT_RET MAX_INT_RET MIN_INT_RET DEC_RET INC_RET RSUB_RET SUB_RET ADD_RET SHORT_WRITE BYTE_WRITE CMP_STORE_SPF CMP_STORE WRITE2 WRITE_REL WRITE DEC_UINT_RTN INC_UINT_RTN MSKOR_RTN XOR_RTN OR_RTN AND_RTN MAX_UINT_RTN MAX_INT_RTN MIN_UINT_RTN MIN_INT_RTN RSUB_RTN SUB_RTN ADD_RTN CMPXCHG_FDENORM_RTN CMPXCHG_FLT_RTN CMPXCHG_INT_RTN XCHG_FDENORM_RTN XCHG_RTN NOP_RTN STORE_BYTE__NI STORE_SHORT__NI STORE_DWORD__NI DEC_UINT INC_UINT MSKOR RSUB CMPXCHG_FDENORM CMPXCHG_FLT CMPXCHG_INT STORE_RAW_FDENORM STORE_RAW STORE_TYPED R6PLUS_VTX_FETCH_NO_INDEX_OFFSET R6PLUS_VTX_FETCH_INSTANCE_DATA R6PLUS_VTX_FETCH_VERTEX_DATA R6PLUS_TEX_NORMALIZED R6PLUS_TEX_UNNORMALIZED R6PLUS_VTX_INST_GET_BUFFER_RESINFO R6PLUS_VTX_INST_MEM R6PLUS_VTX_INST_SEMANTIC R6PLUS_VTX_INST_FETCH R6PLUS_FORMAT_COMP_UNSIGNED_BIASED R6PLUS_FORMAT_COMP_SIGNED R6PLUS_FORMAT_COMP_UNSIGNED R6PLUS_ENDIAN_8IN32 R6PLUS_ENDIAN_8IN16 R6PLUS_ENDIAN_NONE R6PLUS_ALU_EXECUTE_MASK_OP_KILL R6PLUS_ALU_EXECUTE_MASK_OP_CONTINUE R6PLUS_ALU_EXECUTE_MASK_OP_BREAK R6PLUS_ALU_EXECUTE_MASK_OP_DEACTIVATE R6PLUS_ALU_SCL_221 R6PLUS_ALU_SCL_212 R6PLUS_ALU_SCL_122 R6PLUS_ALU_SCL_210 R6PLUS_ALU_VEC_210 R6PLUS_ALU_VEC_201 R6PLUS_ALU_VEC_102 
R6PLUS_ALU_VEC_120 R6PLUS_ALU_VEC_021 R6PLUS_ALU_VEC_012 R6PLUS_ALU_OMOD_D2 R6PLUS_ALU_OMOD_M4 R6PLUS_ALU_OMOD_M2 R6PLUS_ALU_OMOD_OFF R6PLUS_PRED_SEL_ONE R6PLUS_PRED_SEL_ZERO R6PLUS_PRED_SEL_RESERVED_1 R6PLUS_PRED_SEL_OFF R6PLUS_CF_JUMPTABLE_SEL_INDEX_1 R6PLUS_CF_JUMPTABLE_SEL_INDEX_0 R6PLUS_CF_JUMPTABLE_SEL_CONST_D R6PLUS_CF_JUMPTABLE_SEL_CONST_C R6PLUS_CF_JUMPTABLE_SEL_CONST_B R6PLUS_CF_JUMPTABLE_SEL_CONST_A R6PLUS_CF_PIXEL_Z R6PLUS_CF_PIXEL_MRT7 R6PLUS_CF_PIXEL_MRT6 R6PLUS_CF_PIXEL_MRT5 R6PLUS_CF_PIXEL_MRT4 R6PLUS_CF_PIXEL_MRT3 R6PLUS_CF_PIXEL_MRT2 R6PLUS_CF_PIXEL_MRT1 R6PLUS_CF_PIXEL_MRT0 R6PLUS_CF_POS_3 R6PLUS_CF_POS_2 R6PLUS_CF_POS_1 R6PLUS_CF_POS_0 R6PLUS_CF_INVALID R6PLUS_CF_INDEX_1 R6PLUS_CF_INDEX_0 R6PLUS_CF_INDEX_NONE R6PLUS_CF_KCACHE_LOCK_LOOP_INDEX R6PLUS_CF_KCACHE_LOCK_2 R6PLUS_CF_KCACHE_LOCK_1 R6PLUS_CF_KCACHE_NOP R6PLUS_EXPORT_WRITE_IND_ACK R6PLUS_EXPORT_WRITE_ACK R6PLUS_EXPORT_WRITE_IND R6PLUS_EXPORT_WRITE R6PLUS_EXPORT_PARAM R6PLUS_EXPORT_POS R6PLUS_EXPORT_PIXEL R6PLUS_CF_COND_NOT_BOOL R6PLUS_CF_COND_BOOL R6PLUS_CF_COND_FALSE R6PLUS_CF_COND_ACTIVE R6PLUS_CF_ENCODING_INST_ALU1 R6PLUS_CF_ENCODING_INST_ALU0 R6PLUS_CF_ENCODING_INST_ALLOC_EXPORT R6PLUS_CF_ENCODING_INST_CF R6PLUS_REL_GLOBAL R6PLUS_REL_LOOP R6PLUS_REL_NONE R6PLUS_INDEX_GLOBAL_AR_X R6PLUS_INDEX_GLOBAL R6PLUS_INDEX_LOOP R6PLUS_INDEX_AR_W R6PLUS_INDEX_AR_Z R6PLUS_INDEX_AR_Y R6PLUS_INDEX_AR_X R6PLUS_RELATIVE R6PLUS_ABSOLUTE R6PLUS_SEL_MASK R6PLUS_SEL_RESERVED_6 R6PLUS_SEL_1 R6PLUS_SEL_0 R6PLUS_SEL_W R6PLUS_SEL_Z R6PLUS_SEL_Y R6PLUS_SEL_X R6PLUS_CHAN_W R6PLUS_CHAN_Z R6PLUS_CHAN_Y R6PLUS_CHAN_X R6PLUS_MEM_INST_MEM ;SQ_PGM_RESOURCES_2 = 0x%08X VGT_STRMOUT_CONFIG = 0x%x VGT_STRMOUT_CONFIG:RAST_STREAM = %u VGT_STRMOUT_CONFIG:STREAMOUT_0_EN = %u VGT_STRMOUT_CONFIG:STREAMOUT_1_EN = %u VGT_STRMOUT_CONFIG:STREAMOUT_2_EN = %u VGT_STRMOUT_CONFIG:STREAMOUT_3_EN = %u u32LsStride = %d u32HsNumInputCP = %d u32HsNumOutputCP = %d u32HsNumPatchConst = %d u32HsCPStride = %d u32HsNumThread = %d 
u32HsTessFactorStride= %d HsTessFactorBufferTFMajor = %d u32TsDomain = %d u32TsPartition = %d u32TsOutputPrimitive = %d f32TsMaxTessFactor = %g u32PrimIdExportSlot = %d ; UavRtnBufInfoTbl[%d] stride = %d isTypedUav = %d dataType = %d ; GlobalRtnBufSlot = 0x%X ; GlobalRtnBufSlotShort = 0x%X ; GlobalRtnBufSlotByte = 0x%X ; RatOpIsUsed = 0x%X ; RatAtomicOpIsUsed = 0x%X VGT_GS_INSTANCE_CNT = 0x%08X ; ENABLE = %u ; CNT = %u SQ_LDS_ALLOC_PS:SIZE = 0x%08X ; SPI_PS_IN_CONTROL_2 = 0x%08X SPI2:LINE_STIPPLE_TEX_ENA = %u SPI2:LINE_STIPPLE_TEX_ADDR = %u ; SPI_BARYC_CNTL = 0x%08X SPI_BARYC_CNTL:PERSP_CENTER_ENA = %u SPI_BARYC_CNTL:PERSP_CENTROID_ENA = %u SPI_BARYC_CNTL:PERSP_SAMPLE_ENA = %u SPI_BARYC_CNTL:PERSP_PULL_MODEL_ENA = %u SPI_BARYC_CNTL:LINEAR_CENTER_ENA = %u SPI_BARYC_CNTL:LINEAR_CENTROID_ENA = %u SPI_BARYC_CNTL:LINEAR_SAMPLE_ENA = %u DB:DB_SOURCE_FORMAT = %u DB:CONSERVATIVE_Z_EXPORT = %u DB:DEPTH_BEFORE_SHADER = %u DB:EXEC_ON_HIER_FAIL = %u DB:EXEC_ON_NOOP = %u SQ_LDS_ALLOC:SIZE = 0x%08X ; NumThreadPerGroupFlattened = %d ; NumThreadPerGroup_x = %d ; NumThreadPerGroup_y = %d ; NumThreadPerGroup_z = %d _addr _unroll _size _matrix _coordtype _aoffimmi _compselect _sampler _resource _relop _resourcetype _type _fmtw _fmtz _fmty _fmtx _nrm3 _zeroop dbg_temploc dbg_line dbg_string srv_struct_load_ext srv_raw_load_ext append_buf_consume append_buf_alloc uav_short_store uav_short_store_ext uav_byte_store uav_byte_store_ext uav_ushort_load uav_ushort_load_ext uav_ubyte_load uav_ubyte_load_ext uav_short_load uav_short_load_ext uav_byte_load uav_byte_load_ext uav_read_udec uav_read_udec_ext uav_read_uinc uav_read_uinc_ext uav_udec uav_udec_ext uav_uinc uav_uinc_ext uav_read_cmp_xchg uav_read_cmp_xchg_ext uav_read_xchg uav_read_xchg_ext uav_read_xor uav_read_xor_ext uav_read_or uav_read_or_ext uav_read_and uav_read_and_ext uav_read_umax uav_read_umax_ext uav_read_umin uav_read_umin_ext uav_read_max uav_read_max_ext uav_read_min uav_read_min_ext uav_read_rsub uav_read_rsub_ext 
uav_read_sub uav_read_sub_ext uav_read_add uav_read_add_ext uav_cmp uav_cmp_ext uav_xor uav_xor_ext uav_or uav_or_ext uav_and uav_and_ext uav_umax uav_umax_ext uav_umin uav_umin_ext uav_max uav_max_ext uav_min uav_min_ext uav_rsub uav_rsub_ext uav_sub uav_sub_ext uav_add uav_add_ext uav_arena_store uav_arena_load uav_struct_store uav_struct_store_ext uav_raw_store uav_raw_store_ext uav_store uav_store_ext uav_struct_load uav_struct_load_ext uav_raw_load uav_raw_load_ext uav_load uav_load_ext gds_read_cmp_xchg gds_read_xchg gds_read_mskor gds_read_xor gds_read_or gds_read_and gds_read_umax gds_read_umin gds_read_max gds_read_min gds_read_dec gds_read_inc gds_read_rsub gds_read_sub gds_read_add gds_cmp_store gds_mskor gds_xor gds_or gds_and gds_umax gds_umin gds_max gds_min gds_dec gds_inc gds_rsub gds_sub gds_add gds_store gds_load lds_store_short lds_store_byte lds_load_ushort lds_load_ubyte lds_load_short lds_load_byte lds_read_cmp_xchg lds_read_xchg lds_read_mskor lds_read_xor lds_read_or lds_read_and lds_read_umax lds_read_umin lds_read_max lds_read_min lds_read_rsub lds_read_sub lds_read_dec lds_read_inc lds_read_add lds_cmp lds_mskor lds_xor lds_or lds_and lds_umax lds_umin lds_max lds_min lds_rsub lds_sub lds_dec lds_inc lds_add lds_store_vec lds_load_vec lds_store lds_load mqsad qsad msad sad4 sadhi u4lerp bytealign bitalign unpack0 unpack1 unpack2 unpack3 f2u4 f_2_u4 mova mova_round invariant_mov invariant_move ftoi_flr ftoi_rpi utod itod dtou dtoi f162f f2f16_plus_inf f2f16_neg_inf f2f16_near f2f16 utof itof ftou ftoi dtrig_preop ddiv_fixup ddiv_fmas ddiv_scale dclass dmovc dmov drsq drcp dsqrt dmin dmax dmad dfrac ldexp dldexp ddiv dmul dadd dfrexp_mant dfrexp_exp frexp dfrexp fdiv_fixup fdiv_fmas fdiv_scale class max3 med3 min3 frexp_mant frexp_exp fldexp rcp_vec transpose sqrt_vec sqrt cos_vec sin_vec sincos rsq_vec round_z round_plusinf round_neginf round_nearest pireduce mmul logp log_vec fwidth faceforward expp exp_vec dxsincos dp2add dist colorclamp 
cmov_logical cmov clamp atan asin acos u64mod i64mod u64div i64div u64mul i64mul u64shr u64min u64max u64lt u64ge i64shr i64shl i64negate i64ne i64min i64max i64lt i64ge i64eq i64sub i64add umax3 umed3 umin3 imul24 umul24_high umul24 umad24 umul_high umul umin umax umad umod udiv ushr imax3 imed3 imin3 imad24 imul24_high iborrow icarry ishr ishl inegate imad imul_high imul imin imax iadd ubit_insert icbits ubit_reverser ubit_reverse ubit_extract ibit_extract inot ixor iand stream_id wave_id cu_id eval_centroid eval_sample_index eval_snapped emit_cut_sream emit_stream cut_stream sample_return_code sample_c_b_ext sample_c_b sample_c_l_ext sample_c_l sample_c_g_ext sample_c_g sample_c_ext sample_c sample_c_lz_ext sample_c_lz sample_l_ext sample_l sample_g_ext sample_g fetch4poc_ext fetch4poc fetch4c_ext fetch4c fetch4po_ext fetch4po fetch4_ext fetch4 sample_b_ext sample_b sample_ext sample samplepos_ext samplepos sampleinfo_ext sampleinfo bufinfo_ext bufinfo resinfo_ext resinfo getlod load_fptr_ext load_fptr load_ext load emitcut emit discard_logicalnz discard_logicalz dcl_resource dcl_global_flags dcl_stream dcl_num_instances dcl_lds_size_per_thread dclarray endphase hs_join_phase hs_fork_phase hs_cp_phase ret_logicalnz ret_logicalz ret_dyn switch whileloop loop_rep loop if_logicalnz if_logicalz ifnz func endloop endif endfunc endmain endswitch else default continue_logicalnz continue_logicalz continuec continue case call_logicalnz call_logicalz callnz call break_logicalnz break_logicalz breakc break none cubemaparray cubemap_plus_w 2d_plus_w 2darraymsaa 2darray 1darray buffer 2dmsaa cubemap No Error Non fragment programs not supported Invalid target architecture Unsupported program type Error in Source binary Error getting encoding count Couldn't find appropriate il binary source Invalid target Error Initializing compiler Memory allocation failure Error Creating program info Error Creating constants Error Creating UAV buffer Error encoding binary Error packing 
binary Invalid architecture Invalid machine type IL_SHADER_PIXEL IL_SHADER_COMPUTE ShaderType = %s TargetChip = %c Parse errors in converting assembly program No Error Reported! Fatal Error: Internal error encountered in back-end! Fatal Error: Back-end out of memory! Fatal Error: Invalid parameters passed to back-end! Fatal Error: Unsupported program construct detected in back-end! Fatal Error: Compilation error reported by back-end! Fatal Error: Invalid operation for this architecture Fatal Error: An unknown error occured in back-end! IL_DBG_TEMPLOC IL_DBG_LINE IL_DBG_STRING IL_OP_BFM IL_OP_BFI IL_OP_STREAM_ID IL_OP_MQSAD_U8 IL_OP_QSAD_U8 IL_OP_MSAD_U8 IL_OP_D_TRIG_PREOP IL_OP_D_DIV_FIXUP IL_OP_D_DIV_FMAS IL_OP_D_DIV_SCALE IL_OP_DIV_FIXUP IL_OP_DIV_FMAS IL_OP_DIV_SCALE IL_OP_SEMAPHORE_WAIT IL_OP_SEMAPHORE_SIGNAL IL_OP_SEMAPHORE_INIT IL_DCL_SEMAPHORE IL_DCL_GWS_THREAD_COUNT IL_OP_U64_MOD IL_OP_I64_MOD IL_OP_U64_DIV IL_OP_I64_DIV IL_OP_I64_SUB IL_OP_WAVE_ID IL_OP_CU_ID IL_OP_SAMPLE_RETURN_CODE IL_OP_D_CLASS IL_OP_CLASS IL_OP_U_MAX3 IL_OP_U_MED3 IL_OP_U_MIN3 IL_OP_I_MAX3 IL_OP_I_MED3 IL_OP_I_MIN3 IL_OP_MAX3 IL_OP_MED3 IL_OP_MIN3 IL_OP_FTOI_FLR IL_OP_FTOI_RPI IL_OP_UTOD IL_OP_ITOD IL_OP_DTOU IL_OP_DTOI IL_OP_D_FREXP_MANT IL_OP_D_FREXP_EXP IL_OP_FREXP_MANT IL_OP_FREXP_EXP IL_OP_LDEXP IL_OP_U64_MUL IL_OP_I64_MUL IL_OP_F_2_F16_PLUS_INF IL_OP_F_2_F16_NEG_INF IL_OP_F_2_F16_NEAR IL_OP_LDS_READ_MSKOR IL_OP_LDS_MSKOR IL_OP_LDS_READ_DEC IL_OP_LDS_READ_INC IL_OP_LDS_DEC IL_OP_LDS_INC IL_OP_U_MUL24_HIGH IL_OP_I_MUL24_HIGH IL_OP_DCL_TYPELESS_UAV IL_OP_DCL_TYPED_UAV IL_OP_U64_SHR IL_OP_U64_MIN IL_OP_U64_MAX IL_OP_U64_LT IL_OP_U64_GE IL_OP_I64_SHR IL_OP_I64_SHL IL_OP_I64_NEGATE IL_OP_I64_NE IL_OP_I64_MIN IL_OP_I64_MAX IL_OP_I64_LT IL_OP_I64_GE IL_OP_I64_EQ IL_OP_I64_ADD IL_OP_UAV_SHORT_STORE IL_OP_UAV_BYTE_STORE IL_OP_UAV_USHORT_LOAD IL_OP_UAV_UBYTE_LOAD IL_OP_UAV_SHORT_LOAD IL_OP_UAV_BYTE_LOAD IL_OP_LDS_STORE_SHORT IL_OP_LDS_STORE_BYTE IL_OP_LDS_LOAD_USHORT IL_OP_LDS_LOAD_UBYTE 
IL_OP_LDS_LOAD_SHORT IL_OP_LDS_LOAD_BYTE IL_OP_UAV_READ_UDEC IL_OP_UAV_READ_UINC IL_OP_I_MUL24 IL_OP_I_MAD24 IL_OP_UAV_UDEC IL_OP_UAV_UINC IL_OP_FMA IL_OP_U_MUL24 IL_OP_U_MAD24 IL_OP_GDS_READ_CMP_XCHG IL_OP_GDS_READ_XCHG IL_OP_GDS_READ_MSKOR IL_OP_GDS_READ_XOR IL_OP_GDS_READ_OR IL_OP_GDS_READ_AND IL_OP_GDS_READ_UMAX IL_OP_GDS_READ_UMIN IL_OP_GDS_READ_MAX IL_OP_GDS_READ_MIN IL_OP_GDS_READ_DEC IL_OP_GDS_READ_INC IL_OP_GDS_READ_RSUB IL_OP_GDS_READ_SUB IL_OP_GDS_READ_ADD IL_OP_GDS_CMP_STORE IL_OP_GDS_MSKOR IL_OP_GDS_XOR IL_OP_GDS_OR IL_OP_GDS_AND IL_OP_GDS_UMAX IL_OP_GDS_UMIN IL_OP_GDS_MAX IL_OP_GDS_MIN IL_OP_GDS_DEC IL_OP_GDS_INC IL_OP_GDS_RSUB IL_OP_GDS_SUB IL_OP_GDS_ADD IL_OP_GDS_STORE IL_OP_GDS_LOAD IL_DCL_STRUCT_GDS IL_DCL_GDS IL_OP_PREFIX IL_DCL_MAX_THREAD_PER_GROUP IL_OP_LOAD_FPTR IL_OP_RCP_VEC IL_DCL_GLOBAL_FLAGS IL_DCL_STREAM IL_OP_MACROCALL IL_OP_MACROEND IL_OP_MACRODEF IL_OP_D_RSQ IL_OP_D_RCP IL_OP_D_SQRT IL_OP_D_MOVC IL_OP_D_MOV IL_OP_EVAL_CENTROID IL_OP_EVAL_SAMPLE_INDEX IL_OP_EVAL_SNAPPED IL_OP_F_2_U4 IL_OP_SAD_4 IL_OP_SAD_HI IL_OP_SAD IL_OP_U4LERP IL_OP_BYTE_ALIGN IL_OP_BIT_ALIGN IL_OP_UNPACK3 IL_OP_UNPACK2 IL_OP_UNPACK1 IL_OP_UNPACK0 IL_OP_F16_2_F IL_OP_F_2_F16 IL_OP_DMIN IL_OP_DMAX IL_OP_FETCH4_PO_C IL_OP_FETCH4_PO IL_OP_FETCH4_C IL_OP_BUFINFO IL_OP_U_BIT_INSERT IL_OP_FCALL IL_OP_DCL_INTERFACE_PTR IL_OP_DCL_FUNCTION_TABLE IL_OP_DCL_FUNCTION_BODY IL_DCL_MAX_TESSFACTOR IL_DCL_TS_OUTPUT_PRIMITIVE IL_DCL_TS_PARTITION IL_DCL_TS_DOMAIN IL_OP_ENDPHASE IL_OP_HS_JOIN_PHASE IL_OP_HS_FORK_PHASE IL_OP_HS_CP_PHASE IL_DCL_NUM_INSTANCES IL_DCL_NUM_OCP IL_DCL_NUM_ICP IL_OP_U_BIT_REVERSE IL_OP_U_BIT_EXTRACT IL_OP_I_BIT_EXTRACT IL_OP_I_BORROW IL_OP_I_CARRY IL_OP_I_FIRSTBIT IL_OP_I_COUNTBITS IL_OP_SAMPLE_C_B IL_OP_SAMPLE_C_G IL_OP_SAMPLE_C_L IL_OP_EMIT_THEN_CUT_STREAM IL_OP_EMIT_STREAM IL_OP_CUT_STREAM IL_OP_LDS_READ_CMP_XCHG IL_OP_LDS_READ_XCHG IL_OP_LDS_READ_XOR IL_OP_LDS_READ_OR IL_OP_LDS_READ_AND IL_OP_LDS_READ_UMAX IL_OP_LDS_READ_UMIN IL_OP_LDS_READ_MAX 
IL_OP_LDS_READ_MIN IL_OP_LDS_READ_RSUB IL_OP_LDS_READ_SUB IL_OP_LDS_READ_ADD IL_OP_LDS_CMP IL_OP_LDS_XOR IL_OP_LDS_OR IL_OP_LDS_AND IL_OP_LDS_UMAX IL_OP_LDS_UMIN IL_OP_LDS_MAX IL_OP_LDS_MIN IL_OP_LDS_RSUB IL_OP_LDS_SUB IL_OP_LDS_ADD IL_OP_LDS_STORE IL_OP_LDS_LOAD IL_DCL_STRUCT_LDS IL_DCL_LDS IL_OP_SRV_STRUCT_LOAD IL_OP_SRV_RAW_LOAD IL_OP_DCL_STRUCT_SRV IL_OP_DCL_RAW_SRV IL_OP_APPEND_BUF_CONSUME IL_OP_APPEND_BUF_ALLOC IL_OP_UAV_READ_CMP_XCHG IL_OP_UAV_READ_XCHG IL_OP_UAV_READ_XOR IL_OP_UAV_READ_OR IL_OP_UAV_READ_AND IL_OP_UAV_READ_UMAX IL_OP_UAV_READ_UMIN IL_OP_UAV_READ_MAX IL_OP_UAV_READ_MIN IL_OP_UAV_READ_RSUB IL_OP_UAV_READ_SUB IL_OP_UAV_READ_ADD IL_OP_UAV_CMP IL_OP_UAV_XOR IL_OP_UAV_OR IL_OP_UAV_AND IL_OP_UAV_UMAX IL_OP_UAV_UMIN IL_OP_UAV_MAX IL_OP_UAV_MIN IL_OP_UAV_RSUB IL_OP_UAV_SUB IL_OP_UAV_ADD IL_OP_UAV_ARENA_STORE IL_OP_UAV_ARENA_LOAD IL_OP_DCL_ARENA_UAV IL_OP_UAV_STRUCT_STORE IL_OP_UAV_RAW_STORE IL_OP_UAV_STORE IL_OP_UAV_STRUCT_LOAD IL_OP_UAV_RAW_LOAD IL_OP_UAV_LOAD IL_OP_DCL_STRUCT_UAV IL_OP_DCL_RAW_UAV IL_OP_DCL_UAV IL_OP_LDS_STORE_VEC IL_OP_LDS_LOAD_VEC IL_OP_FENCE IL_OP_LDS_WRITE_VEC IL_OP_LDS_READ_VEC IL_OP_DCL_LDS_SHARING_MODE IL_OP_DCL_LDS_SIZE_PER_THREAD IL_OP_DCL_TOTAL_NUM_THREAD_GROUP IL_OP_DCL_NUM_THREAD_PER_GROUP IL_OP_INIT_SR_HELPER IL_OP_INIT_SR IL_OP_DCL_SHARED_TEMP IL_OP_D_DIV IL_OP_SAMPLEPOS IL_OP_DLT IL_OP_DGE IL_OP_DEQ IL_OP_DNE IL_DCL_PERSIST IL_OP_GETLOD IL_OP_SAMPLEINFO IL_OP_FETCH4 IL_OP_D_MULADD IL_OP_D_FRAC IL_OP_D_LDEXP IL_OP_F_2_D IL_OP_D_2_F IL_OP_D_MUL IL_OP_D_ADD IL_OP_D_FREXP IL_OP_SCATTER IL_OP_INV_MOV IL_OP_DP2 IL_OP_SQRT_VEC IL_OP_COS_VEC IL_OP_SIN_VEC IL_OP_RSQ_VEC IL_OP_ROUND_ZERO IL_OP_ROUND_PLUS_INF IL_OP_ROUND_NEG_INF IL_OP_ROUND_NEAR IL_OP_NE IL_OP_LT IL_OP_LOG_VEC IL_OP_GE IL_OP_EXP_VEC IL_OP_EQ IL_OP_CMOV_LOGICAL IL_OP_AND IL_OP_UTOF IL_OP_ITOF IL_OP_FTOU IL_OP_FTOI IL_OP_U_MUL_HIGH IL_OP_U_MUL IL_OP_U_GE IL_OP_U_LT IL_OP_U_MIN IL_OP_U_MAX IL_OP_U_MAD IL_OP_U_MOD IL_OP_U_DIV IL_OP_U_SHR IL_OP_I_SHR IL_OP_I_SHL 
IL_OP_I_NE IL_OP_I_NEGATE IL_OP_I_LT IL_OP_I_GE IL_OP_I_EQ IL_OP_I_MUL_HIGH IL_OP_I_MUL IL_OP_I_MIN IL_OP_I_MAX IL_OP_I_MAD IL_OP_I_ADD IL_OP_I_XOR IL_OP_I_OR IL_OP_I_NOT IL_OP_SAMPLE_C_LZ IL_OP_SAMPLE_C IL_OP_SAMPLE_L IL_OP_SAMPLE_G IL_OP_SAMPLE_B IL_OP_SAMPLE IL_OP_RESINFO IL_OP_LOAD IL_OP_EMIT_THEN_CUT IL_OP_EMIT IL_OP_DISCARD_LOGICALNZ IL_OP_DISCARD_LOGICALZ IL_OP_CUT IL_DCL_RESOURCE IL_DCL_VPRIM IL_DCL_INPUT IL_DCL_OUTPUT IL_DCL_OUTPUT_TOPOLOGY IL_DCL_ODEPTH IL_DCL_MAX_OUTPUT_VERTEX_COUNT IL_DCL_LITERAL IL_DCL_INPUT_PRIMITIVE IL_DCL_INDEXED_TEMP_ARRAY IL_DCL_CONST_BUFFER IL_OP_RET_LOGICALNZ IL_OP_RET_LOGICALZ IL_OP_RET_DYN IL_OP_SWITCH IL_OP_WHILE IL_OP_IF_LOGICALNZ IL_OP_IF_LOGICALZ IL_OP_ENDINLINEFUNC IL_OP_ENDSWITCH IL_OP_DEFAULT IL_OP_CONTINUE_LOGICALNZ IL_OP_CONTINUE_LOGICALZ IL_OP_CASE IL_OP_CALL_LOGICALNZ IL_OP_CALL_LOGICALZ IL_OP_BREAK_LOGICALNZ IL_OP_BREAK_LOGICALZ IL_OP_DXSINCOS IL_OP_TRC IL_OP_TRANSPOSE IL_OP_TEXWEIGHT IL_OP_TEXLDMS IL_OP_TEXLDD IL_OP_TEXLDB IL_OP_TEXLD IL_OP_TAN IL_OP_SUB IL_OP_SQRT IL_OP_SINCOS IL_OP_SIN IL_OP_SGN IL_OP_SET IL_OP_RSQ IL_OP_RND IL_OP_RET IL_OP_REFLECT IL_OP_RCP IL_OP_PROJECT IL_OP_PRECOMP IL_OP_POW IL_OP_PIREDUCE IL_OP_NRM IL_OP_NOP IL_OP_NOISE IL_OP_MUL IL_OP_MOVA IL_OP_MOV IL_OP_MOD IL_OP_MMUL IL_OP_MIN IL_OP_MEMIMPORT IL_OP_MEMEXPORT IL_OP_MAX IL_OP_MAD IL_OP_LRP IL_OP_LOOP IL_OP_LOGP IL_OP_LOG IL_OP_LOD IL_OP_LN IL_OP_LIT IL_OP_LEN IL_OP_KILL IL_OP_INITV IL_OP_IFNZ IL_OP_IFC IL_OP_FWIDTH IL_OP_FUNC IL_OP_FRC IL_OP_FLR IL_OP_FACEFORWARD IL_OP_EXPP IL_OP_EXP IL_OP_EXN IL_OP_ENDMAIN IL_OP_ENDLOOP IL_OP_ENDIF IL_OP_END IL_OP_ELSE IL_OP_DSY IL_OP_DSX IL_OP_DST IL_OP_DP4 IL_OP_DP3 IL_OP_DP2ADD IL_OP_DIV IL_OP_DIST IL_OP_DET IL_OP_DEFB IL_OP_DEF IL_OP_DCLVOUT IL_OP_DCLV IL_OP_DCLPT IL_OP_DCLPP IL_OP_DCLPIN IL_OP_DCLPI IL_OP_DCLDEF IL_OP_DCLARRAY IL_OP_CRS IL_OP_COS IL_OP_CONTINUEC IL_OP_CONTINUE IL_OP_COMMENT IL_OP_COLORCLAMP IL_OP_CMP IL_OP_CMOV IL_OP_CLG IL_OP_CLAMP IL_OP_CALLNZ IL_OP_CALL IL_OP_BREAKC IL_OP_BREAK 
IL_OP_ATAN IL_OP_ASIN IL_OP_ADD IL_OP_ACOS IL_OP_ABS IL_OP_UNKNOWN ILScanILBinary: Unknown opcode in IL Binary ILScanILBinary: Unsupported opcode for architecture ILScanILBinary: Unsupported opcode ILScanILBinary: Fatal Error: Non constant buffer constant detected param IsMaxNumWavePerSIMD NumWavefrontPerSIMD TotalNumThreadGroup NumThreadPerGroup Slow Fast CsSetupMode NumSharedGprTotal NumSharedGprUser SCENARIO_G SCENARIO_B SCENARIO_A GS_MODE MemExportSize writeMask outputSlot memOffset index STREAM CULL_DIST_ENA7 CULL_DIST_ENA6 CULL_DIST_ENA5 CULL_DIST_ENA4 CULL_DIST_ENA3 CULL_DIST_ENA2 CULL_DIST_ENA1 CULL_DIST_ENA0 CLIP_DIST_ENA7 CLIP_DIST_ENA6 CLIP_DIST_ENA5 CLIP_DIST_ENA4 CLIP_DIST_ENA3 CLIP_DIST_ENA2 CLIP_DIST_ENA1 CLIP_DIST_ENA0 MergeFlags VS_OUT_CCDIST1_VEC_ENA VS_OUT_CCDIST0_VEC_ENA VS_OUT_MISC_VEC_ENA USE_VTX_KILL_FLAG USE_VTX_VIEWPORT_INDX USE_VTX_RENDER_TARGET_INDX USE_VTX_EDGE_FLAG USE_VTX_POINT_SIZE PA_CL_VS_OUT_CNTL R600VSOUTPUT_USE_BEST_MODE R600VSOUTPUT_VECTOR_SEMANTICS R600VSOUTPUT_COMPONENT_SEMANTICS VsOutSemanticMode VS_EXPORT_COUNT SLOT StreamOutStride StreamOutDecls StreamOutEnable UsesPrimId MaxOutputVertexCount MemExportVertexSize VGT_GS_OUT_PRIM_TYPE SampleFreq MaxReductionBufferSize CB_SHADER_CONTROL:bitmap DB:KILL_ENABLE DB:Z_ORDER DB:ALPHA_TO_MASK_DISABLE DB:MASK_EXPORT_ENABLE DB:STENCIL_REF_EXPORT_ENABLE DB:Z_EXPORT_ENABLE SPI:PROVIDE_Z_TO_SPI SPI0:BARYC_SAMPLE_ENA SPI0:POSITION_SAMPLE SPI0:LINEAR_GRADIENT_ENA SPI0:PERSP_GRADIENT_ENA SPI0:BARYC_SAMPLE_CNTL SPI0:PARAM_GEN_ADDR SPI0:PARAM_GEN SPI0:POSITION_ADDR SPI0:POSITION_CENTROID SPI0:POSITION_ENA NumTexStages SPI0:NUM_INTERP TexCubeMaskBits SPI1:GEN_INDEX_PIX_ADDR SPI1:FOG_ADDR SPI1:FRONT_FACE_CHAN SPI1:FRONT_FACE_ADDR SPI1:FRONT_FACE_ENA SPI1:FIXED_PT_POSITION_ADDR SPI1:FIXED_PT_POSITION_ENA SPI1:GEN_INDEX_PIX SQ_PGM_EXPORTS_PS:PS_EXPORT_MODE GprPoolSize MaxScratchRegsNeeded SQ_PRM_RESOURCES:PRIME_CACHE_ENABLE SQ_PRM_RESOURCES:FETCH_CACHE_LINES SQ_PGM_RESOURCES:STACK_SIZE 
PGM_END_FETCH PGM_END_ALU PGM_END_CF SQ_PGM_END_FETCH SQ_PGM_END_ALU SQ_PGM_END_CF CodeLen SQ_PGM_RESOURCES:NUM_GPRS NumClauseTemps NumIntrlBConstants NumIntrlIConstants NumIntrlFConstants original IL_Unknown ResourcesAffectAlphaOutput fatal flex scanner internal error--no action found fatal flex scanner internal error--end of buffer missed input buffer overflow, can't enlarge buffer because scanner uses REJECT input in flex scanner failed out of dynamic memory in yy_create_buffer() out of dynamic memory in yy_scan_buffer() out of dynamic memory in yy_scan_bytes() bad buffer in yy_scan_bytes() out of memory expanding start-condition stack start-condition stack underflow RegSel FloatComment Register RelMode VecReg VecDstWriteMask RegisterAbs RegisterNeg SrcReg SrcRegList DestReg ALUOpcode0 ALUOpcode ALUProperty ALUProperties2 ALUProperties OutputMod ScalarOp ScalarOps ALUInstBlock ALUInst ALUClause TexParam TexParams TexParamsOpt TexInst TexOpcode VtxFetchDst VtxFetchPropOpt VtxFetchPropsOpt2 VtxFetchPropsOpt VtxFetchOpcode VtxFetchConstOpt VtxInst VtxClause TexClause VecSwiz1 PastSwizzle ExpectSwizzle VecSwiz cache CFPropertiesOpt CFPropListOpt2 CFPropListOpt CFExportInst CFTexInst CFVtxInst CFALUInst CFLoopInst CFInst vecptr1 cfmem vec_ptr cnd_kind CFClauseInst CFInstruction CFProgram HeaderItem PinPropOpt PinPropListOpt2 PinPropListOpt FooterItem FooterList HeaderList StartCopy CopyShader SHDissassembly VTX_FETCHTYPE VTX_WHOLE_QUAD VTX_NUM_FORMAT VTX_CONST_BUF VTX_SRF_MODE VTX_FORMAT_COMP VTX_ENDIAN_SWAP REG_INDEX VTX_FORMAT VTX_OFFSET FLOAT_SPECIAL FLOAT_LITERAL SR_REG L_BANK_SWIZZLE PS_REG FETCH_CONST SEM_ID PV_REG INT_REG CFILE_REG GPR_REG INTEGER_LITERAL L_COUNT CF_POPCNT CF_INST_BREAK CF_INST_ENDREP CF_INST_REP CF_INST_CALL CF_EXPORT_ESIZE CF_EXPORT_BRSTCNT SCALAR_ASSIGNMENT COORD_TYPE L_SAMPLER_ID L_RESOURCE_ID VTX_OPCODE TEX_OPCODE_NO_SRC TEX_OPCODE CHAN ALU_OPCODE0 ALU_OPCODE CF_EXPORT CF_CMD_IND CF_CMD CF_MEM CF_VTX CF_TEX CF_ALU L_CF_INST CF_JUMP 
L_CF_CONST CF_POP L_CALL_COUNT L_USES_WATERFALL CND_KIND1 L_ZOFFSET L_YOFFSET L_XOFFSET OMOD_D2 OMOD_M4 OMOD_M2 MINIFETCH MEGAFETCH L_VALID_PIXEL_MODE L_WHOLE_QUAD_MODE L_END_OF_PROGRAM L_FOGMERGE WRITE_MASK_INVERT L_UPDATE_PRED UPDATE_EXEC_MASK ALU_CLAMP L_KCACHE L_CB NO_BARRIER VEC_SWIZ EXPORT_REG CF_ADDR CF_COUNT L_KC L_VEC_PTR L_ARRAY_SIZE L_LINEAR L_CENTROID L_SAMPLE L_FLAT L_DEFAULT_VAL L_Usage V_REG L_RES_AFFECT_ALPHA L_CHAR L_TARGET_CHIP L_SHADER_TYPE L_IN L_ENABLE L_CBOUTPUT L_STREAM_STRIDE L_WRITE L_OUTPUTSLOT L_MEMOFFSET L_INDEX L_STREAM L_EsrcTypeCB L_EsrcLoop L_EsrcType_int_const L_START_COPY_SHADER L_VOUT L_ORIGINAL L_VIN L_CHDR L_DEP L_PHDR L_VHDR L_GHDR $undefined. error Miss expected %x got %x Starting parse Entering state %d Reading a token: Now at end of input. Next token is %d (%s Shifting token %d (%s), Reducing via rule %d (line %d), -> %s Special constant %f not supported! state stack now parse error Discarding token %d (%s). Error: state stack now Shifting error token, parser stack overflow Error: R600Asm(%d): parse error xVgi `Wgi REGTYPE_UNSET STENCIL_OP SAMPLE_RETURN_CODE LINE_STIPPLE TIMER NEW_PRIM_MASK_PIXEL NEW_PRIM_MASK_PIXQUAD LDS_PARAM_BASE Coverage_Mask EZGE EZLE LOAD_STORE_OFFSET THIS GS_INSTANCE_ID PHASE_IID OCP_ID DOMAIN BARY_COORD LDS_PARAM LDS_Q AC_MASK SIMD_ID RBUF CF_INDEX TF_BUF T_BIDF T_BID A_TIDF A_TID I_TIDF I_TID EPSFOG OMSK kc_al indexed_cb Call_RSC PRED M_RSC P_RSC IC_RSC ADDR LOOP PIVO PRIMT PRIMC GRAD SPRITE FACE INFOG INC1 INC0 INTEX QUAD PRIM BARY_HOS p=!i cb_flt-cb_flt cb_int-cb_int cb_int-lit cb_flt cb_int loop_bound65k loop_bound255 lod_bias kernel_size boolean_set constbuf_handle memexp0 memexp1 memexp2 memexp3 vertex fog factor adj shadow_fail viewport_z_far_plus_near viewport_bias_y viewport_bias_x viewport_z_far_minus_near viewport_scale_height_half viewport_scale_width_half tex_height_inv tex_width_inv src_bool src_float src_int bool 4.3f 0.5, 0.5, 0.5, 0.5 1.0, 1.0, 1.0, 1.0 0.0, 0.0, 0.0, 0.0 BARRIER 5%i 
SIMPLE ENTRY EXIT LOOP_FOOTER POST_LOOP_FOOTER IF_HEADER IF_HEADER_S IF_FOOTER IF_FOOTER_S JUMP_TABLE JUMP_TABLE_FOOTER BREAK CONTINUE INLINE_FUNC_END CALL_BLOCK REP_HEADER LOOP_HEADER 5%i 5%i 5%i 5%i 5%i 5%i 5%i 5%i

                            • Can I use BFI_INT directly from IL ?
                              Alice_Sunny
                              Wow. so interesting~
                              • Can I use BFI_INT directly from IL ?
                                MicahVillmow
                                Thanks for the feedback. I will forward your request to the proper people.
                                • Can I use BFI_INT directly from IL ?
                                  MicahVillmow
                                  As far as I know, this instruction support should be in 11.9, maybe even in 11.8, but the IL doc won't be updated until SDK 2.6 timeframe.

                                  Also, the optimization should be enabled in SDK 2.6.
                                    • Can I use BFI_INT directly from IL ?
                                      gat3way

                                      Yes, it is present in 11.9, thanks :)

                                      BTW as a side question, there is a BFE_UINT optimization now which is basically good but for some reason it generates an additional MOV instruction when it indexes an element in local memory, e.g:

                                      a = tableinlocalmemory[(X>> 2)&0x3f];

                                      would generate BFE_UINT and MOV. It's slower than what we had before that optimization was implemented, where we had just shr+and instead of bfe_uint + mov. 

                                      I could write a simplified test-case, but I guess it's not hard to reproduce.

                                      • Can I use BFI_INT directly from IL ?
                                        hashman

                                         

                                        Originally posted by: MicahVillmow As far as I know, this instruction support should be in 11.9, maybe even in 11.8, but the IL doc won't be updated until SDK 2.6 timeframe. Also, the optimization should be enabled in SDK 2.6.


                                        Well that is good to hear.  I will do some research.  

                                         

                                        Just to clarify when you say "the optimization should be enabled in SDK 2.6" you mean BFI_INT will be accessible via OpenCL code?  Via OpenCL bitselect() function?  Via an extension in cl_amd_media_ops?

                                        If so then that is even better news.  BFI_INT is an amazingly powerful instruction with many cryptographic uses.  From AMD's perspective it is also a competitive advantage over NVIDIA, which has no equivalent instruction, so similar code there takes 3 ops versus 1.

                                      • Can I use BFI_INT directly from IL ?
                                        MicahVillmow
                                        The instruction is called BFI in IL. I've been working on optimizing for certain patterns.
                                        One is (A & C) | (B & ~C)
                                        Another is (A & C) | (B & (C ^ -1))

                                        Those are the ones that I am confident will make it in 2.6. I got pulled onto some more pressing matters so I can't add any more patterns at this time.
                                        Another pattern that I was working on is (C ^ (A & (B ^ C))).

                                        I've notified the library person to see if he has time to take advantage of BFI for 2.6.
                                        [Update] 2.6 will use BFI for bitselect.
                                          • Can I use BFI_INT directly from IL ?
                                            corry

                                            So what am I doing wrong then that would be having the compiler return parse error near b when I put bfi in my code? 

                                            When I go into Kernel Analyzer options, it has an option for CAL version; "Use Latest Available (CAL 11.7) - v.157.2913" is selected.  Is Kernel Analyzer just not picking up the latest for some reason?  I tried all of the following in my code anyway, just hoping to guess...

                                            bfi r0, r1, r2
                                            bfi r0, r1, r2, r3
                                            ibit_bfi r0, r1, r2
                                            ibit_bfi r0, r1, r2, r3
                                            ubit_bfi r0, r1, r2
                                            ubit_bfi r0, r1, r2, r3

                                            Nothing worked :(  I tried an uninstall of all AMD software from the system, and a reinstall, same version shows up in KernelAnalyzer.  Do I need to manually remove?  What am I doing wrong? 

                                            Again, as usual, thanks for helping us low level people out as well!

                                              • Can I use BFI_INT directly from IL ?
                                                hashman

                                                 

                                                Originally posted by: corry So what am I doing wrong then that would be having the compiler return parse error near b when I put bfi in my code? 

                                                 

                                                When I go into Kernel Analyzer options, it has an option for CAL version; "Use Latest Available (CAL 11.7) - v.157.2913" is selected.  Is Kernel Analyzer just not picking up the latest for some reason?  I tried all of the following in my code anyway, just hoping to guess...



                                                The issue seems to be CAL 11.7.  The latest CAL is 11.9.  Not sure why KA thinks the latest version is 11.7.  Technically KA is correct: there is no BFI_INT in 11.7, so the code can't compile.

                                                My KA also only shows 11.7 = "latest".  I tried downloading and reinstalling SDK 2.5 with the same outcome.  Is CAL support in Kernel Analyzer hard-coded to the CAL version at the time of the SDK release?  Since the latest CAL when SDK 2.5 was released was 11.7, will it only see 11.7?

                                            • Can I use BFI_INT directly from IL ?
                                              MicahVillmow
                                              This instruction is not in CAL 11.7. You need to make sure that you are using CAL 11.9 from Catalyst 11.9.
                                                • Can I use BFI_INT directly from IL ?
                                                  corry

                                                  So I take it then I have to wait for 2.6 for KernelAnalyzer to be able to make use of it then?  Seems it uses dlls in C:\Program Files (x86)\Common Files\AMD\GPU ShaderAnalyzer called GPUShaderAnalyzer_CAL_11_7.dll, GPUShaderAnalyzer_CAL_11_6.dll, etc 

                                                  I fired up depends, and it never showed KernelAnalyzer loading aticalrt.dll, so I take it those other DLLs are it.  I checked, and they aren't simply renamed aticalrt.dll files, so given there's no 11_8 or 11_9 on my system, I take it I'm SOL there.

                                                  My program, however, loads aticalrt and aticalcl, which in the SysWOW64 and system32 directories are listed as version 6.14.10.1546, but details also show it was compiled 9/8/2011 at 1:09pm

                                                  That's all I have... what's wrong?!  Should I uninstall, delete everything ATI/AMD that I can find, kill all registry entries with amd or ati, and reinstall?  Or is there just some other separate download I am missing?

                                                • Can I use BFI_INT directly from IL ?
                                                  MicahVillmow
                                                  corry,
                                                  Please post a query here: http://forums.amd.com/forum/ca...m?catid=347&zb=4687012 and maybe the Dev tools team can help.
                                                    • Can I use BFI_INT directly from IL ?
                                                      corry

                                                      I can see posting there for KernelAnalyzer, but I seem to be using the latest aticalcl compiler DLL, and still cannot compile with bfi included in the source.  These seem to be 2 separate issues.  The ISA docs seem to say there should be 3 source operands and 1 dest, so bfi r0, r1, r2, r3 should work, yet all I get is the standard, useless, annoying "Failed to compile program with IL front-end compiler"...  I'm uninstalling and manually deleting in the hope of fixing this...  Interesting to note: after uninstall, C:\windows\system32\aticalrt64.dll still exists... hmmm...

                                                        • Can I use BFI_INT directly from IL ?
                                                          corry

                                                          I completely uninstalled, deleted those files, installed Catalyst 11.9, and verified the associated DLLs' version numbers.  They were identical to what I had before.

                                                          Has anyone seen bfi in their IL code, and had the cal compiler accept it, or is this another case of it works on our internal versions? 

                                                          On a separate (but sorta related) note, is there any way to get error messages like the ones SKA gives from CAL, or do they get them from ILAssembler.dll?  I'd sure like to know what the complaint is, something better than "fatal error: failed to compile"...  That always leaves me thinking, "Gee, really?  I think I figured out that part already!" 

                                                            • Can I use BFI_INT directly from IL ?
                                                              corry

                                                              Micah/gat3way, I am going to go ahead and call shenanigans on this.  I just tried from OpenCL.  I'll post the kernels and you tell me why I see no BFI (unless, like I said, this is some sort of shenanigans....)

                                                               

                                                              //OpenCL Kernel Below.... //tried with and without this... //#pragma OPENCL EXTENSION cl_amd_media_ops : enable __kernel void Junk(__global unsigned int * output, __global unsigned int * input, const unsigned int multiplier) { uint tid = get_global_id(0); __global uint* mySpot=tid*8; uint t1, t2, t3, t4, t5; t1=input[3]; t2=input[8]; t2=input[14]; t5=(t1 & t2) | (t3 & ~t2) ; t4=bitselect(t1, t2, t3); mySpot[tid] = t4; mySpot[tid+1]=t5; } //Resulting IL below.... mdef(16383)_out(1)_in(2) mov r0, in0 mov r1, in1 div_zeroop(infinity) r0.x___, r0.x, r1.x mov out0, r0 mend il_cs_2_0 dcl_cb cb0[10] ; Constant buffer that holds ABI data dcl_literal l0, 4, 1, 2, 3 dcl_literal l1, 0x00FFFFFF, -1, -2, -3 dcl_literal l2, 0x0000FFFF, 0xFFFFFFFE,0x000000FF,0xFFFFFFFC dcl_literal l3, 24, 16, 8, 0xFFFFFFFF dcl_literal l4, 0xFFFFFF00, 0xFFFF0000, 0xFF00FFFF, 0xFFFF00FF dcl_literal l5, 0, 4, 8, 12 dcl_literal l6, 32, 32, 32, 32 dcl_literal l7, 24, 31, 16, 31 call 1024;$ endmain func 1024 ; __OpenCL_Junk_kernel mov r1013, cb0[8].x mov r1019, l1.0 dcl_max_thread_per_group 256 dcl_raw_uav_id(11) dcl_arena_uav_id(8) mov r0.z, vThreadGrpIdFlat.x mov r1022.xyz0, vTidInGrp.xyz mov r1023.xyz0, vThreadGrpId.xyz imad r1021.xyz0, r1023.xyz0, cb0[1].xyz0, r1022.xyz0 iadd r1021.xyz0, r1021.xyz0, cb0[6].xyz0 iadd r1023.xyz0, r1023.xyz0, cb0[7].xyz0 mov r1023.w, r0.z ishl r1023.w, r1023.w, l0.z mov r1018.x, l0.0 udiv r1024.xyz, r1021.xyz, cb0[10].xyz imad r1025.xyz, r1023.xyz, cb0[1].xyz, r1022.xyz dcl_literal l9, 0x00000002, 0x00000002, 0x00000002, 0x00000002; f32:i32 2 dcl_literal l10, 0x00000003, 0x00000003, 0x00000003, 0x00000003; f32:i32 3 dcl_literal l13, 0x00000004, 0x00000004, 0x00000004, 0x00000004; f32:i32 4 dcl_literal l12, 0x0000000c, 0x0000000c, 0x0000000c, 0x0000000c; f32:i32 12 dcl_literal l11, 0x00000038, 0x00000038, 0x00000038, 0x00000038; f32:i32 56 dcl_cb cb1[3] ; Kernel arg setup: output mov r1, cb1[0] ; Kernel arg 
setup: input mov r2, cb1[1] ; Kernel arg setup: multiplier mov r3, cb1[2] call 1027 ; Junk ret endfunc ; __OpenCL_Junk_kernel ;ARGSTART:__OpenCL_Junk_kernel ;version:2:0:74 ;device:cayman ;uniqueid:1024 ;memory:hwprivate:0 ;memory:hwregion:0 ;memory:hwlocal:0 ;pointer:output:i32:1:1:0:uav:8:8 ;pointer:input:i32:1:1:16:uav:11:8 ;value:multiplier:i32:1:1:32 ;function:1:1027 ;uavid:11 ;ARGEND:__OpenCL_Junk_kernel func 1027 ; Junk ; @__OpenCL_Junk_kernel ; BB#0: ; %entry mov r254, r1021.xyz0 mov r254, r254.x000 mov r255, l9.xxxx ishl r255.x___, r254.xxxx, r255.xxxx mov r256, l10.xxxx ishl r254.x___, r254.xxxx, r256.xxxx iadd r254.x___, r254.xxxx, r255.xxxx mov r255, l11.xxxx iadd r255.x___, r2.xxxx, r255.xxxx mov r1010.x___, r255.xxxx uav_raw_load_id(11)_cached r1011.x___, r1010.xxxx mov r255.x___, r1011.xxxx mov r256, l12.xxxx iadd r253.x___, r2.xxxx, r256.xxxx mov r1010.x___, r253.xxxx uav_raw_load_id(11)_cached r1011.x___, r1010.xxxx mov r253.x___, r1011.xxxx mov r1011.x___, r253.xxxx mov r1010.x___, r254.xxxx uav_arena_store_id(8)_size(dword) r1010.x, r1011.x iand r253.x___, r255.xxxx, r253.xxxx mov r255, l13.xxxx iadd r254.x___, r254.xxxx, r255.xxxx mov r1011.x___, r253.xxxx mov r1010.x___, r254.xxxx uav_arena_store_id(8)_size(dword) r1010.x, r1011.x ret endfunc ; Junk ;ARGSTART:Junk ;uniqueid:1027 ;memory:hwregion:0 ;memory:hwlocal:0 ;ARGEND:Junk end //Resulting isa below.... 
ShaderType = IL_SHADER_COMPUTE TargetChip = c ; ------------- SC_SRCSHADER Dump ------------------ SC_SHADERSTATE: u32NumIntVSConst = 0 SC_SHADERSTATE: u32NumIntPSConst = 0 SC_SHADERSTATE: u32NumIntGSConst = 0 SC_SHADERSTATE: u32NumBoolVSConst = 0 SC_SHADERSTATE: u32NumBoolPSConst = 0 SC_SHADERSTATE: u32NumBoolGSConst = 0 SC_SHADERSTATE: u32NumFloatVSConst = 0 SC_SHADERSTATE: u32NumFloatPSConst = 0 SC_SHADERSTATE: u32NumFloatGSConst = 0 fConstantsAvailable = 0 iConstantsAvailable = 0 bConstantsAvailable = 0 u32SCOptions[0] = 0x01A00000 SCOption_IGNORE_SAMPLE_L_BUG SCOption_FLOAT_DO_NOT_DIST SCOption_FLOAT_DO_NOT_REASSOC u32SCOptions[1] = 0x00202000 SCOption_R600_ERROR_ON_DOUBLE_MEMEXP SCOption_SET_VPM_FOR_SCATTER u32SCOptions[2] = 0x00020041 SCOption_R800_UAV_NONARRAY_FIXUP SCOption_R800_UAV_NONUAV_SYNC_WORKAROUND_BUG216513_1 SCOption_R900_BRANCH_IN_NESTED_LOOPS_WORKAROUND_BUG281276 ; -------- Disassembly -------------------- 00 ALU: ADDR(32) CNT(10) KCACHE0(CB1:0-15) KCACHE1(CB0:0-15) 0 y: ADD_INT ____, KC0[1].x, 56 z: ADD_INT ____, KC0[1].x, 12 1 y: LSHR R0.y, PV0.z, 2 w: LSHR R0.w, PV0.y, 2 2 x: MULLO_INT R1.x, R1.x, KC1[1].x y: MULLO_INT ____, R1.x, KC1[1].x z: MULLO_INT ____, R1.x, KC1[1].x w: MULLO_INT ____, R1.x, KC1[1].x 01 TEX: ADDR(64) CNT(2) 3 VFETCH R2.x___, R0.w, fc153 FETCH_TYPE(NO_INDEX_OFFSET) 4 VFETCH R3.x___, R0.y, fc153 FETCH_TYPE(NO_INDEX_OFFSET) 02 ALU: ADDR(42) CNT(11) KCACHE0(CB0:0-15) 5 w: ADD_INT ____, R0.x, R1.x 6 x: AND_INT R0.x, R2.x, R3.x z: ADD_INT ____, PV5.w, KC0[6].x 7 y: LSHL ____, PV6.z, 2 w: LSHL ____, PV6.z, 3 8 z: ADD_INT ____, PV7.y, PV7.w 9 x: LSHR R2.x, PV8.z, 2 10 x: MOV R3.x, R3.x y: MOV R3.y, R0.x VEC_120 03 MEM_RAT_CACHELESS_STORE_DWORD__NI: RAT(8)[R2].xy__, R3, ARRAY_SIZE(4) MARK VPM 04 END END_OF_PROGRAM ; ----------------- CS Data ------------------------ ; Input Semantic Mappings ; No input mappings GprPoolSize = 0 CodeLen = 544;Bytes PGM_END_CF = 0; words(64 bit) PGM_END_ALU = 0; words(64 bit) PGM_END_FETCH = 0; 
words(64 bit) MaxScratchRegsNeeded = 0 ;AluPacking = 0.0 ;AluClauses = 0 ;PowerThrottleRate = 0.0 ; texResourceUsage[0] = 0x00000000 ; texResourceUsage[1] = 0x00000000 ; texResourceUsage[2] = 0x00000000 ; texResourceUsage[3] = 0x00000000 ; texResourceUsage[4] = 0x00000000 ; texResourceUsage[5] = 0x00000000 ; texResourceUsage[6] = 0x00000000 ; texResourceUsage[7] = 0x00000000 ; fetch4ResourceUsage[0] = 0x00000000 ; fetch4ResourceUsage[1] = 0x00000000 ; fetch4ResourceUsage[2] = 0x00000000 ; fetch4ResourceUsage[3] = 0x00000000 ; fetch4ResourceUsage[4] = 0x00000000 ; fetch4ResourceUsage[5] = 0x00000000 ; fetch4ResourceUsage[6] = 0x00000000 ; fetch4ResourceUsage[7] = 0x00000000 ; texSamplerUsage = 0x00000000 ; constBufUsage = 0x00000000 ResourcesAffectAlphaOutput[0] = 0x00000000 ResourcesAffectAlphaOutput[1] = 0x00000000 ResourcesAffectAlphaOutput[2] = 0x00000000 ResourcesAffectAlphaOutput[3] = 0x00000000 ResourcesAffectAlphaOutput[4] = 0x00000000 ResourcesAffectAlphaOutput[5] = 0x00000000 ResourcesAffectAlphaOutput[6] = 0x00000000 ResourcesAffectAlphaOutput[7] = 0x00000000 ;SQ_PGM_RESOURCES = 0x30000104 SQ_PGM_RESOURCES:NUM_GPRS = 4 SQ_PGM_RESOURCES:STACK_SIZE = 1 SQ_PGM_RESOURCES:PRIME_CACHE_ENABLE = 1 ;SQ_PGM_RESOURCES_2 = 0x000000C0 SQ_LDS_ALLOC:SIZE = 0x00000000 ; RatOpIsUsed = 0x900 ; NumThreadPerGroupFlattened = 256 ; SetBufferForNumGroup = true

                                                        • Can I use BFI_INT directly from IL ?
                                                          MicahVillmow
                                                          corry, looks like in your example, the code that you want to be optimized into a BFI is being optimized away before the BFI pattern can be generated because there is a typo and t3 is never initialized. Once you initialize t3 correctly, BFI gets generated.
                                                            • Can I use BFI_INT directly from IL ?
                                                              corry

                                                              Fixed, no bfi

                                                              You *SURE* this isn't an internal build only?  Or is it architecture specific?  Enabled for evergreens, but not caymans for some reason? 

                                                              Also fixed the pointer....still no BFI...

                                                              //OpenCL Below.... //tried with and without this... #pragma OPENCL EXTENSION cl_amd_media_ops : enable __kernel void Junk(__global unsigned int * output, __global unsigned int * input, const unsigned int multiplier) { uint tid = get_global_id(0); __global uint* mySpot=output+tid*8; uint t1, t2, t3, t4, t5; t1=input[3]; t2=input[8]; t3=input[14]; t5=(t1 & t2) | (t3 & ~t2) ; t4=bitselect(t1, t2, t3); mySpot[tid] = t4; mySpot[tid+1]=t5; } //IL Below.... mdef(16383)_out(1)_in(2) mov r0, in0 mov r1, in1 div_zeroop(infinity) r0.x___, r0.x, r1.x mov out0, r0 mend il_cs_2_0 dcl_cb cb0[10] ; Constant buffer that holds ABI data dcl_literal l0, 4, 1, 2, 3 dcl_literal l1, 0x00FFFFFF, -1, -2, -3 dcl_literal l2, 0x0000FFFF, 0xFFFFFFFE,0x000000FF,0xFFFFFFFC dcl_literal l3, 24, 16, 8, 0xFFFFFFFF dcl_literal l4, 0xFFFFFF00, 0xFFFF0000, 0xFF00FFFF, 0xFFFF00FF dcl_literal l5, 0, 4, 8, 12 dcl_literal l6, 32, 32, 32, 32 dcl_literal l7, 24, 31, 16, 31 call 1024;$ endmain func 1024 ; __OpenCL_Junk_kernel mov r1013, cb0[8].x mov r1019, l1.0 dcl_max_thread_per_group 256 dcl_raw_uav_id(11) dcl_arena_uav_id(8) mov r0.z, vThreadGrpIdFlat.x mov r1022.xyz0, vTidInGrp.xyz mov r1023.xyz0, vThreadGrpId.xyz imad r1021.xyz0, r1023.xyz0, cb0[1].xyz0, r1022.xyz0 iadd r1021.xyz0, r1021.xyz0, cb0[6].xyz0 iadd r1023.xyz0, r1023.xyz0, cb0[7].xyz0 mov r1023.w, r0.z ishl r1023.w, r1023.w, l0.z mov r1018.x, l0.0 udiv r1024.xyz, r1021.xyz, cb0[10].xyz imad r1025.xyz, r1023.xyz, cb0[1].xyz, r1022.xyz dcl_literal l15, 0x00000001, 0x00000001, 0x00000001, 0x00000001; f32:i32 1 dcl_literal l13, 0x00000002, 0x00000002, 0x00000002, 0x00000002; f32:i32 2 dcl_literal l12, 0x00000003, 0x00000003, 0x00000003, 0x00000003; f32:i32 3 dcl_literal l9, 0x0000000c, 0x0000000c, 0x0000000c, 0x0000000c; f32:i32 12 dcl_literal l10, 0x00000020, 0x00000020, 0x00000020, 0x00000020; f32:i32 32 dcl_literal l11, 0x00000038, 0x00000038, 0x00000038, 0x00000038; f32:i32 56 
dcl_literal l14, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff; f32:i32 4294967295 dcl_cb cb1[3] ; Kernel arg setup: output mov r1, cb1[0] ; Kernel arg setup: input mov r2, cb1[1] ; Kernel arg setup: multiplier mov r3, cb1[2] call 1027 ; Junk ret endfunc ; __OpenCL_Junk_kernel ;ARGSTART:__OpenCL_Junk_kernel ;version:2:0:74 ;device:cayman ;uniqueid:1024 ;memory:hwprivate:0 ;memory:hwregion:0 ;memory:hwlocal:0 ;pointer:output:i32:1:1:0:uav:11:8 ;pointer:input:i32:1:1:16:uav:11:8 ;value:multiplier:i32:1:1:32 ;function:1:1027 ;uavid:11 ;ARGEND:__OpenCL_Junk_kernel func 1027 ; Junk ; @__OpenCL_Junk_kernel ; BB#0: ; %entry mov r255, l9.xxxx iadd r255.x___, r2.xxxx, r255.xxxx mov r256, r1021.xyz0 mov r1010.x___, r255.xxxx uav_raw_load_id(11)_cached r1011.x___, r1010.xxxx mov r255.x___, r1011.xxxx mov r257, l10.xxxx iadd r257.x___, r2.xxxx, r257.xxxx mov r1010.x___, r257.xxxx uav_raw_load_id(11)_cached r1011.x___, r1010.xxxx mov r257.x___, r1011.xxxx ixor r258.x___, r257.xxxx, r255.xxxx mov r259, l11.xxxx iadd r253.x___, r2.xxxx, r259.xxxx mov r1010.x___, r253.xxxx uav_raw_load_id(11)_cached r1011.x___, r1010.xxxx mov r253.x___, r1011.xxxx iand r258.x___, r258.xxxx, r253.xxxx ixor r258.x___, r258.xxxx, r255.xxxx mov r256, r256.x000 mov r259, l12.xxxx ishl r259.x___, r256.xxxx, r259.xxxx iadd r256.x___, r259.xxxx, r256.xxxx mov r259, l13.xxxx ishl r260.x___, r256.xxxx, r259.xxxx iadd r260.x___, r1.xxxx, r260.xxxx mov r1011.x___, r258.xxxx mov r1010.x___, r260.xxxx uav_raw_store_id(11) mem.x___, r1010.xxxx, r1011.xxxx iand r255.x___, r257.xxxx, r255.xxxx mov r258, l14.xxxx ixor r257.x___, r257.xxxx, r258.xxxx iand r253.x___, r253.xxxx, r257.xxxx ior r253.x___, r253.xxxx, r255.xxxx mov r255, l15.xxxx iadd r255.x___, r256.xxxx, r255.xxxx ishl r255.x___, r255.xxxx, r259.xxxx iadd r254.x___, r1.xxxx, r255.xxxx mov r1011.x___, r253.xxxx mov r1010.x___, r254.xxxx uav_raw_store_id(11) mem.x___, r1010.xxxx, r1011.xxxx ret endfunc ; Junk ;ARGSTART:Junk ;uniqueid:1027 
;memory:hwregion:0 ;memory:hwlocal:0 ;ARGEND:Junk end //ISA Below.... ShaderType = IL_SHADER_COMPUTE TargetChip = c ; ------------- SC_SRCSHADER Dump ------------------ SC_SHADERSTATE: u32NumIntVSConst = 0 SC_SHADERSTATE: u32NumIntPSConst = 0 SC_SHADERSTATE: u32NumIntGSConst = 0 SC_SHADERSTATE: u32NumBoolVSConst = 0 SC_SHADERSTATE: u32NumBoolPSConst = 0 SC_SHADERSTATE: u32NumBoolGSConst = 0 SC_SHADERSTATE: u32NumFloatVSConst = 0 SC_SHADERSTATE: u32NumFloatPSConst = 0 SC_SHADERSTATE: u32NumFloatGSConst = 0 fConstantsAvailable = 0 iConstantsAvailable = 0 bConstantsAvailable = 0 u32SCOptions[0] = 0x01A00000 SCOption_IGNORE_SAMPLE_L_BUG SCOption_FLOAT_DO_NOT_DIST SCOption_FLOAT_DO_NOT_REASSOC u32SCOptions[1] = 0x00202000 SCOption_R600_ERROR_ON_DOUBLE_MEMEXP SCOption_SET_VPM_FOR_SCATTER u32SCOptions[2] = 0x00020041 SCOption_R800_UAV_NONARRAY_FIXUP SCOption_R800_UAV_NONUAV_SYNC_WORKAROUND_BUG216513_1 SCOption_R900_BRANCH_IN_NESTED_LOOPS_WORKAROUND_BUG281276 ; -------- Disassembly -------------------- 00 ALU: ADDR(32) CNT(13) KCACHE0(CB1:0-15) KCACHE1(CB0:0-15) 0 x: ADD_INT ____, KC0[1].x, 56 y: ADD_INT ____, KC0[1].x, 12 z: ADD_INT ____, KC0[1].x, 32 1 x: LSHR R2.x, PV0.x, 2 y: LSHR R0.y, PV0.z, 2 w: LSHR R0.w, PV0.y, 2 2 x: MULLO_INT R1.x, R1.x, KC1[1].x y: MULLO_INT ____, R1.x, KC1[1].x z: MULLO_INT ____, R1.x, KC1[1].x w: MULLO_INT ____, R1.x, KC1[1].x 01 TEX: ADDR(80) CNT(3) 3 VFETCH R4.x___, R0.w, fc153 FETCH_TYPE(NO_INDEX_OFFSET) 4 VFETCH R3.x___, R0.y, fc153 FETCH_TYPE(NO_INDEX_OFFSET) 5 VFETCH R2.x___, R2.x, fc153 FETCH_TYPE(NO_INDEX_OFFSET) 02 ALU: ADDR(45) CNT(23) KCACHE0(CB0:0-15) KCACHE1(CB1:0-15) 6 y: XOR_INT ____, -1, R3.x w: ADD_INT ____, R0.x, R1.x VEC_021 7 x: AND_INT R3.x, R4.x, R3.x y: AND_INT R0.y, R2.x, PV6.y VEC_201 z: ADD_INT R0.z, PV6.w, KC0[6].x w: XOR_INT ____, R4.x, R3.x 8 x: AND_INT ____, PV7.w, R2.x w: LSHL ____, PV7.z, 3 9 x: XOR_INT R4.x, R4.x, PV8.x z: ADD_INT ____, R0.z, PV8.w 10 x: OR_INT R2.x, R3.x, R0.y y: LSHL ____, PV9.z, 2 w: 
ADD_INT ____, PV9.z, 1 11 z: LSHL ____, PV10.w, 2 w: ADD_INT ____, KC1[0].x, PV10.y 12 x: LSHR R3.x, PV11.w, 2 y: ADD_INT ____, KC1[0].x, PV11.z 13 x: LSHR R0.x, PV12.y, 2 03 MEM_RAT_CACHELESS_STORE_DWORD__NI: RAT(11)[R3].x___, R4, ARRAY_SIZE(4) MARK VPM 04 MEM_RAT_CACHELESS_STORE_DWORD__NI: RAT(11)[R0].x___, R2, ARRAY_SIZE(4) MARK VPM 05 END END_OF_PROGRAM ; ----------------- CS Data ------------------------ ; Input Semantic Mappings ; No input mappings GprPoolSize = 0 CodeLen = 688;Bytes PGM_END_CF = 0; words(64 bit) PGM_END_ALU = 0; words(64 bit) PGM_END_FETCH = 0; words(64 bit) MaxScratchRegsNeeded = 0 ;AluPacking = 0.0 ;AluClauses = 0 ;PowerThrottleRate = 0.0 ; texResourceUsage[0] = 0x00000000 ; texResourceUsage[1] = 0x00000000 ; texResourceUsage[2] = 0x00000000 ; texResourceUsage[3] = 0x00000000 ; texResourceUsage[4] = 0x00000000 ; texResourceUsage[5] = 0x00000000 ; texResourceUsage[6] = 0x00000000 ; texResourceUsage[7] = 0x00000000 ; fetch4ResourceUsage[0] = 0x00000000 ; fetch4ResourceUsage[1] = 0x00000000 ; fetch4ResourceUsage[2] = 0x00000000 ; fetch4ResourceUsage[3] = 0x00000000 ; fetch4ResourceUsage[4] = 0x00000000 ; fetch4ResourceUsage[5] = 0x00000000 ; fetch4ResourceUsage[6] = 0x00000000 ; fetch4ResourceUsage[7] = 0x00000000 ; texSamplerUsage = 0x00000000 ; constBufUsage = 0x00000000 ResourcesAffectAlphaOutput[0] = 0x00000000 ResourcesAffectAlphaOutput[1] = 0x00000000 ResourcesAffectAlphaOutput[2] = 0x00000000 ResourcesAffectAlphaOutput[3] = 0x00000000 ResourcesAffectAlphaOutput[4] = 0x00000000 ResourcesAffectAlphaOutput[5] = 0x00000000 ResourcesAffectAlphaOutput[6] = 0x00000000 ResourcesAffectAlphaOutput[7] = 0x00000000 ;SQ_PGM_RESOURCES = 0x30000105 SQ_PGM_RESOURCES:NUM_GPRS = 5 SQ_PGM_RESOURCES:STACK_SIZE = 1 SQ_PGM_RESOURCES:PRIME_CACHE_ENABLE = 1 ;SQ_PGM_RESOURCES_2 = 0x000000C0 SQ_LDS_ALLOC:SIZE = 0x00000000 ; RatOpIsUsed = 0x800 ; NumThreadPerGroupFlattened = 256 ; SetBufferForNumGroup = true

                                                            • Can I use BFI_INT directly from IL ?
                                                              MicahVillmow
                                                              The optimization to generate BFI will be enabled in 2.6. The IL instruction is there but only available at the CAL level.
                                                                • Can I use BFI_INT directly from IL ?
                                                                  corry

                                                                  Can you give me an example usage of the instruction?  I tried it at the OpenCL level because I could not get it working at the CAL level, and gat3way seemed to say he had success seeing the instruction, but from the post it seemed like the luck was with OpenCL.  I just want to see something that should work and test it on my end.  If you can't do that, could you install a fresh machine and verify that it works with the Catalyst release available on the website?  If I have to I'll blow away my dev box completely, but I don't want to do so needlessly.  I came pretty close to that by uninstalling and deleting manually.  Depends still shows the SysWOW64 and system32 aticalcl DLLs being loaded, and the only place they came from was the 11.9 driver, so I don't know how it could be a configuration issue on my end, but I'm more than open to suggestions!

                                                                  Anyone else want to pipe up with some code that uses bfi and compiles with the 11.9 cal compiler?

                                                                  I tried just bfi as you said, then I figured WTH, and tried adding ibit_ and ubit_, following the pattern of some of the other IL instructions for bit operations, to no avail.  Should it literally be "bfi dst, src0, src1, src2"?

                                                                  And yeah, if you are looking at the time of this post, you are reading the timestamp correctly: it's 2:30am here, and yes, I seem to be fighting insomnia again.  No, posting on the AMD developer forums isn't part of that battle.  It's more of a "tactical regrouping" (read: temporary retreat) so I can continue the fight later :)

                                                                • Can I use BFI_INT directly from IL ?
                                                                  MicahVillmow
                                                                  corry,
                                                                  It is interesting that you are seeing dbg_string and dbg_line but not bfi as bfi was added to the compiler almost a month before those two instructions. Looking at the source for the release drivers, the bfi instruction is there.
                                                                  The format is 'bfi dst, src0, src1, src2'.

                                                                  Micah
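
For reference, here is a minimal Python sketch of what the instruction is expected to compute, assuming the IL 'bfi dst, src0, src1, src2' form uses the same operand roles as the Evergreen BFI_INT definition quoted at the top of the thread (dst = (src1 & src0) | (src2 & ~src0)). The function names `bfi` and `sha1_ch` are illustrative only and not part of any AMD API.

```python
MASK32 = 0xFFFFFFFF  # IL integer registers are 32 bits wide per component


def bfi(src0: int, src1: int, src2: int) -> int:
    """Per-bit select: take src1 where src0 has a 1 bit, src2 where it has a 0 bit.

    Operand order is an assumption taken from the Evergreen BFI_INT description.
    """
    return ((src1 & src0) | (src2 & ~src0)) & MASK32


def sha1_ch(b: int, c: int, d: int) -> int:
    """SHA-1 'choose' function for rounds 0..19, written as a single bfi."""
    return bfi(b, c, d)


if __name__ == "__main__":
    mask = 0x5555AAAA  # an arbitrary mask that ubit_insert cannot express
    print(hex(bfi(mask, 0xFFFFFFFF, 0x00000000)))
    print(hex(bfi(mask, 0x00000000, 0xFFFFFFFF)))
```

With an all-ones src1 and all-zeros src2 the result is the mask itself, and swapping them gives its complement, which is a quick sanity check for any kernel that claims to use the instruction.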
                                                                    • Can I use BFI_INT directly from IL ?
                                                                      corry

                                                                       

                                                                      Originally posted by: MicahVillmow corry, It is interesting that you are seeing dbg_string and dbg_line but not bfi as bfi was added to the compiler almost a month before those two instructions. Looking at the source for the release drivers, the bfi instruction is there. The format is 'bfi dst, src0, src1, src2'. Micah


                                                                      Give it a shot yourself, here's exactly what I've done to test.

                                                                      Download the catalyst 11.9 or 11.10 preview2 drivers from the website.

                                                                      Under cygwin, do the following:

                                                                      Cygwin can't see the 64 bit system32 directory in Vista, so start->search C:\Windows\system32
                                                                      find aticaldd64.dll
                                                                      right click, and select copy
                                                                      browse to C:\
                                                                      Right click, paste
                                                                      cd /cygdrive/c
                                                                      strings aticaldd64.dll > calstrings.txt
                                                                      vim calstrings.txt
                                                                      /bfi
                                                                      Notice the "E486: Pattern not found: bfi" at the bottom of the terminal
                                                                      To check that this is the correct dll to be searching, while still in vim, enter the following:
                                                                      /dbg_string
                                                                      Notice a match is found, and is inside a complete list of IL instructions
                                                                      enter :q! to exit (! just in case you accidentally changed anything)

                                                                      Browse back to the C:\windows\System32 directory, right click on aticaldd64.dll and select "Properties" 
                                                                      Click on Details
                                                                      Verify "File version" reads 6.14.10.1589, "Product Name" reads ATI CAL DD, and "Date modified" reads 10/6/2011 10:52PM
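
The same check can be scripted instead of done by hand with strings and vim. The following is a rough sketch, not a replacement for the real `strings` tool: it extracts runs of printable ASCII from a binary and reports whether given tokens appear as substrings. The helper names `printable_strings` and `find_tokens` and the sample bytes are made up for illustration; on a real system you would read the copied DLL instead.

```python
import re


def printable_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of at least min_len printable ASCII bytes, like a basic `strings`."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def find_tokens(data: bytes, tokens: list[str]) -> dict[str, bool]:
    """Map each token to True if it occurs inside any extracted string."""
    strings = printable_strings(data)
    return {tok: any(tok in s for s in strings) for tok in tokens}


# Synthetic stand-in for a driver binary: contains some IL mnemonics, no bfi.
SAMPLE = b"\x00\x01dbg_string\x00\x02dbg_line\x00ubit_insert\x00\xff\xfe"

if __name__ == "__main__":
    print(find_tokens(SAMPLE, ["bfi", "dbg_string"]))
    # For a real file (path is an example only):
    # with open("aticaldd64.dll", "rb") as f:
    #     print(find_tokens(f.read(), ["bfi", "dbg_string"]))
```

As with the manual search, a missing token only shows that the text mnemonic is absent from that binary, not that the hardware instruction can never be generated.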

                                                                      I don't know what's going on with the build process, but given your insistence that it's in there, and my lack of testing on Linux, perhaps that's the issue?  Maybe the build process on Linux was set to -DENABLE_BFI, and on Windows it isn't?  I think I downloaded the Linux drivers as well, so I could untar them and run strings there...I'll do that in a little bit...

                                                                        • Can I use BFI_INT directly from IL ?
                                                                          corry

                                                                          I was unable to find the preview driver for Linux, at least not when trying from a Windows box, but you said it should have been in 11.9, so I grabbed that, pulled out the tgz data, and untarred it. Then I found libaticaldd.so in arch/x86_64/usr/lib64 and ran strings on it with the same result: no bfi. That library had no dbg_string in it either.

                                                                          For giggles, I tried the 32 bit versions as well.  Same exact results: 11.10/Windows has dbg_string but no bfi, and Linux 11.9 doesn't have dbg_string.

                                                                          So here's to hoping it's a simple fix, and will be available in a few days in a preview 3 or 11.10 release?

                                                                      • Can I use BFI_INT directly from IL ?
                                                                        MicahVillmow
                                                                        airsidelimo,
                                                                        That is one of the patterns that I currently match against. Expect it to be matched in SDK 2.6.
                                                                        • Can I use BFI_INT directly from IL ?
                                                                          MicahVillmow
                                                                          corry,
                                                                          I went back through the sources for what was released. There are two compilers that can generate BFI_INT. The CAL compiler and the OpenCL compiler.
                                                                          The CAL compiler, which does not use the IL text format (and thus 'bfi' won't show up via strings), introduced an optimization to generate BFI_INT in Catalyst 11.7 (SDK 2.5). This is a limited optimization as the rest had not gone in yet. Optimizations to hit more cases were added in 11.9. Again, bfi here is not going to show up in strings, but it will in the ISA (the ISA string token is BFI_INT).
                                                                          In 11.10 release, the CAL compiler will expose bfi to the OpenCL compiler and in SDK 2.6, the OpenCL compiler will take advantage of/generate the bfi instruction.

                                                                          That being said, 11.10 has not been released yet and I'm not sure where the preview release code is coming from.
                                                                            • Can I use BFI_INT directly from IL ?
                                                                              corry

                                                                              It will be in 11.10 you say?  That makes my day :)

                                                                              That said though, I thought earlier you said, and I'm not going to go find the quote, that OpenCL always compiled to IL, not ISA, and that the OpenCL runtime was based on CAL (thus we can expect no new docs, etc, but expect the runtime to remain).  That said, it sounds like you're now saying OpenCL can go straight to ISA, since if bfi was not in the CAL IL compiler, it couldn't be in the OpenCL compiler if the OpenCL compiler only went to IL.  Follow?  Therefore no matter what you did in OpenCL, you'd never see BFI_INT if you dumped your ISA from the device when it ran the kernel.  Make sense?  I take it either something has changed, or I totally misunderstood how the OpenCL compiler generates its code...(probably the latter...)

                                                                              Anyhow, thanks for looking into it.  I won't be using it the way these other guys here are using it, but I believe it will still make a pretty hefty difference!

                                                                               

                                                                            • Can I use BFI_INT directly from IL ?
                                                                              MicahVillmow
                                                                              The OpenCL compiler compiles from CL to IL and then the CAL compiler compiles from IL to ISA. The CAL compiler does optimizations, which is where the bfi is produced prior to OpenCL producing it. So you can see a BFI_INT in the ISA without having it in the IL.
                                                                                • Can I use BFI_INT directly from IL ?
                                                                                  corry

                                                                                  Sad...I'm not seeing it in my ISA :)  But if it will be available from IL in 11.10, I can substitute around it until it's available rather than try to fit a pattern...

                                                                                   

                                                                                    • Can I use BFI_INT directly from IL ?
                                                                                      hazeman

                                                                                      This optimization ( if it works at all ) is most unreliable.

                                                                                      The following IL code with driver 11.9 (Linux, 5850) doesn't produce BFI_INT at all (basic pattern (A&C) | (B&~C)).

                                                                                      iand r39,r2,r4
                                                                                      inot r32,r4
                                                                                      iand r35,r3,r32
                                                                                      ior r44,r39,r35

                                                                                      PS. Micah pls recheck if the bfi is really available.

                                                                                       

                                                                                      il_cs
                                                                                      dcl_num_thread_per_group 128
                                                                                      dcl_raw_uav_id(1)
                                                                                      dcl_raw_uav_id(2)
                                                                                      dcl_literal l0, 0x0, 0x0, 0x0, 0x0
                                                                                      dcl_literal l3, 0x4, 0x0, 0x0, 0x0
                                                                                      dcl_literal l4, 0x8, 0x0, 0x0, 0x0
                                                                                      dcl_literal l2, 0x10, 0x0, 0x0, 0x0
                                                                                      dcl_literal l1, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff
                                                                                      mov r1.x,vAbsTid.x
                                                                                      umul r6.x,l2.x,r1.x
                                                                                      uav_raw_load_id(2)_cached r9.xyzw,r6.x
                                                                                      mov r10.xyzw,r9.xyzw
                                                                                      mov r2,r10
                                                                                      umul r13.x,l2.x,r1.x
                                                                                      iadd r17.x,r13.x,l3.x
                                                                                      uav_raw_load_id(2)_cached r19.xyzw,r17.x
                                                                                      mov r20.xyzw,r19.xyzw
                                                                                      mov r3,r20
                                                                                      umul r23.x,l2.x,r1.x
                                                                                      iadd r27.x,r23.x,l4.x
                                                                                      uav_raw_load_id(2)_cached r29.xyzw,r27.x
                                                                                      mov r30.xyzw,r29.xyzw
                                                                                      mov r4,r30
                                                                                      iand r39,r2,r4 <----
                                                                                      inot r32,r4 <----
                                                                                      iand r35,r3,r32 <----
                                                                                      ior r44,r39,r35 <----
                                                                                      umul r50.x,l2.x,r1.x
                                                                                      uav_raw_store_id(1) mem.xyzw,r50.x,r44
                                                                                      end

                                                                                      ; -------- Disassembly --------------------
                                                                                      00 ALU: ADDR(32) CNT(13)
                                                                                            0  w: LSHL ____, R1.x, 7
                                                                                            1  z: ADD_INT ____, R0.x, PV0.w
                                                                                            2  t: MULLO_UINT ____, 16, PV1.z
                                                                                            3  x: LSHR R3.x, PS2, 2
                                                                                               y: ADD_INT ____, PS2, 8
                                                                                               w: ADD_INT ____, PS2, 4
                                                                                            4  x: LSHR R0.x, PV3.y, 2
                                                                                               z: LSHR R0.z, PV3.w, 2
                                                                                      01 TEX: ADDR(64) CNT(3)
                                                                                            5  VFETCH R2, R0.x, fc0 FORMAT(32_32_32_32_FLOAT) MEGA(16) FETCH_TYPE(NO_INDEX_OFFSET)
                                                                                            6  VFETCH R1, R3.x, fc0 FORMAT(32_32_32_32_FLOAT) MEGA(16) FETCH_TYPE(NO_INDEX_OFFSET)
                                                                                            7  VFETCH R0, R0.z, fc0 FORMAT(32_32_32_32_FLOAT) MEGA(16) FETCH_TYPE(NO_INDEX_OFFSET)
                                                                                      02 ALU: ADDR(45) CNT(16)
                                                                                            8  x: NOT_INT ____, R2.w
                                                                                               y: NOT_INT ____, R2.z
                                                                                               z: NOT_INT ____, R2.y
                                                                                               w: NOT_INT ____, R2.x
                                                                                               t: AND_INT T0.w, R1.x, R2.x
                                                                                            9  x: AND_INT T0.x, R0.w, PV8.x
                                                                                               y: AND_INT T0.y, R0.z, PV8.y
                                                                                               z: AND_INT T1.z, R0.y, PV8.z
                                                                                               w: AND_INT ____, R0.x, PV8.w
                                                                                               t: AND_INT T0.z, R1.y, R2.y
                                                                                           10  x: AND_INT ____, R1.w, R2.w
                                                                                               y: AND_INT ____, R1.z, R2.z
                                                                                               t: OR_INT R1.x, T0.w, PV9.w
                                                                                           11  y: OR_INT R1.y, T0.z, T1.z
                                                                                               z: OR_INT R1.z, PV10.y, T0.y
                                                                                               w: OR_INT R1.w, PV10.x, T0.x
                                                                                      03 MEM_RAT_CACHELESS_STORE_RAW: RAT(1)[R3], R1, ARRAY_SIZE(4) MARK VPM
                                                                                      END_OF_PROGRAM
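
To make the expected rewrite concrete, here is a toy peephole pass in Python that recognises the four marked instructions and replaces the final ior with a single bfi. It is only a sketch of the idea and makes no claim about how AMD's CAL compiler does its matching: it assumes every register is written once (as in the dump above), treats register names as plain strings, assumes the mask goes in src0 of bfi, and leaves the now-unused iand/inot instructions for a separate dead-code pass. The names `parse`, `fuse_select` and `_match` are made up for this example.

```python
def parse(line: str):
    """Turn 'iand r39,r2,r4' into ('iand', 'r39', ('r2', 'r4'))."""
    op, _, rest = line.strip().partition(" ")
    regs = [r.strip() for r in rest.split(",")]
    return (op, regs[0], tuple(regs[1:]))


def _match(pos, neg, defs):
    """If pos = iand(a, c) and neg = iand(b, inot(c)), return (c, a, b)."""
    if not pos or not neg or pos[0] != "iand" or neg[0] != "iand":
        return None
    for b, n in (neg[2], neg[2][::-1]):        # iand is commutative
        inv = defs.get(n)
        if inv and inv[0] == "inot":
            c = inv[2][0]                      # the register being inverted
            for a, m in (pos[2], pos[2][::-1]):
                if m == c:
                    return (c, a, b)
    return None


def fuse_select(instrs):
    """Rewrite ior d, (a & c), (b & ~c) into bfi d, c, a, b (mask first)."""
    defs = {}   # register name -> instruction that defines it
    out = []
    for op, dst, srcs in instrs:
        fused = None
        if op == "ior":
            for x, y in (srcs, srcs[::-1]):    # ior is commutative too
                fused = _match(defs.get(x), defs.get(y), defs)
                if fused:
                    break
        out.append(("bfi", dst, fused) if fused else (op, dst, srcs))
        defs[dst] = (op, dst, srcs)
    return out


if __name__ == "__main__":
    il = ["iand r39,r2,r4", "inot r32,r4", "iand r35,r3,r32", "ior r44,r39,r35"]
    for op, dst, srcs in fuse_select([parse(s) for s in il]):
        print(op, dst + "," + ",".join(srcs))
```

On the sequence above the last instruction becomes bfi r44,r4,r2,r3, which is the single-instruction form the posted disassembly does not show.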

                                                                                  • Can I use BFI_INT directly from IL ?
                                                                                    MicahVillmow
                                                                                    hazeman,
                                                                                    The pattern is where C and ~C are literals. I've already reported that this case is missing, but it has not been fixed yet.
                                                                                      • Can I use BFI_INT directly from IL ?
                                                                                        gat3way

                                                                                        That explains it. In my case, C is a literal in step 1,2 and probably 3 of the MD5 algorithm, so that I was able to see those BFI_INTs. I was rather surprised though.

                                                                                          • Can I use BFI_INT directly from IL ?
                                                                                            corry

                                                                                            <homer_simpson_voice>And that solves the mystery of the missing bfi instruction</homer_simpson_voice><Random_Car_bursts_into_flames_and_explodes/>

                                                                                              • Can I use BFI_INT directly from IL ?
                                                                                                hazeman

                                                                                                Sorry Micah but for literal C it also doesn't work.

                                                                                                I've tested

                                                                                                ----------------------------------------

                                                                                                dcl_literal l5, 0x1, 0x1, 0x1, 0x1

                                                                                                iand r38,r2,l5
                                                                                                inot r33,l5
                                                                                                iand r34,r3,r33
                                                                                                ior r41,r38,r34

                                                                                                -----------------------------------------

                                                                                                and

                                                                                                ------------------------------------------

                                                                                                dcl_literal l5, 0x1, 0x1, 0x1, 0x1

                                                                                                dcl_literal l6, 0xfffffffe, 0xfffffffe, 0xfffffffe, 0xfffffffe

                                                                                                iand r37,r2,l5
                                                                                                iand r33,r3,l6
                                                                                                ior r40,r37,r33

                                                                                                -----------------------------------------

                                                                                                gat3way, could you post the relevant part of the IL code generated from your kernel?

                                                                                                Micah, I think everyone is waiting for the bfi instruction to be available in IL. It would be the best and most reliable solution, so please get it working as fast as you can.
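
Both test cases above describe the same select, because l6 is the bitwise complement of l5 in every component. A short Python sketch of that relationship follows; the helper names are hypothetical, and this only illustrates the condition a matcher would have to check for the two-literal form, not how the driver actually checks it.

```python
MASK32 = 0xFFFFFFFF


def are_complements(lit_a: int, lit_b: int) -> bool:
    """True if the two 32-bit literals are bitwise complements of each other."""
    return ((lit_a ^ lit_b) & MASK32) == MASK32


def literal_vectors_complementary(vec_a, vec_b) -> bool:
    """dcl_literal declares four components; require the property in every lane."""
    return len(vec_a) == len(vec_b) and all(
        are_complements(a, b) for a, b in zip(vec_a, vec_b)
    )


def select_with_inot(a: int, b: int, c: int) -> int:
    """First variant: iand, inot, iand, ior."""
    return ((a & c) | (b & (~c & MASK32))) & MASK32


def select_with_two_literals(a: int, b: int, c: int, not_c: int) -> int:
    """Second variant: the complement is supplied as its own literal."""
    return ((a & c) | (b & not_c)) & MASK32


if __name__ == "__main__":
    l5 = (0x1, 0x1, 0x1, 0x1)
    l6 = (0xFFFFFFFE, 0xFFFFFFFE, 0xFFFFFFFE, 0xFFFFFFFE)
    print(literal_vectors_complementary(l5, l6))
    a, b = 0xDEADBEEF, 0x01234567
    print(select_with_inot(a, b, l5[0]) == select_with_two_literals(a, b, l5[0], l6[0]))
```

If the two literals were not exact complements, the three-instruction form would no longer be a pure select and could not be replaced by a single BFI_INT.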

                                                                                                 

                                                                                                  • Can I use BFI_INT directly from IL ?
                                                                                                    gat3way

                                                                                                    Hello,

                                                                                                    I tried to reproduce that in a simple kernel and I failed. Then I disabled the BFI_INT patching of my kernels, set GPU_DUMP_DEVICE_KERNELS=3 and recompiled them. For some of them I was able to observe BFI_INT generated in the ISA. I don't know if that would be helpful to you, but here is the IL dump of one of them whose ISA contained BFI_INT:

                                                                                                     

                                                                                                     

                                                                                                    mdef(16383)_out(1)_in(2) mov r0, in0 mov r1, in1 div_zeroop(infinity) r0.x___, r0.x, r1.x mov out0, r0 mend il_cs_2_0 dcl_cb cb0[10] ; Constant buffer that holds ABI data dcl_literal l0, 4, 1, 2, 3 dcl_literal l1, 0x00FFFFFF, -1, -2, -3 dcl_literal l2, 0x0000FFFF, 0xFFFFFFFE,0x000000FF,0xFFFFFFFC dcl_literal l3, 24, 16, 8, 0xFFFFFFFF dcl_literal l4, 0xFFFFFF00, 0xFFFF0000, 0xFF00FFFF, 0xFFFF00FF dcl_literal l5, 0, 4, 8, 12 dcl_literal l6, 32, 32, 32, 32 dcl_literal l7, 24, 31, 16, 31 call 1024;$ endmain func 1024 ; __OpenCL_sha1_short_kernel mov r1013, cb0[8].x mov r1019, l1.0 dcl_num_thread_per_group 64, 1, 1 dcl_raw_uav_id(11) dcl_arena_uav_id(8) mov r0.z, vThreadGrpIdFlat.x mov r1022.xyz0, vTidInGrp.xyz mov r1023.xyz0, vThreadGrpId.xyz imad r1021.xyz0, r1023.xyz0, cb0[1].xyz0, r1022.xyz0 iadd r1021.xyz0, r1021.xyz0, cb0[6].xyz0 iadd r1023.xyz0, r1023.xyz0, cb0[7].xyz0 mov r1023.w, r0.z ishl r1023.w, r1023.w, l0.z mov r1018.x, l0.0 dcl_literal l46, 0x00000000, 0x00000000, 0x00000000, 0x00000000; f32:i32 0 dcl_literal l31, 0x00000001, 0x00000001, 0x00000001, 0x00000001; f32:i32 1 dcl_literal l12, 0x00000002, 0x00000002, 0x00000002, 0x00000002; f32:i32 2 dcl_literal l10, 0x00000003, 0x00000003, 0x00000003, 0x00000003; f32:i32 3 dcl_literal l30, 0x00000004, 0x00000004, 0x00000004, 0x00000004; f32:i32 4 dcl_literal l15, 0x00000005, 0x00000005, 0x00000005, 0x00000005; f32:i32 5 dcl_literal l26, 0x00000006, 0x00000006, 0x00000006, 0x00000006; f32:i32 6 dcl_literal l27, 0x00000007, 0x00000007, 0x00000007, 0x00000007; f32:i32 7 dcl_literal l13, 0x00000008, 0x00000008, 0x00000008, 0x00000008; f32:i32 8 dcl_literal l18, 0x00000009, 0x00000009, 0x00000009, 0x00000009; f32:i32 9 dcl_literal l21, 0x0000000a, 0x0000000a, 0x0000000a, 0x0000000a; f32:i32 10 dcl_literal l23, 0x0000000b, 0x0000000b, 0x0000000b, 0x0000000b; f32:i32 11 dcl_literal l14, 0x00000010, 0x00000010, 
0x00000010, 0x00000010; f32:i32 16 dcl_literal l16, 0x00000018, 0x00000018, 0x00000018, 0x00000018; f32:i32 24 dcl_literal l33, 0x0000001b, 0x0000001b, 0x0000001b, 0x0000001b; f32:i32 27 dcl_literal l40, 0x0000001f, 0x0000001f, 0x0000001f, 0x0000001f; f32:i32 31 dcl_literal l11, 0x00000020, 0x00000020, 0x00000020, 0x00000020; f32:i32 32 dcl_literal l52, 0x00000030, 0x00000030, 0x00000030, 0x00000030; f32:i32 48 dcl_literal l53, 0x00000040, 0x00000040, 0x00000040, 0x00000040; f32:i32 64 dcl_literal l54, 0x00000050, 0x00000050, 0x00000050, 0x00000050; f32:i32 80 dcl_literal l55, 0x00000070, 0x00000070, 0x00000070, 0x00000070; f32:i32 112 dcl_literal l29, 0x00000080, 0x00000080, 0x00000080, 0x00000080; f32:i32 128 dcl_literal l56, 0x00000090, 0x00000090, 0x00000090, 0x00000090; f32:i32 144 dcl_literal l17, 0x000000ff, 0x000000ff, 0x000000ff, 0x000000ff; f32:i32 255 dcl_literal l20, 0x00008000, 0x00008000, 0x00008000, 0x00008000; f32:i32 32768 dcl_literal l49, 0x001fffe0, 0x001fffe0, 0x001fffe0, 0x001fffe0; f32:i32 2097120 dcl_literal l50, 0x003fffc0, 0x003fffc0, 0x003fffc0, 0x003fffc0; f32:i32 4194240 dcl_literal l51, 0x005fffa0, 0x005fffa0, 0x005fffa0, 0x005fffa0; f32:i32 6291360 dcl_literal l22, 0x00800000, 0x00800000, 0x00800000, 0x00800000; f32:i32 8388608 dcl_literal l24, 0x00ff00ff, 0x00ff00ff, 0x00ff00ff, 0x00ff00ff; f32:i32 16711935 dcl_literal l38, 0x10325476, 0x10325476, 0x10325476, 0x10325476; f32:i32 271733878 dcl_literal l44, 0x31a7e4d7, 0x31a7e4d7, 0x31a7e4d7, 0x31a7e4d7; f32:i32 833086679 dcl_literal l36, 0x5a827999, 0x5a827999, 0x5a827999, 0x5a827999; f32:i32 1518500249 dcl_literal l35, 0x5c8dbeee, 0x5c8dbeee, 0x5c8dbeee, 0x5c8dbeee; f32:i32 1552793326 dcl_literal l34, 0x67452301, 0x67452301, 0x67452301, 0x67452301; f32:i32 1732584193 dcl_literal l41, 0x6ed9eba1, 0x6ed9eba1, 0x6ed9eba1, 0x6ed9eba1; f32:i32 1859775393 dcl_literal l25, 0x80000000, 0x80000000, 0x80000000, 0x80000000; f32:i32 2147483648 dcl_literal l42, 0x8f1bbcdc, 0x8f1bbcdc, 0x8f1bbcdc, 
0x8f1bbcdc; f32:i32 2400959708 dcl_literal l39, 0x98badcfe, 0x98badcfe, 0x98badcfe, 0x98badcfe; f32:i32 2562383102 dcl_literal l45, 0xba306d5f, 0xba306d5f, 0xba306d5f, 0xba306d5f; f32:i32 3123735903 dcl_literal l47, 0xc3d2e1f0, 0xc3d2e1f0, 0xc3d2e1f0, 0xc3d2e1f0; f32:i32 3285377520 dcl_literal l43, 0xca62c1d6, 0xca62c1d6, 0xca62c1d6, 0xca62c1d6; f32:i32 3395469782 dcl_literal l37, 0xefcdab89, 0xefcdab89, 0xefcdab89, 0xefcdab89; f32:i32 4023233417 dcl_literal l19, 0xff000000, 0xff000000, 0xff000000, 0xff000000; f32:i32 4278190080 dcl_literal l28, 0xff0000ff, 0xff0000ff, 0xff0000ff, 0xff0000ff; f32:i32 4278190335 dcl_literal l32, 0xff00ff00, 0xff00ff00, 0xff00ff00, 0xff00ff00; f32:i32 4278255360 dcl_literal l48, 0xffffe000, 0xffffe000, 0xffffe000, 0xffffe000; f32:i32 4294959104 dcl_cb cb1[10] ; Kernel arg setup: dst mov r1, cb1[0] ; Kernel arg setup: input mov r2, cb1[1] ; Kernel arg setup: size mov r3, cb1[2] ; Kernel arg setup: chbase mov r4, cb1[3] mov r5, cb1[4] ; Kernel arg setup: found_ind mov r6, cb1[5] ; Kernel arg setup: bitmaps mov r7, cb1[6] ; Kernel arg setup: found mov r8, cb1[7] ; Kernel arg setup: table mov r9, cb1[8] ; Kernel arg setup: singlehash mov r10, cb1[9] call 1028 ; sha1_short ret endfunc ; __OpenCL_sha1_short_kernel ;ARGSTART:__OpenCL_sha1_short_kernel ;version:2:0:68 ;device:redwood ;uniqueid:1024 ;memory:hwprivate:0 ;memory:hwregion:0 ;memory:hwlocal:0 ;cws:64:1:1 ;pointer:dst:i32:1:1:0:uav:11:32 ;value:input:i32:4:1:16 ;value:size:i32:1:1:32 ;value:chbase:i32:8:1:48 ;pointer:found_ind:i32:1:1:80:uav:11:8 ;pointer:bitmaps:i32:1:1:96:uav:11:8 ;pointer:found:i32:1:1:112:uav:11:8 ;pointer:table:i32:1:1:128:uav:11:8 ;value:singlehash:i32:4:1:144 ;function:1:1028 ;uavid:11 ;ARGEND:__OpenCL_sha1_short_kernel func 1028 ; sha1_short ; @__OpenCL_sha1_short_kernel ; BB#0: ; %entry mov r254, r10 mov r256, r8 mov r257, r7 mov r258, r6 mov r259, r5 mov r260, r4 mov r261, r3 mov r262, r2 mov r263, r1 mov r264, l10.xxxx ishl r264.x___, r261.xxxx, 
r264.xxxx mov r265, l11.xxxx iadd r264.x___, r264.xxxx, r265.xxxx mov r265, r264.xxxx iadd r265, r265.xyz0, r264.000x iadd r265, r265.xy0w, r264.00x0 iadd r264, r265.x0zw, r264.0x00 mov r265, r1021.xyz0 mov r265, r265.x000 mov r266, l12.xxxx ishl r265.x___, r265.xxxx, r266.xxxx iadd r255.x___, r9.xxxx, r265.xxxx mov r1010.x___, r255.xxxx uav_raw_load_id(11)_cached r1011.x___, r1010.xxxx mov r255.x___, r1011.xxxx mov r265, l13.xxxx ubit_extract r267.x___, r265.xxxx, r265.xxxx, r255.xxxx mov r268, l14.xxxx ubit_extract r269.x___, r265.xxxx, r268.xxxx, r255.xxxx mov r270, l15.xxxx ilt r270.x___, r270.xxxx, r261.xxxx mov r271, l16.xxxx ushr r271.x___, r255.xxxx, r271.xxxx ushr r268.x___, r255.xxxx, r268.xxxx ushr r272.x___, r255.xxxx, r265.xxxx mov r273, l17.xxxx iand r273.x___, r255.xxxx, r273.xxxx mov r274, r262.w000 mov r275, r274.xxxx iadd r275, r275.xyz0, r274.000x iadd r275, r275.xy0w, r274.00x0 iadd r274, r275.x0zw, r274.0x00 mov r275, r262.z000 mov r276, r275.xxxx iadd r276, r276.xyz0, r275.000x iadd r276, r276.xy0w, r275.00x0 iadd r275, r276.x0zw, r275.0x00 mov r276, r262.y000 mov r277, r276.xxxx iadd r277, r277.xyz0, r276.000x iadd r277, r277.xy0w, r276.00x0 iadd r276, r277.x0zw, r276.0x00 mov r277, r262.x000 mov r278, r277.xxxx iadd r278, r278.xyz0, r277.000x iadd r278, r278.xy0w, r277.00x0 iadd r277, r278.x0zw, r277.0x00 if_logicalnz r270.xxxx ilt r265.x___, r265.xxxx, r261.xxxx if_logicalnz r265.xxxx mov r265, l18.xxxx ieq r265.x___, r261.xxxx, r265.xxxx if_logicalnz r265.xxxx mov r268, l14.xxxx ishl r268.x___, r267.xxxx, r268.xxxx mov r269, r268.xxxx iadd r269, r269.xyz0, r268.000x iadd r269, r269.xy0w, r268.00x0 iadd r268, r269.x0zw, r268.0x00 mov r269, l13.xxxx ishl r272.x___, r273.xxxx, r269.xxxx mov r273, r272.xxxx iadd r273, r273.xyz0, r272.000x iadd r273, r273.xy0w, r272.00x0 iadd r272, r273.x0zw, r272.0x00 ior r260, r272, r260 ior r260, r260, r268 ishl r268.x___, r255.xxxx, r269.xxxx mov r269, l19.xxxx iand r268.x___, r268.xxxx, r269.xxxx mov r269, 
r268.xxxx iadd r269, r269.xyz0, r268.000x iadd r269, r269.xy0w, r268.00x0 iadd r268, r269.x0zw, r268.0x00 ior r275, r260, r268 mov r260, l20.xxxx ior r260.x___, r271.xxxx, r260.xxxx mov r268, r260.xxxx iadd r268, r268.xyz0, r260.000x iadd r268, r268.xy0w, r260.00x0 iadd r274, r268.x0zw, r260.0x00 else mov r267, l21.xxxx ieq r267.x___, r261.xxxx, r267.xxxx if_logicalnz r267.xxxx mov r269, l14.xxxx ishl r272.x___, r255.xxxx, r269.xxxx mov r274, l19.xxxx iand r272.x___, r272.xxxx, r274.xxxx mov r274, r272.xxxx iadd r274, r274.xyz0, r272.000x iadd r274, r274.xy0w, r272.00x0 iadd r272, r274.x0zw, r272.0x00 ishl r269.x___, r273.xxxx, r269.xxxx mov r274, r269.xxxx iadd r274, r274.xyz0, r269.000x iadd r274, r274.xy0w, r269.00x0 iadd r269, r274.x0zw, r269.0x00 mov r274, l13.xxxx mov r265, r274.xxxx iadd r265, r265.xyz0, r274.000x iadd r265, r265.xy0w, r274.00x0 iadd r274, r265.x0zw, r274.0x00 ishl r260, r260, r274 ior r260, r260, r269 ior r260, r260, r272 ior r275, r275, r260 mov r260, l22.xxxx ior r260.x___, r268.xxxx, r260.xxxx mov r269, r260.xxxx iadd r269, r269.xyz0, r260.000x iadd r269, r269.xy0w, r260.00x0 iadd r274, r269.x0zw, r260.0x00 else mov r268, l23.xxxx ieq r268.x___, r261.xxxx, r268.xxxx if_logicalnz r268.xxxx mov r274, l13.xxxx ishl r274.x___, r269.xxxx, r274.xxxx mov r265, l24.xxxx iand r265.x___, r272.xxxx, r265.xxxx ior r274.x___, r265.xxxx, r274.xxxx mov r265, l25.xxxx ior r274.x___, r274.xxxx, r265.xxxx mov r265, r274.xxxx iadd r265, r265.xyz0, r274.000x iadd r265, r265.xy0w, r274.00x0 iadd r274, r265.x0zw, r274.0x00 mov r265, l16.xxxx ishl r265.x___, r255.xxxx, r265.xxxx mov r266, r265.xxxx iadd r266, r266.xyz0, r265.000x iadd r266, r266.xy0w, r265.00x0 iadd r265, r266.x0zw, r265.0x00 mov r266, l14.xxxx mov r267, r266.xxxx iadd r267, r267.xyz0, r266.000x iadd r267, r267.xy0w, r266.00x0 iadd r266, r267.x0zw, r266.0x00 ishl r260, r260, r266 ior r260, r260, r265 ior r275, r275, r260 else endif endif endif else mov r271, l26.xxxx ieq r271.x___, r261.xxxx, 
r271.xxxx if_logicalnz r271.xxxx mov r267, l14.xxxx ishl r269.x___, r255.xxxx, r267.xxxx mov r271, l19.xxxx iand r269.x___, r269.xxxx, r271.xxxx mov r271, r269.xxxx iadd r271, r271.xyz0, r269.000x iadd r271, r271.xy0w, r269.00x0 iadd r269, r271.x0zw, r269.0x00 ishl r267.x___, r273.xxxx, r267.xxxx mov r271, r267.xxxx iadd r271, r271.xyz0, r267.000x iadd r271, r271.xy0w, r267.00x0 iadd r267, r271.x0zw, r267.0x00 mov r271, l13.xxxx mov r272, r271.xxxx iadd r272, r272.xyz0, r271.000x iadd r272, r272.xy0w, r271.00x0 iadd r271, r272.x0zw, r271.0x00 ishl r260, r260, r271 ior r260, r260, r267 ior r260, r260, r269 ior r276, r276, r260 mov r260, l22.xxxx ior r268.x___, r268.xxxx, r260.xxxx mov r260, r268.xxxx iadd r260, r260.xyz0, r268.000x iadd r260, r260.xy0w, r268.00x0 iadd r275, r260.x0zw, r268.0x00 else mov r268, l27.xxxx ieq r268.x___, r261.xxxx, r268.xxxx if_logicalnz r268.xxxx mov r268, l13.xxxx ishl r268.x___, r269.xxxx, r268.xxxx mov r267, l24.xxxx iand r267.x___, r272.xxxx, r267.xxxx ior r268.x___, r267.xxxx, r268.xxxx mov r267, l25.xxxx ior r268.x___, r268.xxxx, r267.xxxx mov r267, r268.xxxx iadd r267, r267.xyz0, r268.000x iadd r267, r267.xy0w, r268.00x0 iadd r275, r267.x0zw, r268.0x00 mov r268, l16.xxxx ishl r268.x___, r255.xxxx, r268.xxxx mov r267, r268.xxxx iadd r267, r267.xyz0, r268.000x iadd r267, r267.xy0w, r268.00x0 iadd r268, r267.x0zw, r268.0x00 mov r267, l14.xxxx mov r269, r267.xxxx iadd r269, r269.xyz0, r267.000x iadd r269, r269.xy0w, r267.00x0 iadd r267, r269.x0zw, r267.0x00 ishl r260, r260, r267 ior r268, r260, r268 ior r276, r276, r268 else mov r268, l13.xxxx ieq r271.x___, r261.xxxx, r268.xxxx if_logicalnz r271.xxxx ishl r267.x___, r267.xxxx, r268.xxxx mov r268, l28.xxxx iand r268.x___, r255.xxxx, r268.xxxx ior r267.x___, r268.xxxx, r267.xxxx mov r268, l14.xxxx ishl r268.x___, r269.xxxx, r268.xxxx ior r267.x___, r267.xxxx, r268.xxxx mov r268, r267.xxxx iadd r268, r268.xyz0, r267.000x iadd r268, r268.xy0w, r267.00x0 iadd r275, r268.x0zw, r267.0x00 
mov r267, l29.xxxx mov r268, r267.xxxx iadd r268, r268.xyz0, r267.000x iadd r268, r268.xy0w, r267.00x0 iadd r274, r268.x0zw, r267.0x00 mov r267, l16.xxxx mov r268, r267.xxxx iadd r268, r268.xyz0, r267.000x iadd r268, r268.xy0w, r267.00x0 iadd r267, r268.x0zw, r267.0x00 ishl r260, r260, r267 ior r276, r276, r260 else endif endif endif endif else ilt r265.x___, r266.xxxx, r261.xxxx if_logicalnz r265.xxxx mov r265, l10.xxxx ieq r265.x___, r261.xxxx, r265.xxxx if_logicalnz r265.xxxx mov r268, l13.xxxx ishl r268.x___, r269.xxxx, r268.xxxx mov r267, l24.xxxx iand r267.x___, r272.xxxx, r267.xxxx ior r268.x___, r267.xxxx, r268.xxxx mov r267, l25.xxxx ior r268.x___, r268.xxxx, r267.xxxx mov r267, r268.xxxx iadd r267, r267.xyz0, r268.000x iadd r267, r267.xy0w, r268.00x0 iadd r276, r267.x0zw, r268.0x00 mov r268, l16.xxxx ishl r268.x___, r255.xxxx, r268.xxxx mov r267, r268.xxxx iadd r267, r267.xyz0, r268.000x iadd r267, r267.xy0w, r268.00x0 iadd r268, r267.x0zw, r268.0x00 mov r267, l14.xxxx mov r269, r267.xxxx iadd r269, r269.xyz0, r267.000x iadd r269, r269.xy0w, r267.00x0 iadd r267, r269.x0zw, r267.0x00 ishl r260, r260, r267 ior r268, r260, r268 ior r277, r277, r268 else mov r265, l30.xxxx ieq r265.x___, r261.xxxx, r265.xxxx if_logicalnz r265.xxxx mov r268, l13.xxxx ishl r268.x___, r267.xxxx, r268.xxxx mov r267, l28.xxxx iand r267.x___, r255.xxxx, r267.xxxx ior r268.x___, r267.xxxx, r268.xxxx mov r267, l14.xxxx ishl r267.x___, r269.xxxx, r267.xxxx ior r268.x___, r268.xxxx, r267.xxxx mov r267, r268.xxxx iadd r267, r267.xyz0, r268.000x iadd r267, r267.xy0w, r268.00x0 iadd r276, r267.x0zw, r268.0x00 mov r268, l29.xxxx mov r267, r268.xxxx iadd r267, r267.xyz0, r268.000x iadd r267, r267.xy0w, r268.00x0 iadd r275, r267.x0zw, r268.0x00 mov r268, l16.xxxx mov r267, r268.xxxx iadd r267, r267.xyz0, r268.000x iadd r267, r267.xy0w, r268.00x0 iadd r268, r267.x0zw, r268.0x00 ishl r268, r260, r268 ior r277, r277, r268 else mov r265, l15.xxxx ieq r265.x___, r261.xxxx, r265.xxxx if_logicalnz 
r265.xxxx mov r268, l14.xxxx ishl r268.x___, r267.xxxx, r268.xxxx mov r267, r268.xxxx iadd r267, r267.xyz0, r268.000x iadd r267, r267.xy0w, r268.00x0 iadd r268, r267.x0zw, r268.0x00 mov r267, l13.xxxx ishl r269.x___, r273.xxxx, r267.xxxx mov r272, r269.xxxx iadd r272, r272.xyz0, r269.000x iadd r272, r272.xy0w, r269.00x0 iadd r269, r272.x0zw, r269.0x00 ior r260, r269, r260 ior r268, r260, r268 ishl r260.x___, r255.xxxx, r267.xxxx mov r267, l19.xxxx iand r260.x___, r260.xxxx, r267.xxxx mov r267, r260.xxxx iadd r267, r267.xyz0, r260.000x iadd r267, r267.xy0w, r260.00x0 iadd r260, r267.x0zw, r260.0x00 ior r276, r268, r260 mov r268, l20.xxxx ior r268.x___, r271.xxxx, r268.xxxx mov r260, r268.xxxx iadd r260, r260.xyz0, r268.000x iadd r260, r260.xy0w, r268.00x0 iadd r275, r260.x0zw, r268.0x00 else endif endif endif else mov r265, l31.xxxx ieq r265.x___, r261.xxxx, r265.xxxx if_logicalnz r265.xxxx mov r268, l14.xxxx ishl r268.x___, r267.xxxx, r268.xxxx mov r267, r268.xxxx iadd r267, r267.xyz0, r268.000x iadd r267, r267.xy0w, r268.00x0 iadd r268, r267.x0zw, r268.0x00 mov r267, l13.xxxx ishl r269.x___, r273.xxxx, r267.xxxx mov r272, r269.xxxx iadd r272, r272.xyz0, r269.000x iadd r272, r272.xy0w, r269.00x0 iadd r269, r272.x0zw, r269.0x00 ior r260, r269, r260 ior r268, r260, r268 ishl r260.x___, r255.xxxx, r267.xxxx mov r267, l19.xxxx iand r260.x___, r260.xxxx, r267.xxxx mov r267, r260.xxxx iadd r267, r267.xyz0, r260.000x iadd r267, r267.xy0w, r260.00x0 iadd r260, r267.x0zw, r260.0x00 ior r277, r268, r260 mov r268, l20.xxxx ior r268.x___, r271.xxxx, r268.xxxx mov r260, r268.xxxx iadd r260, r260.xyz0, r268.000x iadd r260, r260.xy0w, r268.00x0 iadd r276, r260.x0zw, r268.0x00 else mov r265, l12.xxxx ieq r265.x___, r261.xxxx, r265.xxxx if_logicalnz r265.xxxx mov r267, l14.xxxx ishl r269.x___, r255.xxxx, r267.xxxx mov r271, l19.xxxx iand r269.x___, r269.xxxx, r271.xxxx mov r271, r269.xxxx iadd r271, r271.xyz0, r269.000x iadd r271, r271.xy0w, r269.00x0 iadd r269, r271.x0zw, 
r269.0x00 ishl r267.x___, r273.xxxx, r267.xxxx mov r271, r267.xxxx iadd r271, r271.xyz0, r267.000x iadd r271, r271.xy0w, r267.00x0 iadd r267, r271.x0zw, r267.0x00 mov r271, l13.xxxx mov r272, r271.xxxx iadd r272, r272.xyz0, r271.000x iadd r272, r272.xy0w, r271.00x0 iadd r271, r272.x0zw, r271.0x00 ishl r260, r260, r271 ior r260, r260, r267 ior r260, r260, r269 ior r277, r277, r260 mov r260, l22.xxxx ior r268.x___, r268.xxxx, r260.xxxx mov r260, r268.xxxx iadd r260, r260.xyz0, r268.000x iadd r260, r260.xy0w, r268.00x0 iadd r276, r260.x0zw, r268.0x00 else endif endif endif endif mov r268, l24.xxxx mov r260, r268.xxxx iadd r260, r260.xyz0, r268.000x iadd r260, r260.xy0w, r268.00x0 iadd r268, r260.x0zw, r268.0x00 mov r260, l16.xxxx mov r267, r260.xxxx iadd r267, r267.xyz0, r260.000x iadd r267, r267.xy0w, r260.00x0 iadd r260, r267.x0zw, r260.0x00 bitalign r267, r277, r277, r260 iand r267, r267, r268 mov r269, l32.xxxx mov r271, r269.xxxx iadd r271, r271.xyz0, r269.000x iadd r271, r271.xy0w, r269.00x0 iadd r269, r271.x0zw, r269.0x00 mov r271, l13.xxxx mov r272, r271.xxxx iadd r272, r272.xyz0, r271.000x iadd r272, r272.xy0w, r271.00x0 iadd r271, r272.x0zw, r271.0x00 bitalign r272, r277, r277, r271 iand r272, r272, r269 ior r267, r267, r272 mov r272, l33.xxxx mov r273, r272.xxxx iadd r273, r273.xyz0, r272.000x iadd r273, r273.xy0w, r272.00x0 iadd r272, r273.x0zw, r272.0x00 mov r273, l34.xxxx mov r277, r273.xxxx iadd r277, r277.xyz0, r273.000x iadd r277, r277.xy0w, r273.00x0 iadd r273, r277.x0zw, r273.0x00 bitalign r277, r273, r273, r272 mov r265, l35.xxxx mov r266, r265.xxxx iadd r266, r266.xyz0, r265.000x iadd r266, r266.xy0w, r265.00x0 iadd r265, r266.x0zw, r265.0x00 iadd r277, r277, r265 iadd r277, r277, r267 mov r265, l36.xxxx mov r266, r265.xxxx iadd r266, r266.xyz0, r265.000x iadd r266, r266.xy0w, r265.00x0 iadd r265, r266.x0zw, r265.0x00 iadd r277, r277, r265 mov r266, l12.xxxx mov r270, r266.xxxx iadd r270, r270.xyz0, r266.000x iadd r270, r270.xy0w, r266.00x0 iadd 
r266, r270.x0zw, r266.0x00 mov r270, l37.xxxx mov r278, r270.xxxx iadd r278, r278.xyz0, r270.000x iadd r278, r278.xy0w, r270.00x0 iadd r270, r278.x0zw, r270.0x00 bitalign r270, r270, r270, r266 bitalign r278, r276, r276, r260 bitalign r276, r276, r276, r271 bitalign r279, r277, r277, r272 bitalign r273, r273, r273, r266 ixor r280, r273, r270 iand r280, r280, r277 ixor r280, r280, r270 iand r278, r278, r268 iand r276, r276, r269 ior r276, r278, r276 mov r278, l38.xxxx mov r281, r278.xxxx iadd r281, r281.xyz0, r278.000x iadd r281, r281.xy0w, r278.00x0 iadd r278, r281.x0zw, r278.0x00 iadd r279, r279, r278 mov r281, l39.xxxx mov r282, r281.xxxx iadd r282, r282.xyz0, r281.000x iadd r282, r282.xy0w, r281.00x0 iadd r281, r282.x0zw, r281.0x00 ior r282, r270, r281 iadd r279, r279, r282 iadd r279, r279, r276 iadd r279, r279, r265 bitalign r282, r275, r275, r260 bitalign r275, r275, r275, r271 bitalign r283, r279, r279, r272 iadd r283, r283, r281 iadd r280, r283, r280 iand r282, r282, r268 iand r275, r275, r269 ior r275, r282, r275 iadd r280, r280, r275 iadd r280, r280, r265 bitalign r277, r277, r277, r266 bitalign r282, r274, r274, r260 bitalign r274, r274, r274, r271 bitalign r283, r280, r280, r272 bitalign r284, r279, r279, r266 ixor r285, r284, r277 iand r285, r285, r280 ixor r285, r285, r277 iadd r270, r270, r283 ixor r283, r277, r273 iand r279, r283, r279 ixor r279, r279, r273 iadd r270, r270, r279 iand r279, r282, r268 iand r274, r274, r269 ior r274, r279, r274 iadd r270, r270, r274 iadd r270, r270, r265 bitalign r279, r270, r270, r272 iadd r273, r273, r279 iadd r273, r273, r285 iadd r273, r273, r265 bitalign r279, r280, r280, r266 bitalign r280, r273, r273, r272 bitalign r282, r270, r270, r266 ixor r283, r282, r279 iand r283, r283, r273 ixor r283, r283, r279 ixor r285, r279, r284 iand r270, r285, r270 ixor r270, r270, r284 iadd r277, r277, r280 iadd r277, r277, r270 iadd r277, r277, r265 bitalign r270, r277, r277, r272 iadd r270, r284, r270 iadd r270, r270, r283 iadd 
r270, r270, r265 bitalign r273, r273, r273, r266 bitalign r280, r270, r270, r272 bitalign r283, r277, r277, r266 ixor r284, r283, r273 iand r284, r284, r270 ixor r284, r284, r273 iadd r279, r279, r280 ixor r280, r273, r282 iand r277, r280, r277 ixor r277, r277, r282 iadd r277, r279, r277 iadd r277, r277, r265 bitalign r279, r277, r277, r272 iadd r279, r282, r279 iadd r279, r279, r284 iadd r279, r279, r265 bitalign r270, r270, r270, r266 bitalign r280, r279, r279, r272 bitalign r282, r277, r277, r266 ixor r284, r282, r270 iand r284, r284, r279 ixor r284, r284, r270 ixor r285, r270, r283 iand r277, r285, r277 ixor r277, r277, r283 iadd r273, r273, r280 iadd r273, r273, r277 iadd r273, r273, r265 bitalign r277, r273, r273, r272 iadd r277, r283, r277 iadd r277, r277, r284 iadd r277, r277, r265 bitalign r279, r279, r279, r266 bitalign r280, r277, r277, r272 bitalign r283, r273, r273, r266 ixor r284, r283, r279 iand r284, r284, r277 ixor r284, r284, r279 iadd r270, r270, r280 ixor r280, r279, r282 iand r273, r280, r273 ixor r273, r273, r282 iadd r273, r270, r273 iadd r273, r273, r265 bitalign r270, r273, r273, r272 iadd r270, r282, r270 iadd r270, r270, r284 iadd r270, r270, r265 bitalign r277, r277, r277, r266 bitalign r280, r270, r270, r272 bitalign r282, r273, r273, r266 ixor r284, r282, r277 iand r284, r284, r270 ixor r284, r284, r277 ixor r285, r277, r283 iand r273, r285, r273 ixor r273, r273, r283 iadd r279, r279, r280 iadd r273, r279, r273 iadd r273, r273, r265 bitalign r279, r273, r273, r272 iadd r279, r283, r279 iadd r279, r279, r284 iadd r279, r279, r265 bitalign r270, r270, r270, r266 bitalign r280, r279, r279, r272 bitalign r283, r273, r273, r266 ixor r284, r283, r270 iand r284, r284, r279 ixor r284, r284, r270 iadd r277, r277, r280 ixor r280, r270, r282 iand r273, r280, r273 ixor r273, r273, r282 iadd r273, r277, r273 iadd r273, r273, r264 iadd r273, r273, r265 ixor r267, r275, r267 mov r277, l40.xxxx mov r280, r277.xxxx iadd r280, r280.xyz0, r277.000x iadd 
r280, r280.xy0w, r277.00x0 iadd r280, r280.x0zw, r277.0x00 bitalign r267, r267, r267, r280 bitalign r285, r273, r273, r272 iadd r282, r282, r285 iadd r282, r282, r284 iadd r282, r282, r267 iadd r282, r282, r265 bitalign r279, r279, r279, r266 ixor r276, r274, r276 bitalign r276, r276, r276, r280 bitalign r284, r282, r282, r272 bitalign r285, r273, r273, r266 ixor r286, r285, r279 iand r286, r286, r282 ixor r286, r286, r279 ixor r287, r279, r283 iand r273, r287, r273 ixor r273, r273, r283 iadd r270, r270, r284 iadd r273, r270, r273 iadd r273, r273, r276 iadd r273, r273, r265 ixor r275, r264, r275 bitalign r275, r275, r275, r280 bitalign r270, r273, r273, r272 iadd r270, r283, r270 iadd r270, r270, r286 iadd r270, r270, r275 iadd r270, r270, r265 bitalign r282, r282, r282, r266 ixor r274, r267, r274 bitalign r274, r274, r274, r280 bitalign r283, r270, r270, r272 iadd r279, r279, r283 ixor r283, r282, r285 iand r283, r283, r273 ixor r283, r283, r285 iadd r279, r279, r283 iadd r279, r279, r274 iadd r265, r279, r265 bitalign r273, r273, r273, r266 bitalign r279, r276, r276, r280 bitalign r283, r265, r265, r272 bitalign r284, r270, r270, r266 ixor r286, r265, r284 ixor r286, r286, r273 ixor r270, r270, r273 ixor r270, r270, r282 iadd r283, r285, r283 iadd r270, r283, r270 iadd r270, r270, r279 mov r283, l41.xxxx mov r285, r283.xxxx iadd r285, r285.xyz0, r283.000x iadd r285, r285.xy0w, r283.00x0 iadd r283, r285.x0zw, r283.0x00 iadd r270, r270, r283 bitalign r285, r275, r275, r280 bitalign r287, r270, r270, r272 iadd r282, r282, r287 iadd r282, r282, r286 iadd r282, r282, r285 iadd r282, r282, r283 bitalign r265, r265, r265, r266 bitalign r286, r274, r274, r280 bitalign r287, r282, r282, r272 iadd r273, r273, r287 ixor r287, r270, r265 ixor r287, r287, r284 iadd r273, r273, r287 iadd r273, r273, r286 iadd r273, r273, r283 bitalign r270, r270, r270, r266 ixor r287, r279, r264 bitalign r287, r287, r287, r280 bitalign r288, r273, r273, r272 bitalign r289, r282, r282, r266 
ixor r290, r273, r289 ixor r290, r290, r270 ixor r282, r282, r270 ixor r282, r282, r265 iadd r284, r284, r288 iadd r282, r284, r282 iadd r282, r282, r287 iadd r282, r282, r283 ixor r284, r285, r267 bitalign r284, r284, r284, r280 bitalign r288, r282, r282, r272 iadd r265, r265, r288 iadd r265, r265, r290 iadd r265, r265, r284 iadd r265, r265, r283 bitalign r273, r273, r273, r266 ixor r288, r286, r276 bitalign r288, r288, r288, r280 bitalign r290, r265, r265, r272 iadd r270, r270, r290 ixor r290, r282, r273 ixor r290, r290, r289 iadd r270, r270, r290 iadd r270, r270, r288 iadd r270, r270, r283 bitalign r282, r282, r282, r266 ixor r290, r287, r275 bitalign r290, r290, r290, r280 bitalign r291, r270, r270, r272 bitalign r292, r265, r265, r266 ixor r293, r270, r292 ixor r293, r293, r282 ixor r265, r265, r282 ixor r265, r265, r273 iadd r289, r289, r291 iadd r265, r289, r265 iadd r265, r265, r290 iadd r265, r265, r283 ixor r289, r284, r274 bitalign r289, r289, r289, r280 bitalign r291, r265, r265, r272 iadd r273, r273, r291 iadd r273, r273, r293 iadd r273, r273, r289 iadd r273, r273, r283 bitalign r270, r270, r270, r266 ixor r291, r288, r279 bitalign r291, r291, r291, r280 bitalign r293, r273, r273, r272 iadd r282, r282, r293 ixor r293, r265, r270 ixor r293, r293, r292 iadd r282, r282, r293 iadd r282, r282, r291 iadd r282, r282, r283 bitalign r265, r265, r265, r266 ixor r293, r290, r285 ixor r293, r293, r264 bitalign r293, r293, r293, r280 bitalign r294, r282, r282, r272 bitalign r295, r273, r273, r266 ixor r296, r282, r295 ixor r296, r296, r265 ixor r273, r273, r265 ixor r273, r273, r270 iadd r292, r292, r294 iadd r273, r292, r273 iadd r273, r273, r293 iadd r273, r273, r283 ixor r292, r289, r286 ixor r292, r292, r267 bitalign r292, r292, r292, r280 bitalign r294, r273, r273, r272 iadd r270, r270, r294 iadd r270, r270, r296 iadd r270, r270, r292 iadd r270, r270, r283 bitalign r282, r282, r282, r266 ixor r294, r291, r287 ixor r294, r294, r276 ixor r264, r294, r264 
bitalign r264, r264, r264, r280 bitalign r294, r270, r270, r272 iadd r265, r265, r294 ixor r294, r273, r282 ixor r294, r294, r295 iadd r265, r265, r294 iadd r265, r265, r264 iadd r265, r265, r283 bitalign r273, r273, r273, r266 ixor r294, r293, r284 ixor r294, r294, r275 ixor r267, r294, r267 bitalign r267, r267, r267, r280 bitalign r294, r265, r265, r272 bitalign r296, r270, r270, r266 ixor r297, r265, r296 ixor r297, r297, r273 ixor r270, r270, r273 ixor r270, r270, r282 iadd r294, r295, r294 iadd r270, r294, r270 iadd r270, r270, r267 iadd r270, r270, r283 ixor r294, r292, r288 ixor r294, r294, r274 ixor r276, r294, r276 bitalign r276, r276, r276, r280 bitalign r294, r270, r270, r272 iadd r282, r282, r294 iadd r282, r282, r297 iadd r282, r282, r276 iadd r282, r282, r283 bitalign r265, r265, r265, r266 ixor r294, r264, r290 ixor r294, r294, r279 ixor r275, r294, r275 bitalign r275, r275, r275, r280 bitalign r294, r282, r282, r272 iadd r273, r273, r294 ixor r294, r270, r265 ixor r294, r294, r296 iadd r273, r273, r294 iadd r273, r273, r275 iadd r273, r273, r283 bitalign r270, r270, r270, r266 ixor r294, r267, r289 ixor r294, r294, r285 ixor r274, r294, r274 bitalign r274, r274, r274, r280 bitalign r294, r273, r273, r272 bitalign r295, r282, r282, r266 ixor r297, r273, r295 ixor r297, r297, r270 ixor r282, r282, r270 ixor r282, r282, r265 iadd r294, r296, r294 iadd r282, r294, r282 iadd r282, r282, r274 iadd r282, r282, r283 ixor r294, r276, r291 ixor r294, r294, r286 ixor r279, r294, r279 bitalign r279, r279, r279, r280 bitalign r294, r282, r282, r272 iadd r265, r265, r294 iadd r265, r265, r297 iadd r265, r265, r279 iadd r265, r265, r283 bitalign r273, r273, r273, r266 ixor r294, r275, r293 ixor r294, r294, r287 ixor r285, r294, r285 bitalign r285, r285, r285, r280 bitalign r294, r265, r265, r272 iadd r270, r270, r294 ixor r294, r282, r273 ixor r294, r294, r295 iadd r270, r270, r294 iadd r270, r270, r285 iadd r270, r270, r283 bitalign r282, r282, r282, r266 ixor 
r294, r274, r292 ixor r294, r294, r284 ixor r286, r294, r286 bitalign r286, r286, r286, r280 bitalign r294, r270, r270, r272 bitalign r296, r265, r265, r266 ixor r297, r270, r296 ixor r297, r297, r282 ixor r265, r265, r282 ixor r265, r265, r273 iadd r294, r295, r294 iadd r265, r294, r265 iadd r265, r265, r286 iadd r265, r265, r283 ixor r294, r279, r264 ixor r294, r294, r288 ixor r287, r294, r287 bitalign r287, r287, r287, r280 bitalign r294, r265, r265, r272 iadd r273, r273, r294 iadd r273, r273, r297 iadd r273, r273, r287 iadd r273, r273, r283 bitalign r270, r270, r270, r266 ixor r283, r285, r267 ixor r283, r283, r290 ixor r283, r283, r284 bitalign r283, r283, r283, r280 bitalign r284, r273, r273, r272 bitalign r294, r265, r265, r266 iand r295, r273, r294 ior r297, r273, r294 iand r297, r297, r270 ior r295, r295, r297 iand r297, r265, r270 ior r265, r265, r270 iand r265, r265, r296 ior r265, r297, r265 iadd r282, r282, r284 iadd r265, r282, r265 iadd r265, r265, r283 mov r282, l42.xxxx mov r284, r282.xxxx iadd r284, r284.xyz0, r282.000x iadd r284, r284.xy0w, r282.00x0 iadd r282, r284.x0zw, r282.0x00 iadd r265, r265, r282 ixor r284, r286, r276 ixor r284, r284, r289 ixor r284, r284, r288 bitalign r284, r284, r284, r280 bitalign r288, r265, r265, r272 iadd r288, r296, r288 iadd r288, r288, r295 iadd r288, r288, r284 iadd r288, r288, r282 bitalign r273, r273, r273, r266 ixor r295, r287, r275 ixor r295, r295, r291 ixor r290, r295, r290 bitalign r290, r290, r290, r280 bitalign r295, r288, r288, r272 bitalign r296, r265, r265, r266 iand r297, r288, r296 ior r298, r288, r296 iand r298, r298, r273 ior r297, r297, r298 iand r298, r265, r273 ior r265, r265, r273 iand r265, r265, r294 ior r265, r298, r265 iadd r270, r270, r295 iadd r265, r270, r265 iadd r265, r265, r290 iadd r265, r265, r282 ixor r270, r283, r274 ixor r270, r270, r293 ixor r270, r270, r289 bitalign r270, r270, r270, r280 bitalign r289, r265, r265, r272 iadd r289, r294, r289 iadd r289, r289, r297 iadd r289, 
r289, r270 iadd r289, r289, r282 bitalign r288, r288, r288, r266 ixor r294, r284, r279 ixor r294, r294, r292 ixor r291, r294, r291 bitalign r291, r291, r291, r280 bitalign r294, r289, r289, r272 bitalign r295, r265, r265, r266 iand r297, r289, r295 ior r298, r289, r295 iand r298, r298, r288 ior r297, r297, r298 iand r298, r265, r288 ior r265, r265, r288 iand r265, r265, r296 ior r265, r298, r265 iadd r273, r273, r294 iadd r273, r273, r265 iadd r273, r273, r291 iadd r273, r273, r282 ixor r265, r290, r285 ixor r265, r265, r264 ixor r265, r265, r293 bitalign r265, r265, r265, r280 bitalign r293, r273, r273, r272 iadd r293, r296, r293 iadd r293, r293, r297 iadd r293, r293, r265 iadd r293, r293, r282 bitalign r289, r289, r289, r266 ixor r294, r270, r286 ixor r294, r294, r267 ixor r292, r294, r292 bitalign r292, r292, r292, r280 bitalign r294, r293, r293, r272 bitalign r296, r273, r273, r266 iand r297, r293, r296 ior r298, r293, r296 iand r298, r298, r289 ior r297, r297, r298 iand r298, r273, r289 ior r273, r273, r289 iand r273, r273, r295 ior r273, r298, r273 iadd r288, r288, r294 iadd r273, r288, r273 iadd r273, r273, r292 iadd r273, r273, r282 ixor r288, r291, r287 ixor r288, r288, r276 ixor r264, r288, r264 bitalign r264, r264, r264, r280 bitalign r288, r273, r273, r272 iadd r288, r295, r288 iadd r288, r288, r297 iadd r288, r288, r264 iadd r288, r288, r282 bitalign r293, r293, r293, r266 ixor r294, r265, r283 ixor r294, r294, r275 ixor r267, r294, r267 bitalign r267, r267, r267, r280 bitalign r294, r288, r288, r272 bitalign r295, r273, r273, r266 iand r297, r288, r295 ior r298, r288, r295 iand r298, r298, r293 ior r297, r297, r298 iand r298, r273, r293 ior r273, r273, r293 iand r273, r273, r296 ior r273, r298, r273 iadd r289, r289, r294 iadd r273, r289, r273 iadd r273, r273, r267 iadd r273, r273, r282 ixor r289, r292, r284 ixor r289, r289, r274 ixor r276, r289, r276 bitalign r276, r276, r276, r280 bitalign r289, r273, r273, r272 iadd r289, r296, r289 iadd r289, r289, 
r297 iadd r289, r289, r276 iadd r289, r289, r282 bitalign r288, r288, r288, r266 ixor r294, r264, r290 ixor r294, r294, r279 ixor r275, r294, r275 bitalign r275, r275, r275, r280 bitalign r294, r289, r289, r272 bitalign r296, r273, r273, r266 iand r297, r289, r296 ior r298, r289, r296 iand r298, r298, r288 ior r297, r297, r298 iand r298, r273, r288 ior r273, r273, r288 iand r273, r273, r295 ior r273, r298, r273 iadd r293, r293, r294 iadd r273, r293, r273 iadd r273, r273, r275 iadd r273, r273, r282 ixor r293, r267, r270 ixor r293, r293, r285 ixor r274, r293, r274 bitalign r274, r274, r274, r280 bitalign r293, r273, r273, r272 iadd r293, r295, r293 iadd r293, r293, r297 iadd r293, r293, r274 iadd r293, r293, r282 bitalign r289, r289, r289, r266 ixor r294, r276, r291 ixor r294, r294, r286 ixor r279, r294, r279 bitalign r279, r279, r279, r280 bitalign r294, r293, r293, r272 bitalign r295, r273, r273, r266 iand r297, r293, r295 ior r298, r293, r295 iand r298, r298, r289 ior r297, r297, r298 iand r298, r273, r289 ior r273, r273, r289 iand r273, r273, r296 ior r273, r298, r273 iadd r288, r288, r294 iadd r273, r288, r273 iadd r273, r273, r279 iadd r273, r273, r282 ixor r288, r275, r265 ixor r288, r288, r287 ixor r285, r288, r285 bitalign r285, r285, r285, r280 bitalign r288, r273, r273, r272 iadd r288, r296, r288 iadd r288, r288, r297 iadd r288, r288, r285 iadd r288, r288, r282 bitalign r293, r293, r293, r266 ixor r294, r274, r292 ixor r294, r294, r283 ixor r286, r294, r286 bitalign r286, r286, r286, r280 bitalign r294, r288, r288, r272 bitalign r296, r273, r273, r266 iand r297, r288, r296 ior r298, r288, r296 iand r298, r298, r293 ior r297, r297, r298 iand r298, r273, r293 ior r273, r273, r293 iand r273, r273, r295 ior r273, r298, r273 iadd r289, r289, r294 iadd r273, r289, r273 iadd r273, r273, r286 iadd r273, r273, r282 ixor r289, r279, r264 ixor r289, r289, r284 ixor r287, r289, r287 bitalign r287, r287, r287, r280 bitalign r289, r273, r273, r272 iadd r289, r295, r289 
iadd r289, r289, r297 iadd r289, r289, r287 iadd r289, r289, r282 bitalign r288, r288, r288, r266 ixor r294, r285, r267 ixor r294, r294, r290 ixor r283, r294, r283 bitalign r283, r283, r283, r280 bitalign r294, r289, r289, r272 bitalign r295, r273, r273, r266 iand r297, r289, r295 ior r298, r289, r295 iand r298, r298, r288 ior r297, r297, r298 iand r298, r273, r288 ior r273, r273, r288 iand r273, r273, r296 ior r273, r298, r273 iadd r293, r293, r294 iadd r273, r293, r273 iadd r273, r273, r283 iadd r273, r273, r282 ixor r293, r286, r276 ixor r293, r293, r270 ixor r284, r293, r284 bitalign r284, r284, r284, r280 bitalign r293, r273, r273, r272 iadd r293, r296, r293 iadd r293, r293, r297 iadd r293, r293, r284 iadd r293, r293, r282 bitalign r289, r289, r289, r266 ixor r294, r287, r275 ixor r294, r294, r291 ixor r290, r294, r290 bitalign r290, r290, r290, r280 bitalign r294, r293, r293, r272 bitalign r296, r273, r273, r266 iand r297, r293, r296 ior r298, r293, r296 iand r298, r298, r289 ior r297, r297, r298 iand r298, r273, r289 ior r273, r273, r289 iand r273, r273, r295 ior r273, r298, r273 iadd r288, r288, r294 iadd r273, r288, r273 iadd r273, r273, r290 iadd r273, r273, r282 ixor r288, r283, r274 ixor r288, r288, r265 ixor r270, r288, r270 bitalign r270, r270, r270, r280 bitalign r288, r273, r273, r272 iadd r288, r295, r288 iadd r288, r288, r297 iadd r288, r288, r270 iadd r282, r288, r282 bitalign r288, r293, r293, r266 ixor r293, r284, r279 ixor r293, r293, r292 ixor r291, r293, r291 bitalign r291, r291, r291, r280 bitalign r293, r282, r282, r272 iadd r289, r289, r293 ixor r293, r273, r288 ixor r293, r293, r296 iadd r289, r289, r293 iadd r289, r289, r291 mov r293, l43.xxxx mov r294, r293.xxxx iadd r294, r294.xyz0, r293.000x iadd r294, r294.xy0w, r293.00x0 iadd r293, r294.x0zw, r293.0x00 iadd r289, r289, r293 bitalign r273, r273, r273, r266 ixor r294, r290, r285 ixor r294, r294, r264 ixor r265, r294, r265 bitalign r265, r265, r265, r280 bitalign r294, r289, r289, 
r272 bitalign r295, r282, r282, r266 ixor r297, r289, r295 ixor r297, r297, r273 ixor r282, r282, r273 ixor r282, r282, r288 iadd r294, r296, r294 iadd r282, r294, r282 iadd r282, r282, r265 iadd r282, r282, r293 ixor r294, r270, r286 ixor r294, r294, r267 ixor r292, r294, r292 bitalign r292, r292, r292, r280 bitalign r294, r282, r282, r272 iadd r288, r288, r294 iadd r288, r288, r297 iadd r288, r288, r292 iadd r288, r288, r293 bitalign r289, r289, r289, r266 ixor r294, r291, r287 ixor r294, r294, r276 ixor r264, r294, r264 bitalign r264, r264, r264, r280 bitalign r294, r288, r288, r272 iadd r273, r273, r294 ixor r294, r282, r289 ixor r294, r294, r295 iadd r273, r273, r294 iadd r273, r273, r264 iadd r273, r273, r293 bitalign r282, r282, r282, r266 ixor r294, r265, r283 ixor r294, r294, r275 ixor r267, r294, r267 bitalign r267, r267, r267, r280 bitalign r294, r273, r273, r272 bitalign r296, r288, r288, r266 ixor r297, r273, r296 ixor r297, r297, r282 ixor r288, r288, r282 ixor r288, r288, r289 iadd r294, r295, r294 iadd r288, r294, r288 iadd r288, r288, r267 iadd r288, r288, r293 ixor r294, r292, r284 ixor r294, r294, r274 ixor r276, r294, r276 bitalign r276, r276, r276, r280 bitalign r294, r288, r288, r272 iadd r289, r289, r294 iadd r289, r289, r297 iadd r289, r289, r276 iadd r289, r289, r293 bitalign r273, r273, r273, r266 ixor r294, r264, r290 ixor r294, r294, r279 ixor r275, r294, r275 bitalign r275, r275, r275, r280 bitalign r294, r289, r289, r272 iadd r282, r282, r294 ixor r294, r288, r273 ixor r294, r294, r296 iadd r282, r282, r294 iadd r282, r282, r275 iadd r282, r282, r293 bitalign r288, r288, r288, r266 ixor r294, r267, r270 ixor r294, r294, r285 ixor r274, r294, r274 bitalign r274, r274, r274, r280 bitalign r294, r282, r282, r272 bitalign r295, r289, r289, r266 ixor r297, r282, r295 ixor r297, r297, r288 ixor r289, r289, r288 ixor r289, r289, r273 iadd r294, r296, r294 iadd r289, r294, r289 iadd r289, r289, r274 iadd r289, r289, r293 ixor r294, r276, r291 
ixor r294, r294, r286 ixor r279, r294, r279 bitalign r279, r279, r279, r280 bitalign r294, r289, r289, r272 iadd r273, r273, r294 iadd r273, r273, r297 iadd r273, r273, r279 iadd r273, r273, r293 bitalign r282, r282, r282, r266 ixor r294, r275, r265 ixor r294, r294, r287 ixor r285, r294, r285 bitalign r285, r285, r285, r280 bitalign r294, r273, r273, r272 iadd r288, r288, r294 ixor r294, r289, r282 ixor r294, r294, r295 iadd r288, r288, r294 iadd r288, r288, r285 iadd r288, r288, r293 bitalign r289, r289, r289, r266 ixor r294, r274, r292 ixor r294, r294, r283 ixor r286, r294, r286 bitalign r286, r286, r286, r280 bitalign r294, r288, r288, r272 bitalign r296, r273, r273, r266 ixor r297, r288, r296 ixor r297, r297, r289 ixor r273, r273, r289 ixor r273, r273, r282 iadd r294, r295, r294 iadd r273, r294, r273 iadd r273, r273, r286 iadd r273, r273, r293 ixor r294, r279, r264 ixor r294, r294, r284 ixor r287, r294, r287 bitalign r287, r287, r287, r280 bitalign r294, r273, r273, r272 iadd r282, r282, r294 iadd r282, r282, r297 iadd r282, r282, r287 iadd r282, r282, r293 bitalign r288, r288, r288, r266 ixor r294, r285, r267 ixor r294, r294, r290 ixor r283, r294, r283 bitalign r283, r283, r283, r280 bitalign r294, r282, r282, r272 iadd r289, r289, r294 ixor r294, r273, r288 ixor r294, r294, r296 iadd r289, r289, r294 iadd r289, r289, r283 iadd r289, r289, r293 bitalign r273, r273, r273, r266 ixor r294, r286, r276 ixor r294, r294, r270 ixor r284, r294, r284 bitalign r284, r284, r284, r280 bitalign r294, r289, r289, r272 bitalign r295, r282, r282, r266 ixor r297, r289, r295 ixor r297, r297, r273 ixor r282, r282, r273 ixor r282, r282, r288 iadd r294, r296, r294 iadd r282, r294, r282 iadd r282, r282, r284 iadd r282, r282, r293 ixor r275, r287, r275 ixor r275, r275, r291 ixor r275, r275, r290 bitalign r275, r275, r275, r280 bitalign r290, r282, r282, r272 iadd r288, r288, r290 iadd r288, r288, r297 iadd r288, r288, r275 iadd r288, r288, r293 bitalign r289, r289, r289, r266 ixor 
r274, r283, r274 ixor r274, r274, r265 ixor r274, r274, r270 bitalign r274, r274, r274, r280 bitalign r270, r288, r288, r272 iadd r273, r273, r270 ixor r270, r282, r289 ixor r270, r270, r295 iadd r273, r273, r270 iadd r273, r273, r274 iadd r273, r273, r293 bitalign r270, r282, r282, r266 ixor r279, r284, r279 ixor r279, r279, r292 ixor r279, r279, r291 bitalign r279, r279, r279, r280 bitalign r282, r273, r273, r272 bitalign r283, r288, r288, r266 ixor r284, r273, r283 ixor r284, r284, r270 ixor r288, r288, r270 ixor r288, r288, r289 iadd r282, r295, r282 iadd r282, r282, r288 iadd r282, r282, r279 iadd r282, r282, r293 ixor r275, r275, r285 ixor r275, r275, r264 ixor r275, r275, r265 bitalign r275, r275, r275, r280 bitalign r265, r282, r282, r272 iadd r265, r289, r265 iadd r265, r265, r284 iadd r275, r265, r275 iadd r275, r275, r293 bitalign r273, r273, r273, r266 ixor r274, r274, r286 ixor r267, r274, r267 ixor r267, r267, r292 bitalign r267, r267, r267, r280 bitalign r274, r275, r275, r272 bitalign r265, r282, r282, r266 ixor r284, r275, r265 ixor r284, r284, r273 iadd r274, r270, r274 ixor r270, r282, r273 ixor r270, r270, r283 iadd r274, r274, r270 iadd r267, r274, r267 iadd r274, r267, r293 ixor r270, r279, r287 ixor r276, r270, r276 ixor r264, r276, r264 bitalign r264, r264, r264, r280 bitalign r272, r274, r274, r272 iadd r272, r283, r272 iadd r272, r272, r284 iadd r264, r272, r264 mov r272, l44.xxxx mov r274, r272.xxxx iadd r274, r274.xyz0, r272.000x iadd r274, r274.xy0w, r272.00x0 iadd r272, r274.x0zw, r272.0x00 iadd r264, r264, r272 bitalign r272, r275, r275, r266 bitalign r274, r264, r264, r260 bitalign r264, r264, r264, r271 mov r275, l45.xxxx mov r276, r275.xxxx iadd r276, r276.xyz0, r275.000x iadd r276, r276.xy0w, r275.00x0 iadd r275, r276.x0zw, r275.0x00 iadd r267, r267, r275 bitalign r275, r267, r267, r260 iand r275, r275, r268 bitalign r267, r267, r267, r271 iand r267, r267, r269 ior r267, r275, r267 mov r275, r267.x000 iand r276.x___, r275.xxxx, 
r277.xxxx mov r266, l31.xxxx ishl r276.x___, r266.xxxx, r276.xxxx mov r270, r254.x000 iand r279.x___, r276.xxxx, r270.xxxx mov r280, l46.xxxx ine r279.x___, r279.xxxx, r280.xxxx mov r279.x___, r279.xxxx iadd r272, r272, r281 bitalign r281, r272, r272, r260 bitalign r272, r272, r272, r271 iadd r265, r265, r278 bitalign r278, r265, r265, r260 iand r278, r278, r268 bitalign r265, r265, r265, r271 iand r265, r265, r269 ior r265, r278, r265 mov r278, r265.x000 iand r282.x___, r278.xxxx, r277.xxxx ishl r282.x___, r266.xxxx, r282.xxxx mov r283, r254.z000 iand r284.x___, r282.xxxx, r283.xxxx ine r284.x___, r284.xxxx, r280.xxxx mov r284.x___, r284.xxxx iand r279.x___, r279.xxxx, r284.xxxx iand r281, r281, r268 iand r272, r272, r269 ior r272, r281, r272 mov r281, r272.x000 iand r284.x___, r281.xxxx, r277.xxxx ishl r284.x___, r266.xxxx, r284.xxxx mov r254, r254.y000 iand r285.x___, r284.xxxx, r254.xxxx ine r285.x___, r285.xxxx, r280.xxxx mov r285.x___, r285.xxxx iand r279.x___, r279.xxxx, r285.xxxx iand r279.x___, r279.xxxx, r266.xxxx iand r274, r274, r268 iand r264, r264, r269 ior r264, r274, r264 mov r274, r264.x000 mov r285, l47.xxxx mov r286, r285.xxxx iadd r286, r286.xyz0, r285.000x iadd r286, r286.xy0w, r285.00x0 iadd r285, r286.x0zw, r285.0x00 iadd r273, r273, r285 bitalign r260, r273, r273, r260 bitalign r271, r273, r273, r271 mov r273, l46.xxxx if_logicalnz r279.xxxx iand r273.x___, r274.xxxx, r277.xxxx ishl r273.x___, r266.xxxx, r273.xxxx mov r279, l48.xxxx iand r285.x___, r274.xxxx, r279.xxxx mov r286, l23.xxxx ushr r285.x___, r285.xxxx, r286.xxxx iadd r285.x___, r257.xxxx, r285.xxxx mov r1010.x___, r285.xxxx uav_raw_load_id(11)_cached r1011.x___, r1010.xxxx mov r285.x___, r1011.xxxx iand r285.x___, r285.xxxx, r273.xxxx mov r273, l46.xxxx ieq r285.x___, r285.xxxx, r273.xxxx if_logicalnz r285.xxxx else iand r273.x___, r275.xxxx, r279.xxxx ushr r273.x___, r273.xxxx, r286.xxxx iadd r273.x___, r273.xxxx, r257.xxxx mov r279, l49.xxxx iadd r273.x___, r273.xxxx, r279.xxxx 
mov r1010.x___, r273.xxxx uav_raw_load_id(11)_cached r1011.x___, r1010.xxxx mov r273.x___, r1011.xxxx iand r276.x___, r273.xxxx, r276.xxxx mov r273, l46.xxxx ieq r276.x___, r276.xxxx, r273.xxxx if_logicalnz r276.xxxx else mov r276, l48.xxxx iand r273.x___, r281.xxxx, r276.xxxx mov r279, l23.xxxx ushr r273.x___, r273.xxxx, r279.xxxx iadd r273.x___, r273.xxxx, r257.xxxx mov r285, l50.xxxx iadd r273.x___, r273.xxxx, r285.xxxx mov r1010.x___, r273.xxxx uav_raw_load_id(11)_cached r1011.x___, r1010.xxxx mov r273.x___, r1011.xxxx iand r285.x___, r273.xxxx, r284.xxxx mov r273, l46.xxxx ieq r285.x___, r285.xxxx, r273.xxxx if_logicalnz r285.xxxx else iand r273.x___, r278.xxxx, r276.xxxx ushr r273.x___, r273.xxxx, r279.xxxx iadd r273.x___, r273.xxxx, r257.xxxx mov r276, l51.xxxx iadd r273.x___, r273.xxxx, r276.xxxx mov r1010.x___, r273.xxxx uav_raw_load_id(11)_cached r1011.x___, r1010.xxxx mov r273.x___, r1011.xxxx iand r276.x___, r273.xxxx, r282.xxxx mov r273, l46.xxxx ieq r276.x___, r276.xxxx, r273.xxxx if_logicalnz r276.xxxx else mov r273, l31.xxxx endif endif endif endif else endif mov r276, r265.y000 iand r282.x___, r276.xxxx, r277.xxxx ishl r282.x___, r266.xxxx, r282.xxxx iand r284.x___, r282.xxxx, r283.xxxx ine r284.x___, r284.xxxx, r280.xxxx mov r284.x___, r284.xxxx mov r279, r267.y000 iand r285.x___, r279.xxxx, r277.xxxx ishl r285.x___, r266.xxxx, r285.xxxx iand r286.x___, r285.xxxx, r270.xxxx ine r286.x___, r286.xxxx, r280.xxxx mov r286.x___, r286.xxxx iand r284.x___, r286.xxxx, r284.xxxx mov r286, r272.y000 iand r277.x___, r286.xxxx, r277.xxxx ishl r277.x___, r266.xxxx, r277.xxxx iand r287.x___, r277.xxxx, r254.xxxx ine r280.x___, r287.xxxx, r280.xxxx mov r280.x___, r280.xxxx iand r280.x___, r284.xxxx, r280.xxxx iand r266.x___, r280.xxxx, r266.xxxx mov r280, r264.y000 if_logicalnz r266.xxxx mov r266, l40.xxxx iand r266.x___, r280.xxxx, r266.xxxx mov r287, l31.xxxx ishl r266.x___, r287.xxxx, r266.xxxx mov r287, l48.xxxx iand r288.x___, r280.xxxx, r287.xxxx mov r289, 
l23.xxxx ushr r288.x___, r288.xxxx, r289.xxxx iadd r288.x___, r257.xxxx, r288.xxxx mov r1010.x___, r288.xxxx uav_raw_load_id(11)_cached r1011.x___, r1010.xxxx mov r288.x___, r1011.xxxx iand r266.x___, r288.xxxx, r266.xxxx mov r288, l46.xxxx ieq r266.x___, r266.xxxx, r288.xxxx if_logicalnz r266.xxxx else iand r266.x___, r279.xxxx, r287.xxxx ushr r266.x___, r266.xxxx, r289.xxxx iadd r266.x___, r266.xxxx, r257.xxxx mov r287, l49.xxxx iadd r266.x___, r266.xxxx, r287.xxxx mov r1010.x___, r266.xxxx uav_raw_load_id(11)_cached r1011.x___, r1010.xxxx mov r266.x___, r1011.xxxx iand r266.x___, r266.xxxx, r285.xxxx ieq r266.x___, r266.xxxx, r288.xxxx if_logicalnz r266.xxxx else mov r266, l48.xxxx iand r285.x___, r286.xxxx, r266.xxxx mov r287, l23.xxxx ushr r285.x___, r285.xxxx, r287.xxxx iadd r285.x___, r285.xxxx, r257.xxxx mov r288, l50.xxxx iadd r285.x___, r285.xxxx, r288.xxxx mov r1010.x___, r285.xxxx uav_raw_load_id(11)_cached r1011.x___, r1010.xxxx mov r285.x___, r1011.xxxx iand r277.x___, r285.xxxx, r277.xxxx mov r285, l46.xxxx ieq r277.x___, r277.xxxx, r285.xxxx if_logicalnz r277.xxxx else iand r266.x___, r276.xxxx, r266.xxxx ushr r266.x___, r266.xxxx, r287.xxxx iadd r266.x___, r266.xxxx, r257.xxxx mov r277, l51.xxxx iadd r266.x___, r266.xxxx, r277.xxxx mov r1010.x___, r266.xxxx uav_raw_load_id(11)_cached r1011.x___, r1010.xxxx mov r266.x___, r1011.xxxx iand r266.x___, r266.xxxx, r282.xxxx ieq r266.x___, r266.xxxx, r285.xxxx if_logicalnz r266.xxxx else mov r273, l31.xxxx endif endif endif endif else endif mov r277, r265.z000 mov r282, l40.xxxx iand r285.x___, r277.xxxx, r282.xxxx mov r266, l31.xxxx ishl r285.x___, r266.xxxx, r285.xxxx iand r284.x___, r285.xxxx, r283.xxxx mov r287, l46.xxxx ine r284.x___, r284.xxxx, r287.xxxx mov r284.x___, r284.xxxx mov r288, r267.z000 iand r289.x___, r288.xxxx, r282.xxxx ishl r289.x___, r266.xxxx, r289.xxxx iand r290.x___, r289.xxxx, r270.xxxx ine r290.x___, r290.xxxx, r287.xxxx mov r290.x___, r290.xxxx iand r284.x___, r290.xxxx, 
r284.xxxx mov r290, r272.z000 iand r291.x___, r290.xxxx, r282.xxxx ishl r291.x___, r266.xxxx, r291.xxxx iand r292.x___, r291.xxxx, r254.xxxx ine r292.x___, r292.xxxx, r287.xxxx mov r292.x___, r292.xxxx iand r284.x___, r284.xxxx, r292.xxxx iand r284.x___, r284.xxxx, r266.xxxx mov r292, r264.z000 if_logicalnz r284.xxxx iand r284.x___, r292.xxxx, r282.xxxx ishl r284.x___, r266.xxxx, r284.xxxx mov r293, l48.xxxx iand r294.x___, r292.xxxx, r293.xxxx mov r295, l23.xxxx ushr r294.x___, r294.xxxx, r295.xxxx iadd r294.x___, r257.xxxx, r294.xxxx mov r1010.x___, r294.xxxx uav_raw_load_id(11)_cached r1011.x___, r1010.xxxx mov r294.x___, r1011.xxxx iand r284.x___, r294.xxxx, r284.xxxx ieq r284.x___, r284.xxxx, r287.xxxx if_logicalnz r284.xxxx else iand r284.x___, r288.xxxx, r293.xxxx ushr r284.x___, r284.xxxx, r295.xxxx iadd r284.x___, r284.xxxx, r257.xxxx mov r293, l49.xxxx iadd r284.x___, r284.xxxx, r293.xxxx mov r1010.x___, r284.xxxx uav_raw_load_id(11)_cached r1011.x___, r1010.xxxx mov r284.x___, r1011.xxxx iand r284.x___, r284.xxxx, r289.xxxx mov r289, l46.xxxx ieq r284.x___, r284.xxxx, r289.xxxx if_logicalnz r284.xxxx else mov r284, l48.xxxx iand r293.x___, r290.xxxx, r284.xxxx mov r294, l23.xxxx ushr r293.x___, r293.xxxx, r294.xxxx iadd r293.x___, r293.xxxx, r257.xxxx mov r295, l50.xxxx iadd r293.x___, r293.xxxx, r295.xxxx mov r1010.x___, r293.xxxx uav_raw_load_id(11)_cached r1011.x___, r1010.xxxx mov r293.x___, r1011.xxxx iand r291.x___, r293.xxxx, r291.xxxx ieq r289.x___, r291.xxxx, r289.xxxx if_logicalnz r289.xxxx else iand r284.x___, r277.xxxx, r284.xxxx ushr r284.x___, r284.xxxx, r294.xxxx iadd r284.x___, r284.xxxx, r257.xxxx mov r289, l51.xxxx iadd r284.x___, r284.xxxx, r289.xxxx mov r1010.x___, r284.xxxx uav_raw_load_id(11)_cached r1011.x___, r1010.xxxx mov r284.x___, r1011.xxxx iand r284.x___, r284.xxxx, r285.xxxx mov r285, l46.xxxx ieq r284.x___, r284.xxxx, r285.xxxx if_logicalnz r284.xxxx else mov r273, l31.xxxx endif endif endif endif else endif iand r260, 
r260, r268 iand r268, r271, r269 ior r260, r260, r268 mov r265, r265.w000 iand r268.x___, r265.xxxx, r282.xxxx ishl r268.x___, r266.xxxx, r268.xxxx iand r269.x___, r268.xxxx, r283.xxxx ine r269.x___, r269.xxxx, r287.xxxx mov r269.x___, r269.xxxx mov r267, r267.w000 iand r271.x___, r267.xxxx, r282.xxxx ishl r271.x___, r266.xxxx, r271.xxxx iand r285.x___, r271.xxxx, r270.xxxx ine r285.x___, r285.xxxx, r287.xxxx mov r285.x___, r285.xxxx iand r269.x___, r285.xxxx, r269.xxxx mov r272, r272.w000 iand r282.x___, r272.xxxx, r282.xxxx ishl r282.x___, r266.xxxx, r282.xxxx iand r285.x___, r282.xxxx, r254.xxxx ine r285.x___, r285.xxxx, r287.xxxx mov r285.x___, r285.xxxx iand r269.x___, r269.xxxx, r285.xxxx iand r266.x___, r269.xxxx, r266.xxxx mov r264, r264.w000 if_logicalnz r266.xxxx mov r266, l40.xxxx iand r266.x___, r264.xxxx, r266.xxxx mov r269, l31.xxxx ishl r266.x___, r269.xxxx, r266.xxxx mov r269, l48.xxxx iand r284.x___, r264.xxxx, r269.xxxx mov r285, l23.xxxx ushr r284.x___, r284.xxxx, r285.xxxx iadd r284.x___, r257.xxxx, r284.xxxx mov r1010.x___, r284.xxxx uav_raw_load_id(11)_cached r1011.x___, r1010.xxxx mov r284.x___, r1011.xxxx iand r266.x___, r284.xxxx, r266.xxxx mov r284, l46.xxxx ieq r266.x___, r266.xxxx, r284.xxxx if_logicalnz r266.xxxx mov r3725, l31.xxxx else iand r266.x___, r267.xxxx, r269.xxxx ushr r266.x___, r266.xxxx, r285.xxxx iadd r266.x___, r266.xxxx, r257.xxxx mov r269, l49.xxxx iadd r266.x___, r266.xxxx, r269.xxxx mov r1010.x___, r266.xxxx uav_raw_load_id(11)_cached r1011.x___, r1010.xxxx mov r266.x___, r1011.xxxx iand r266.x___, r266.xxxx, r271.xxxx ieq r266.x___, r266.xxxx, r284.xxxx if_logicalnz r266.xxxx mov r3725, l31.xxxx else mov r3725, l46.xxxx mov r266, l48.xxxx iand r269.x___, r272.xxxx, r266.xxxx mov r271, l23.xxxx ushr r269.x___, r269.xxxx, r271.xxxx iadd r269.x___, r269.xxxx, r257.xxxx mov r284, l50.xxxx iadd r269.x___, r269.xxxx, r284.xxxx mov r1010.x___, r269.xxxx uav_raw_load_id(11)_cached r1011.x___, r1010.xxxx mov r269.x___, 
r1011.xxxx iand r269.x___, r269.xxxx, r282.xxxx mov r282, l46.xxxx ieq r269.x___, r269.xxxx, r282.xxxx if_logicalnz r269.xxxx mov r3725, l31.xxxx else iand r266.x___, r265.xxxx, r266.xxxx ushr r266.x___, r266.xxxx, r271.xxxx iadd r266.x___, r266.xxxx, r257.xxxx mov r269, l51.xxxx iadd r266.x___, r266.xxxx, r269.xxxx mov r1010.x___, r266.xxxx uav_raw_load_id(11)_cached r1011.x___, r1010.xxxx mov r266.x___, r1011.xxxx iand r266.x___, r266.xxxx, r268.xxxx ieq r266.x___, r266.xxxx, r282.xxxx if_logicalnz r266.xxxx mov r268, l46.xxxx ieq r268.x___, r273.xxxx, r268.xxxx if_logicalnz r268.xxxx else mov r268, l31.xxxx ieq r268.x___, r273.xxxx, r268.xxxx if_logicalnz r268.xxxx mov r266, l31.xxxx mov r1011.x___, r266.xxxx mov r1010.x___, r256.xxxx uav_raw_store_id(11) mem.x___, r1010.xxxx, r1011.xxxx mov r268, r1021.xyz0 mov r268, r268.x000 mov r269, l12.xxxx ishl r268.x___, r268.xxxx, r269.xxxx iadd r268.x___, r258.xxxx, r268.xxxx mov r1011.x___, r266.xxxx mov r1010.x___, r268.xxxx uav_raw_store_id(11) mem.x___, r1010.xxxx, r1011.xxxx else endif mov r274, r274.xxxx iadd r274, r274.x0zw, r275.0x00 mov r275, r274.x000 mov r275, r275.xxxx iadd r275, r275.xy0w, r281.00x0 mov r274, r274.y000 iadd r274, r275.x0zw, r274.0x00 mov r275, r274.x000 mov r275, r275.xxxx iadd r275, r275.xyz0, r278.000x mov r278, r274.z000 iadd r275, r275.xy0w, r278.00x0 mov r274, r274.y000 iadd r274, r275.x0zw, r274.0x00 mov r275, r1021.xyz0 mov r275, r275.x000 mov r278, l21.xxxx imul r275.x___, r275.xxxx, r278.xxxx mov r281, l30.xxxx ishl r275.x___, r275.xxxx, r281.xxxx iadd r275.x___, r263.xxxx, r275.xxxx mov r1011, r274 mov r1010.x___, r275.xxxx uav_raw_store_id(11) mem, r1010.xxxx, r1011 mov r274, r260.x000 mov r274, r274.xxxx iadd r274, r274.x0zw, r280.0x00 mov r275, r274.x000 mov r275, r275.xxxx iadd r275, r275.xy0w, r279.00x0 mov r274, r274.y000 iadd r274, r275.x0zw, r274.0x00 mov r275, r274.x000 mov r275, r275.xxxx iadd r275, r275.xyz0, r286.000x mov r279, r274.z000 iadd r275, r275.xy0w, 
r279.00x0 mov r274, r274.y000 iadd r274, r275.x0zw, r274.0x00 mov r275, r1021.xyz0 mov r275, r275.x000 imul r275.x___, r275.xxxx, r278.xxxx mov r279, l31.xxxx ior r275.x___, r275.xxxx, r279.xxxx ishl r275.x___, r275.xxxx, r281.xxxx iadd r275.x___, r263.xxxx, r275.xxxx mov r1011, r274 mov r1010.x___, r275.xxxx uav_raw_store_id(11) mem, r1010.xxxx, r1011 mov r274, r260.y000 mov r275, r276.xxxx iadd r274, r275.x0zw, r274.0x00 mov r275, r274.x000 mov r275, r275.xxxx iadd r275, r275.xy0w, r292.00x0 mov r274, r274.y000 iadd r274, r275.x0zw, r274.0x00 mov r275, r274.x000 mov r275, r275.xxxx iadd r275, r275.xyz0, r288.000x mov r276, r274.z000 iadd r275, r275.xy0w, r276.00x0 mov r274, r274.y000 iadd r274, r275.x0zw, r274.0x00 mov r275, r1021.xyz0 mov r275, r275.x000 imul r275.x___, r275.xxxx, r278.xxxx ishl r275.x___, r275.xxxx, r281.xxxx iadd r275.x___, r275.xxxx, r263.xxxx mov r276, l11.xxxx iadd r275.x___, r275.xxxx, r276.xxxx mov r1011, r274 mov r1010.x___, r275.xxxx uav_raw_store_id(11) mem, r1010.xxxx, r1011 mov r274, r290.xxxx iadd r274, r274.x0zw, r277.0x00 mov r275, r274.x000 mov r275, r275.xxxx mov r276, r260.z000 iadd r275, r275.xy0w, r276.00x0 mov r274, r274.y000 iadd r274, r275.x0zw, r274.0x00 mov r275, r274.x000 mov r275, r275.xxxx iadd r264, r275.xyz0, r264.000x mov r275, r274.z000 iadd r264, r264.xy0w, r275.00x0 mov r274, r274.y000 iadd r264, r264.x0zw, r274.0x00 mov r274, r1021.xyz0 mov r274, r274.x000 imul r274.x___, r274.xxxx, r278.xxxx ishl r274.x___, r274.xxxx, r281.xxxx iadd r274.x___, r274.xxxx, r263.xxxx mov r275, l52.xxxx iadd r274.x___, r274.xxxx, r275.xxxx mov r1011, r264 mov r1010.x___, r274.xxxx uav_raw_store_id(11) mem, r1010.xxxx, r1011 mov r264, r267.xxxx iadd r264, r264.x0zw, r272.0x00 mov r267, r264.x000 mov r267, r267.xxxx iadd r265, r267.xy0w, r265.00x0 mov r264, r264.y000 iadd r264, r265.x0zw, r264.0x00 mov r265, r264.x000 mov r265, r265.xxxx mov r260, r260.w000 iadd r260, r265.xyz0, r260.000x mov r265, r264.z000 iadd r260, r260.xy0w, 
r265.00x0 mov r264, r264.y000 iadd r260, r260.x0zw, r264.0x00 mov r264, r1021.xyz0 mov r264, r264.x000 imul r264.x___, r264.xxxx, r278.xxxx ishl r264.x___, r264.xxxx, r281.xxxx iadd r264.x___, r264.xxxx, r263.xxxx mov r265, l53.xxxx iadd r264.x___, r264.xxxx, r265.xxxx mov r1011, r260 mov r1010.x___, r264.xxxx uav_raw_store_id(11) mem, r1010.xxxx, r1011 endif else mov r266, l31.xxxx mov r1011.x___, r266.xxxx mov r1010.x___, r256.xxxx uav_raw_store_id(11) mem.x___, r1010.xxxx, r1011.xxxx mov r268, r1021.xyz0 mov r268, r268.x000 mov r269, l12.xxxx ishl r268.x___, r268.xxxx, r269.xxxx iadd r268.x___, r258.xxxx, r268.xxxx mov r1011.x___, r266.xxxx mov r1010.x___, r268.xxxx uav_raw_store_id(11) mem.x___, r1010.xxxx, r1011.xxxx mov r274, r274.xxxx iadd r274, r274.x0zw, r275.0x00 mov r275, r274.x000 mov r275, r275.xxxx iadd r275, r275.xy0w, r281.00x0 mov r274, r274.y000 iadd r274, r275.x0zw, r274.0x00 mov r275, r274.x000 mov r275, r275.xxxx iadd r275, r275.xyz0, r278.000x mov r278, r274.z000 iadd r275, r275.xy0w, r278.00x0 mov r274, r274.y000 iadd r274, r275.x0zw, r274.0x00 mov r275, r1021.xyz0 mov r275, r275.x000 mov r278, l21.xxxx imul r275.x___, r275.xxxx, r278.xxxx mov r281, l30.xxxx ishl r275.x___, r275.xxxx, r281.xxxx iadd r275.x___, r263.xxxx, r275.xxxx mov r1011, r274 mov r1010.x___, r275.xxxx uav_raw_store_id(11) mem, r1010.xxxx, r1011 mov r274, r260.x000 mov r274, r274.xxxx iadd r274, r274.x0zw, r280.0x00 mov r275, r274.x000 mov r275, r275.xxxx iadd r275, r275.xy0w, r279.00x0 mov r274, r274.y000 iadd r274, r275.x0zw, r274.0x00 mov r275, r274.x000 mov r275, r275.xxxx iadd r275, r275.xyz0, r286.000x mov r279, r274.z000 iadd r275, r275.xy0w, r279.00x0 mov r274, r274.y000 iadd r274, r275.x0zw, r274.0x00 mov r275, r1021.xyz0 mov r275, r275.x000 imul r275.x___, r275.xxxx, r278.xxxx mov r279, l31.xxxx ior r275.x___, r275.xxxx, r279.xxxx ishl r275.x___, r275.xxxx, r281.xxxx iadd r275.x___, r263.xxxx, r275.xxxx mov r1011, r274 mov r1010.x___, r275.xxxx 
uav_raw_store_id(11) mem, r1010.xxxx, r1011 mov r274, r260.y000 mov r275, r276.xxxx iadd r274, r275.x0zw, r274.0x00 mov r275, r274.x000 mov r275, r275.xxxx iadd r275, r275.xy0w, r292.00x0 mov r274, r274.y000 iadd r274, r275.x0zw, r274.0x00 mov r275, r274.x000 mov r275, r275.xxxx iadd r275, r275.xyz0, r288.000x mov r276, r274.z000 iadd r275, r275.xy0w, r276.00x0 mov r274, r274.y000 iadd r274, r275.x0zw, r274.0x00 mov r275, r1021.xyz0 mov r275, r275.x000 imul r275.x___, r275.xxxx, r278.xxxx ishl r275.x___, r275.xxxx, r281.xxxx iadd r275.x___, r275.xxxx, r263.xxxx mov r276, l11.xxxx iadd r275.x___, r275.xxxx, r276.xxxx mov r1011, r274 mov r1010.x___, r275.xxxx uav_raw_store_id(11) mem, r1010.xxxx, r1011 mov r274, r290.xxxx iadd r274, r274.x0zw, r277.0x00 mov r275, r274.x000 mov r275, r275.xxxx mov r276, r260.z000 iadd r275, r275.xy0w, r276.00x0 mov r274, r274.y000 iadd r274, r275.x0zw, r274.0x00 mov r275, r274.x000 mov r275, r275.xxxx iadd r264, r275.xyz0, r264.000x mov r275, r274.z000 iadd r264, r264.xy0w, r275.00x0 mov r274, r274.y000 iadd r264, r264.x0zw, r274.0x00 mov r274, r1021.xyz0 mov r274, r274.x000 imul r274.x___, r274.xxxx, r278.xxxx ishl r274.x___, r274.xxxx, r281.xxxx iadd r274.x___, r274.xxxx, r263.xxxx mov r275, l52.xxxx iadd r274.x___, r274.xxxx, r275.xxxx mov r1011, r264 mov r1010.x___, r274.xxxx uav_raw_store_id(11) mem, r1010.xxxx, r1011 mov r264, r267.xxxx iadd r264, r264.x0zw, r272.0x00 mov r267, r264.x000 mov r267, r267.xxxx iadd r265, r267.xy0w, r265.00x0 mov r264, r264.y000 iadd r264, r265.x0zw, r264.0x00 mov r265, r264.x000 mov r265, r265.xxxx mov r260, r260.w000 iadd r260, r265.xyz0, r260.000x mov r265, r264.z000 iadd r260, r260.xy0w, r265.00x0 mov r264, r264.y000 iadd r260, r260.x0zw, r264.0x00 mov r264, r1021.xyz0 mov r264, r264.x000 imul r264.x___, r264.xxxx, r278.xxxx ishl r264.x___, r264.xxxx, r281.xxxx iadd r264.x___, r264.xxxx, r263.xxxx mov r265, l53.xxxx iadd r264.x___, r264.xxxx, r265.xxxx mov r1011, r260 mov r1010.x___, r264.xxxx 
uav_raw_store_id(11) mem, r1010.xxxx, r1011 endif endif endif endif else mov r3725, l31.xxxx endif if_logicalnz r3725.xxxx mov r268, l46.xxxx ieq r268.x___, r273.xxxx, r268.xxxx if_logicalnz r268.xxxx else mov r268, l31.xxxx ieq r268.x___, r273.xxxx, r268.xxxx if_logicalnz r268.xxxx mov r266, l31.xxxx mov r1011.x___, r266.xxxx mov r1010.x___, r256.xxxx uav_raw_store_id(11) mem.x___, r1010.xxxx, r1011.xxxx mov r268, r1021.xyz0 mov r268, r268.x000 mov r269, l12.xxxx ishl r268.x___, r268.xxxx, r269.xxxx iadd r268.x___, r258.xxxx, r268.xxxx mov r1011.x___, r266.xxxx mov r1010.x___, r268.xxxx uav_raw_store_id(11) mem.x___, r1010.xxxx, r1011.xxxx else endif mov r274, r274.xxxx iadd r274, r274.x0zw, r275.0x00 mov r275, r274.x000 mov r275, r275.xxxx iadd r275, r275.xy0w, r281.00x0 mov r274, r274.y000 iadd r274, r275.x0zw, r274.0x00 mov r275, r274.x000 mov r275, r275.xxxx iadd r275, r275.xyz0, r278.000x mov r278, r274.z000 iadd r275, r275.xy0w, r278.00x0 mov r274, r274.y000 iadd r274, r275.x0zw, r274.0x00 mov r275, r1021.xyz0 mov r275, r275.x000 mov r278, l21.xxxx imul r275.x___, r275.xxxx, r278.xxxx mov r281, l30.xxxx ishl r275.x___, r275.xxxx, r281.xxxx iadd r275.x___, r263.xxxx, r275.xxxx mov r1011, r274 mov r1010.x___, r275.xxxx uav_raw_store_id(11) mem, r1010.xxxx, r1011 mov r274, r260.x000 mov r274, r274.xxxx iadd r274, r274.x0zw, r280.0x00 mov r275, r274.x000 mov r275, r275.xxxx iadd r275, r275.xy0w, r279.00x0 mov r274, r274.y000 iadd r274, r275.x0zw, r274.0x00 mov r275, r274.x000 mov r275, r275.xxxx iadd r275, r275.xyz0, r286.000x mov r279, r274.z000 iadd r275, r275.xy0w, r279.00x0 mov r274, r274.y000 iadd r274, r275.x0zw, r274.0x00 mov r275, r1021.xyz0 mov r275, r275.x000 imul r275.x___, r275.xxxx, r278.xxxx mov r279, l31.xxxx ior r275.x___, r275.xxxx, r279.xxxx ishl r275.x___, r275.xxxx, r281.xxxx iadd r275.x___, r263.xxxx, r275.xxxx mov r1011, r274 mov r1010.x___, r275.xxxx uav_raw_store_id(11) mem, r1010.xxxx, r1011 mov r274, r260.y000 mov r275, r276.xxxx iadd 
r274, r275.x0zw, r274.0x00 mov r275, r274.x000 mov r275, r275.xxxx iadd r275, r275.xy0w, r292.00x0 mov r274, r274.y000 iadd r274, r275.x0zw, r274.0x00 mov r275, r274.x000 mov r275, r275.xxxx iadd r275, r275.xyz0, r288.000x mov r276, r274.z000 iadd r275, r275.xy0w, r276.00x0 mov r274, r274.y000 iadd r274, r275.x0zw, r274.0x00 mov r275, r1021.xyz0 mov r275, r275.x000 imul r275.x___, r275.xxxx, r278.xxxx ishl r275.x___, r275.xxxx, r281.xxxx iadd r275.x___, r275.xxxx, r263.xxxx mov r276, l11.xxxx iadd r275.x___, r275.xxxx, r276.xxxx mov r1011, r274 mov r1010.x___, r275.xxxx uav_raw_store_id(11) mem, r1010.xxxx, r1011 mov r274, r290.xxxx iadd r274, r274.x0zw, r277.0x00 mov r275, r274.x000 mov r275, r275.xxxx mov r276, r260.z000 iadd r275, r275.xy0w, r276.00x0 mov r274, r274.y000 iadd r274, r275.x0zw, r274.0x00 mov r275, r274.x000 mov r275, r275.xxxx iadd r264, r275.xyz0, r264.000x mov r275, r274.z000 iadd r264, r264.xy0w, r275.00x0 mov r274, r274.y000 iadd r264, r264.x0zw, r274.0x00 mov r274, r1021.xyz0 mov r274, r274.x000 imul r274.x___, r274.xxxx, r278.xxxx ishl r274.x___, r274.xxxx, r281.xxxx iadd r274.x___, r274.xxxx, r263.xxxx mov r275, l52.xxxx iadd r274.x___, r274.xxxx, r275.xxxx mov r1011, r264 mov r1010.x___, r274.xxxx uav_raw_store_id(11) mem, r1010.xxxx, r1011 mov r264, r267.xxxx iadd r264, r264.x0zw, r272.0x00 mov r267, r264.x000 mov r267, r267.xxxx iadd r265, r267.xy0w, r265.00x0 mov r264, r264.y000 iadd r264, r265.x0zw, r264.0x00 mov r265, r264.x000 mov r265, r265.xxxx mov r260, r260.w000 iadd r260, r265.xyz0, r260.000x mov r265, r264.z000 iadd r260, r260.xy0w, r265.00x0 mov r264, r264.y000 iadd r260, r260.x0zw, r264.0x00 mov r264, r1021.xyz0 mov r264, r264.x000 imul r264.x___, r264.xxxx, r278.xxxx ishl r264.x___, r264.xxxx, r281.xxxx iadd r264.x___, r264.xxxx, r263.xxxx mov r265, l53.xxxx iadd r264.x___, r264.xxxx, r265.xxxx mov r1011, r260 mov r1010.x___, r264.xxxx uav_raw_store_id(11) mem, r1010.xxxx, r1011 endif else endif mov r260, l13.xxxx 
ubit_extract r264.x___, r260.xxxx, r260.xxxx, r255.xxxx mov r265, l14.xxxx ubit_extract r266.x___, r260.xxxx, r265.xxxx, r255.xxxx mov r267, l10.xxxx ishl r267.x___, r261.xxxx, r267.xxxx mov r268, l11.xxxx iadd r267.x___, r267.xxxx, r268.xxxx mov r268, r267.xxxx iadd r268, r268.xyz0, r267.000x iadd r268, r268.xy0w, r267.00x0 iadd r267, r268.x0zw, r267.0x00 mov r268, l16.xxxx ushr r268.x___, r255.xxxx, r268.xxxx ushr r265.x___, r255.xxxx, r265.xxxx ushr r269.x___, r255.xxxx, r260.xxxx mov r271, l17.xxxx iand r271.x___, r255.xxxx, r271.xxxx mov r272, l15.xxxx ilt r272.x___, r272.xxxx, r261.xxxx mov r273, r262.w000 mov r274, r273.xxxx iadd r274, r274.xyz0, r273.000x iadd r274, r274.xy0w, r273.00x0 iadd r273, r274.x0zw, r273.0x00 mov r274, r262.z000 mov r275, r274.xxxx iadd r275, r275.xyz0, r274.000x iadd r275, r275.xy0w, r274.00x0 iadd r274, r275.x0zw, r274.0x00 mov r275, r262.y000 mov r276, r275.xxxx iadd r276, r276.xyz0, r275.000x iadd r276, r276.xy0w, r275.00x0 iadd r275, r276.x0zw, r275.0x00 mov r262, r262.x000 mov r276, r262.xxxx iadd r276, r276.xyz0, r262.000x iadd r276, r276.xy0w, r262.00x0 iadd r262, r276.x0zw, r262.0x00 if_logicalnz r272.xxxx ilt r260.x___, r260.xxxx, r261.xxxx if_logicalnz r260.xxxx mov r260, l18.xxxx ieq r260.x___, r261.xxxx, r260.xxxx if_logicalnz r260.xxxx mov r265, l14.xxxx ishl r265.x___, r264.xxxx, r265.xxxx mov r266, r265.xxxx iadd r266, r266.xyz0, r265.000x iadd r266, r266.xy0w, r265.00x0 iadd r265, r266.x0zw, r265.0x00 mov r266, l13.xxxx ishl r269.x___, r271.xxxx, r266.xxxx mov r271, r269.xxxx iadd r271, r271.xyz0, r269.000x iadd r271, r271.xy0w, r269.00x0 iadd r269, r271.x0zw, r269.0x00 ior r259, r269, r259 ior r259, r259, r265 ishl r255.x___, r255.xxxx, r266.xxxx mov r265, l19.xxxx iand r255.x___, r255.xxxx, r265.xxxx mov r265, r255.xxxx iadd r265, r265.xyz0, r255.000x iadd r265, r265.xy0w, r255.00x0 iadd r255, r265.x0zw, r255.0x00 ior r274, r259, r255 mov r255, l20.xxxx ior r255.x___, r268.xxxx, r255.xxxx mov r259, r255.xxxx iadd 
r259, r259.xyz0, r255.000x iadd r259, r259.xy0w, r255.00x0 iadd r273, r259.x0zw, r255.0x00 else mov r264, l21.xxxx ieq r264.x___, r261.xxxx, r264.xxxx if_logicalnz r264.xxxx mov r266, l14.xxxx ishl r255.x___, r255.xxxx, r266.xxxx mov r269, l19.xxxx iand r255.x___, r255.xxxx, r269.xxxx mov r269, r255.xxxx iadd r269, r269.xyz0, r255.000x iadd r269, r269.xy0w, r255.00x0 iadd r255, r269.x0zw, r255.0x00 ishl r266.x___, r271.xxxx, r266.xxxx mov r269, r266.xxxx iadd r269, r269.xyz0, r266.000x iadd r269, r269.xy0w, r266.00x0 iadd r266, r269.x0zw, r266.0x00 mov r269, l13.xxxx mov r273, r269.xxxx iadd r273, r273.xyz0, r269.000x iadd r273, r273.xy0w, r269.00x0 iadd r269, r273.x0zw, r269.0x00 ishl r259, r259, r269 ior r259, r259, r266 ior r255, r259, r255 ior r274, r274, r255 mov r255, l22.xxxx ior r255.x___, r265.xxxx, r255.xxxx mov r259, r255.xxxx iadd r259, r259.xyz0, r255.000x iadd r259, r259.xy0w, r255.00x0 iadd r273, r259.x0zw, r255.0x00 else mov r265, l23.xxxx ieq r265.x___, r261.xxxx, r265.xxxx if_logicalnz r265.xxxx mov r273, l13.xxxx ishl r273.x___, r266.xxxx, r273.xxxx mov r260, l24.xxxx iand r260.x___, r269.xxxx, r260.xxxx ior r273.x___, r260.xxxx, r273.xxxx mov r260, l25.xxxx ior r273.x___, r273.xxxx, r260.xxxx mov r260, r273.xxxx iadd r260, r260.xyz0, r273.000x iadd r260, r260.xy0w, r273.00x0 iadd r273, r260.x0zw, r273.0x00 mov r260, l16.xxxx ishl r255.x___, r255.xxxx, r260.xxxx mov r260, r255.xxxx iadd r260, r260.xyz0, r255.000x iadd r260, r260.xy0w, r255.00x0 iadd r255, r260.x0zw, r255.0x00 mov r260, l14.xxxx mov r264, r260.xxxx iadd r264, r264.xyz0, r260.000x iadd r264, r264.xy0w, r260.00x0 iadd r260, r264.x0zw, r260.0x00 ishl r259, r259, r260 ior r255, r259, r255 ior r274, r274, r255 else endif endif endif else mov r268, l26.xxxx ieq r268.x___, r261.xxxx, r268.xxxx if_logicalnz r268.xxxx mov r264, l14.xxxx ishl r255.x___, r255.xxxx, r264.xxxx mov r266, l19.xxxx iand r255.x___, r255.xxxx, r266.xxxx mov r266, r255.xxxx iadd r266, r266.xyz0, r255.000x iadd r266, 
r266.xy0w, r255.00x0 iadd r255, r266.x0zw, r255.0x00 ishl r264.x___, r271.xxxx, r264.xxxx mov r266, r264.xxxx iadd r266, r266.xyz0, r264.000x iadd r266, r266.xy0w, r264.00x0 iadd r264, r266.x0zw, r264.0x00 mov r266, l13.xxxx mov r268, r266.xxxx iadd r268, r268.xyz0, r266.000x iadd r268, r268.xy0w, r266.00x0 iadd r266, r268.x0zw, r266.0x00 ishl r259, r259, r266 ior r259, r259, r264 ior r255, r259, r255 ior r275, r275, r255 mov r255, l22.xxxx ior r265.x___, r265.xxxx, r255.xxxx mov r255, r265.xxxx iadd r255, r255.xyz0, r265.000x iadd r255, r255.xy0w, r265.00x0 iadd r274, r255.x0zw, r265.0x00 else mov r265, l27.xxxx ieq r265.x___, r261.xxxx, r265.xxxx if_logicalnz r265.xxxx mov r265, l13.xxxx ishl r265.x___, r266.xxxx, r265.xxxx mov r264, l24.xxxx iand r264.x___, r269.xxxx, r264.xxxx ior r265.x___, r264.xxxx, r265.xxxx mov r264, l25.xxxx ior r265.x___, r265.xxxx, r264.xxxx mov r264, r265.xxxx iadd r264, r264.xyz0, r265.000x iadd r264, r264.xy0w, r265.00x0 iadd r274, r264.x0zw, r265.0x00 mov r265, l16.xxxx ishl r265.x___, r255.xxxx, r265.xxxx mov r255, r265.xxxx iadd r255, r255.xyz0, r265.000x iadd r255, r255.xy0w, r265.00x0 iadd r265, r255.x0zw, r265.0x00 mov r255, l14.xxxx mov r264, r255.xxxx iadd r264, r264.xyz0, r255.000x iadd r264, r264.xy0w, r255.00x0 iadd r255, r264.x0zw, r255.0x00 ishl r255, r259, r255 ior r265, r255, r265 ior r275, r275, r265 else mov r265, l13.xxxx ieq r261.x___, r261.xxxx, r265.xxxx if_logicalnz r261.xxxx ishl r264.x___, r264.xxxx, r265.xxxx mov r265, l28.xxxx iand r255.x___, r255.xxxx, r265.xxxx ior r255.x___, r255.xxxx, r264.xxxx mov r264, l14.xxxx ishl r264.x___, r266.xxxx, r264.xxxx ior r255.x___, r255.xxxx, r264.xxxx mov r264, r255.xxxx iadd r264, r264.xyz0, r255.000x iadd r264, r264.xy0w, r255.00x0 iadd r274, r264.x0zw, r255.0x00 mov r255, l29.xxxx mov r264, r255.xxxx iadd r264, r264.xyz0, r255.000x iadd r264, r264.xy0w, r255.00x0 iadd r273, r264.x0zw, r255.0x00 mov r255, l16.xxxx mov r264, r255.xxxx iadd r264, r264.xyz0, r255.000x 
iadd r264, r264.xy0w, r255.00x0 iadd r255, r264.x0zw, r255.0x00 ishl r255, r259, r255 ior r275, r275, r255 else endif endif endif endif else mov r260, l12.xxxx ilt r260.x___, r260.xxxx, r261.xxxx if_logicalnz r260.xxxx mov r260, l10.xxxx ieq r260.x___, r261.xxxx, r260.xxxx if_logicalnz r260.xxxx mov r265, l13.xxxx ishl r265.x___, r266.xxxx, r265.xxxx mov r264, l24.xxxx iand r264.x___, r269.xxxx, r264.xxxx ior r265.x___, r264.xxxx, r265.xxxx mov r264, l25.xxxx ior r265.x___, r265.xxxx, r264.xxxx mov r264, r265.xxxx iadd r264, r264.xyz0, r265.000x iadd r264, r264.xy0w, r265.00x0 iadd r275, r264.x0zw, r265.0x00 mov r265, l16.xxxx ishl r265.x___, r255.xxxx, r265.xxxx mov r255, r265.xxxx iadd r255, r255.xyz0, r265.000x iadd r255, r255.xy0w, r265.00x0 iadd r265, r255.x0zw, r265.0x00 mov r255, l14.xxxx mov r264, r255.xxxx iadd r264, r264.xyz0, r255.000x iadd r264, r264.xy0w, r255.00x0 iadd r255, r264.x0zw, r255.0x00 ishl r255, r259, r255 ior r265, r255, r265 ior r262, r262, r265 else mov r260, l30.xxxx ieq r260.x___, r261.xxxx, r260.xxxx if_logicalnz r260.xxxx mov r265, l13.xxxx ishl r265.x___, r264.xxxx, r265.xxxx mov r264, l28.xxxx iand r255.x___, r255.xxxx, r264.xxxx ior r265.x___, r255.xxxx, r265.xxxx mov r255, l14.xxxx ishl r255.x___, r266.xxxx, r255.xxxx ior r265.x___, r265.xxxx, r255.xxxx mov r255, r265.xxxx iadd r255, r255.xyz0, r265.000x iadd r255, r255.xy0w, r265.00x0 iadd r275, r255.x0zw, r265.0x00 mov r265, l29.xxxx mov r255, r265.xxxx iadd r255, r255.xyz0, r265.000x iadd r255, r255.xy0w, r265.00x0 iadd r274, r255.x0zw, r265.0x00 mov r265, l16.xxxx mov r255, r265.xxxx iadd r255, r255.xyz0, r265.000x iadd r255, r255.xy0w, r265.00x0 iadd r265, r255.x0zw, r265.0x00 ishl r265, r259, r265 ior r262, r262, r265 else mov r260, l15.xxxx ieq r260.x___, r261.xxxx, r260.xxxx if_logicalnz r260.xxxx mov r265, l14.xxxx ishl r265.x___, r264.xxxx, r265.xxxx mov r264, r265.xxxx iadd r264, r264.xyz0, r265.000x iadd r264, r264.xy0w, r265.00x0 iadd r265, r264.x0zw, r265.0x00 mov 
r264, l13.xxxx ishl r266.x___, r271.xxxx, r264.xxxx mov r269, r266.xxxx iadd r269, r269.xyz0, r266.000x iadd r269, r269.xy0w, r266.00x0 iadd r266, r269.x0zw, r266.0x00 ior r259, r266, r259 ior r265, r259, r265 ishl r255.x___, r255.xxxx, r264.xxxx mov r259, l19.xxxx iand r255.x___, r255.xxxx, r259.xxxx mov r259, r255.xxxx iadd r259, r259.xyz0, r255.000x iadd r259, r259.xy0w, r255.00x0 iadd r255, r259.x0zw, r255.0x00 ior r275, r265, r255 mov r265, l20.xxxx ior r265.x___, r268.xxxx, r265.xxxx mov r255, r265.xxxx iadd r255, r255.xyz0, r265.000x iadd r255, r255.xy0w, r265.00x0 iadd r274, r255.x0zw, r265.0x00 else endif endif endif else mov r260, l31.xxxx ieq r260.x___, r261.xxxx, r260.xxxx if_logicalnz r260.xxxx mov r265, l14.xxxx ishl r265.x___, r264.xxxx, r265.xxxx mov r262, r265.xxxx iadd r262, r262.xyz0, r265.000x iadd r262, r262.xy0w, r265.00x0 iadd r265, r262.x0zw, r265.0x00 mov r262, l13.xxxx ishl r264.x___, r271.xxxx, r262.xxxx mov r266, r264.xxxx iadd r266, r266.xyz0, r264.000x iadd r266, r266.xy0w, r264.00x0 iadd r264, r266.x0zw, r264.0x00 ior r259, r264, r259 ior r265, r259, r265 ishl r255.x___, r255.xxxx, r262.xxxx mov r259, l19.xxxx iand r255.x___, r255.xxxx, r259.xxxx mov r259, r255.xxxx iadd r259, r259.xyz0, r255.000x iadd r259, r259.xy0w, r255.00x0 iadd r255, r259.x0zw, r255.0x00 ior r262, r265, r255 mov r265, l20.xxxx ior r265.x___, r268.xxxx, r265.xxxx mov r255, r265.xxxx iadd r255, r255.xyz0, r265.000x iadd r255, r255.xy0w, r265.00x0 iadd r275, r255.x0zw, r265.0x00 else mov r260, l12.xxxx ieq r260.x___, r261.xxxx, r260.xxxx if_logicalnz r260.xxxx mov r264, l14.xxxx ishl r255.x___, r255.xxxx, r264.xxxx mov r266, l19.xxxx iand r255.x___, r255.xxxx, r266.xxxx mov r266, r255.xxxx iadd r266, r266.xyz0, r255.000x iadd r266, r266.xy0w, r255.00x0 iadd r255, r266.x0zw, r255.0x00 ishl r264.x___, r271.xxxx, r264.xxxx mov r266, r264.xxxx iadd r266, r266.xyz0, r264.000x iadd r266, r266.xy0w, r264.00x0 iadd r264, r266.x0zw, r264.0x00 mov r266, l13.xxxx mov r268, 
r266.xxxx iadd r268, r268.xyz0, r266.000x iadd r268, r268.xy0w, r266.00x0 iadd r266, r268.x0zw, r266.0x00 ishl r259, r259, r266 ior r259, r259, r264 ior r255, r259, r255 ior r262, r262, r255 mov r255, l22.xxxx ior r265.x___, r265.xxxx, r255.xxxx mov r255, r265.xxxx iadd r255, r255.xyz0, r265.000x iadd r255, r255.xy0w, r265.00x0 iadd r275, r255.x0zw, r265.0x00 else endif endif endif endif mov r265, l24.xxxx mov r255, r265.xxxx iadd r255, r255.xyz0, r265.000x iadd r255, r255.xy0w, r265.00x0 iadd r265, r255.x0zw, r265.0x00 mov r255, l16.xxxx mov r259, r255.xxxx iadd r259, r259.xyz0, r255.000x iadd r259, r259.xy0w, r255.00x0 iadd r255, r259.x0zw, r255.0x00 bitalign r259, r262, r262, r255 iand r259, r259, r265 mov r264, l32.xxxx mov r266, r264.xxxx iadd r266, r266.xyz0, r264.000x iadd r266, r266.xy0w, r264.00x0 iadd r264, r266.x0zw, r264.0x00 mov r266, l13.xxxx mov r268, r266.xxxx iadd r268, r268.xyz0, r266.000x iadd r268, r268.xy0w, r266.00x0 iadd r266, r268.x0zw, r266.0x00 bitalign r262, r262, r262, r266 iand r262, r262, r264 ior r259, r259, r262 mov r262, l33.xxxx mov r268, r262.xxxx iadd r268, r268.xyz0, r262.000x iadd r268, r268.xy0w, r262.00x0 iadd r262, r268.x0zw, r262.0x00 mov r268, l34.xxxx mov r269, r268.xxxx iadd r269, r269.xyz0, r268.000x iadd r269, r269.xy0w, r268.00x0 iadd r268, r269.x0zw, r268.0x00 bitalign r269, r268, r268, r262 mov r271, l35.xxxx mov r260, r271.xxxx iadd r260, r260.xyz0, r271.000x iadd r260, r260.xy0w, r271.00x0 iadd r271, r260.x0zw, r271.0x00 iadd r269, r269, r271 iadd r269, r269, r259 mov r271, l36.xxxx mov r260, r271.xxxx iadd r260, r260.xyz0, r271.000x iadd r260, r260.xy0w, r271.00x0 iadd r271, r260.x0zw, r271.0x00 iadd r269, r269, r271 mov r260, l12.xxxx mov r261, r260.xxxx iadd r261, r261.xyz0, r260.000x iadd r261, r261.xy0w, r260.00x0 iadd r260, r261.x0zw, r260.0x00 mov r261, l37.xxxx mov r272, r261.xxxx iadd r272, r272.xyz0, r261.000x iadd r272, r272.xy0w, r261.00x0 iadd r261, r272.x0zw, r261.0x00 bitalign r261, r261, r261, r260 
bitalign r272, r275, r275, r255 bitalign r275, r275, r275, r266 bitalign r276, r269, r269, r262 bitalign r268, r268, r268, r260 ixor r277, r268, r261 iand r277, r277, r269 ixor r277, r277, r261 iand r272, r272, r265 iand r275, r275, r264 ior r275, r272, r275 mov r272, l38.xxxx mov r278, r272.xxxx iadd r278, r278.xyz0, r272.000x iadd r278, r278.xy0w, r272.00x0 iadd r272, r278.x0zw, r272.0x00 iadd r276, r276, r272 mov r278, l39.xxxx mov r279, r278.xxxx iadd r279, r279.xyz0, r278.000x iadd r279, r279.xy0w, r278.00x0 iadd r278, r279.x0zw, r278.0x00 ior r279, r261, r278 iadd r276, r276, r279 iadd r276, r276, r275 iadd r276, r276, r271 bitalign r279, r274, r274, r255 bitalign r274, r274, r274, r266 bitalign r280, r276, r276, r262 iadd r280, r280, r278 iadd r277, r280, r277 iand r279, r279, r265 iand r274, r274, r264 ior r274, r279, r274 iadd r277, r277, r274 iadd r277, r277, r271 bitalign r269, r269, r269, r260 bitalign r279, r273, r273, r255 bitalign r273, r273, r273, r266 bitalign r280, r277, r277, r262 bitalign r281, r276, r276, r260 ixor r282, r281, r269 iand r282, r282, r277 ixor r282, r282, r269 iadd r261, r261, r280 ixor r280, r269, r268 iand r276, r280, r276 ixor r276, r276, r268 iadd r261, r261, r276 iand r276, r279, r265 iand r273, r273, r264 ior r273, r276, r273 iadd r261, r261, r273 iadd r261, r261, r271 bitalign r276, r261, r261, r262 iadd r268, r268, r276 iadd r268, r268, r282 iadd r268, r268, r271 bitalign r276, r277, r277, r260 bitalign r277, r268, r268, r262 bitalign r279, r261, r261, r260 ixor r280, r279, r276 iand r280, r280, r268 ixor r280, r280, r276 ixor r282, r276, r281 iand r261, r282, r261 ixor r261, r261, r281 iadd r269, r269, r277 iadd r269, r269, r261 iadd r269, r269, r271 bitalign r261, r269, r269, r262 iadd r261, r281, r261 iadd r261, r261, r280 iadd r261, r261, r271 bitalign r268, r268, r268, r260 bitalign r277, r261, r261, r262 bitalign r280, r269, r269, r260 ixor r281, r280, r268 iand r281, r281, r261 ixor r281, r281, r268 iadd r276, 
r276, r277 ixor r277, r268, r279 iand r269, r277, r269 ixor r269, r269, r279 iadd r269, r276, r269 iadd r269, r269, r271 bitalign r276, r269, r269, r262 iadd r276, r279, r276 iadd r276, r276, r281 iadd r276, r276, r271 bitalign r261, r261, r261, r260 bitalign r277, r276, r276, r262 bitalign r279, r269, r269, r260 ixor r281, r279, r261 iand r281, r281, r276 ixor r281, r281, r261 ixor r282, r261, r280 iand r269, r282, r269 ixor r269, r269, r280 iadd r268, r268, r277 iadd r268, r268, r269 iadd r268, r268, r271 bitalign r269, r268, r268, r262 iadd r269, r280, r269 iadd r269, r269, r281 iadd r269, r269, r271 bitalign r276, r276, r276, r260 bitalign r277, r269, r269, r262 bitalign r280, r268, r268, r260 ixor r281, r280, r276 iand r281, r281, r269 ixor r281, r281, r276 iadd r261, r261, r277 ixor r277, r276, r279 iand r268, r277, r268 ixor r268, r268, r279 iadd r268, r261, r268 iadd r268, r268, r271 bitalign r261, r268, r268, r262 iadd r261, r279, r261 iadd r261, r261, r281 iadd r261, r261, r271 bitalign r269, r269, r269, r260 bitalign r277, r261, r261, r262 bitalign r279, r268, r268, r260 ixor r281, r279, r269 iand r281, r281, r261 ixor r281, r281, r269 ixor r282, r269, r280 iand r268, r282, r268 ixor r268, r268, r280 iadd r276, r276, r277 iadd r268, r276, r268 iadd r268, r268, r271 bitalign r276, r268, r268, r262 iadd r276, r280, r276 iadd r276, r276, r281 iadd r276, r276, r271 bitalign r261, r261, r261, r260 bitalign r277, r276, r276, r262 bitalign r280, r268, r268, r260 ixor r281, r280, r261 iand r281, r281, r276 ixor r281, r281, r261 iadd r269, r269, r277 ixor r277, r261, r279 iand r268, r277, r268 ixor r268, r268, r279 iadd r268, r269, r268 iadd r268, r268, r267 iadd r268, r268, r271 ixor r259, r274, r259 mov r269, l40.xxxx mov r277, r269.xxxx iadd r277, r277.xyz0, r269.000x iadd r277, r277.xy0w, r269.00x0 iadd r277, r277.x0zw, r269.0x00 bitalign r259, r259, r259, r277 bitalign r282, r268, r268, r262 iadd r279, r279, r282 iadd r279, r279, r281 iadd r279, r279, r259 
iadd r279, r279, r271 bitalign r276, r276, r276, r260 ixor r275, r273, r275 bitalign r275, r275, r275, r277 bitalign r281, r279, r279, r262 bitalign r282, r268, r268, r260 ixor r284, r282, r276 iand r284, r284, r279 ixor r284, r284, r276 ixor r285, r276, r280 iand r268, r285, r268 ixor r268, r268, r280 iadd r261, r261, r281 iadd r268, r261, r268 iadd r268, r268, r275 iadd r268, r268, r271 ixor r274, r267, r274 bitalign r274, r274, r274, r277 bitalign r261, r268, r268, r262 iadd r261, r280, r261 iadd r261, r261, r284 iadd r261, r261, r274 iadd r261, r261, r271 bitalign r279, r279, r279, r260 ixor r273, r259, r273 bitalign r273, r273, r273, r277 bitalign r280, r261, r261, r262 iadd r276, r276, r280 ixor r280, r279, r282 iand r280, r280, r268 ixor r280, r280, r282 iadd r276, r276, r280 iadd r276, r276, r273 iadd r271, r276, r271 bitalign r268, r268, r268, r260 bitalign r276, r275, r275, r277 bitalign r280, r271, r271, r262 bitalign r281, r261, r261, r260 ixor r284, r271, r281 ixor r284, r284, r268 ixor r261, r261, r268 ixor r261, r261, r279 iadd r280, r282, r280 iadd r261, r280, r261 iadd r261, r261, r276 mov r280, l41.xxxx mov r282, r280.xxxx iadd r282, r282.xyz0, r280.000x iadd r282, r282.xy0w, r280.00x0 iadd r280, r282.x0zw, r280.0x00 iadd r261, r261, r280 bitalign r282, r274, r274, r277 bitalign r285, r261, r261, r262 iadd r279, r279, r285 iadd r279, r279, r284 iadd r279, r279, r282 iadd r279, r279, r280 bitalign r271, r271, r271, r260 bitalign r284, r273, r273, r277 bitalign r285, r279, r279, r262 iadd r268, r268, r285 ixor r285, r261, r271 ixor r285, r285, r281 iadd r268, r268, r285 iadd r268, r268, r284 iadd r268, r268, r280 bitalign r261, r261, r261, r260 ixor r285, r276, r267 bitalign r285, r285, r285, r277 bitalign r286, r268, r268, r262 bitalign r287, r279, r279, r260 ixor r288, r268, r287 ixor r288, r288, r261 ixor r279, r279, r261 ixor r279, r279, r271 iadd r281, r281, r286 iadd r279, r281, r279 iadd r279, r279, r285 iadd r279, r279, r280 ixor r281, r282, 
r259 bitalign r281, r281, r281, r277 bitalign r286, r279, r279, r262 iadd r271, r271, r286 iadd r271, r271, r288 iadd r271, r271, r281 iadd r271, r271, r280 bitalign r268, r268, r268, r260 ixor r286, r284, r275 bitalign r286, r286, r286, r277 bitalign r288, r271, r271, r262 iadd r261, r261, r288 ixor r288, r279, r268 ixor r288, r288, r287 iadd r261, r261, r288 iadd r261, r261, r286 iadd r261, r261, r280 bitalign r279, r279, r279, r260 ixor r288, r285, r274 bitalign r288, r288, r288, r277 bitalign r289, r261, r261, r262 bitalign r290, r271, r271, r260 ixor r291, r261, r290 ixor r291, r291, r279 ixor r271, r271, r279 ixor r271, r271, r268 iadd r287, r287, r289 iadd r271, r287, r271 iadd r271, r271, r288 iadd r271, r271, r280 ixor r287, r281, r273 bitalign r287, r287, r287, r277 bitalign r289, r271, r271, r262 iadd r268, r268, r289 iadd r268, r268, r291 iadd r268, r268, r287 iadd r268, r268, r280 bitalign r261, r261, r261, r260 ixor r289, r286, r276 bitalign r289, r289, r289, r277 bitalign r291, r268, r268, r262 iadd r279, r279, r291 ixor r291, r271, r261 ixor r291, r291, r290 iadd r279, r279, r291 iadd r279, r279, r289 iadd r279, r279, r280 bitalign r271, r271, r271, r260 ixor r291, r288, r282 ixor r291, r291, r267 bitalign r291, r291, r291, r277 bitalign r292, r279, r279, r262 bitalign r293, r268, r268, r260 ixor r294, r279, r293 ixor r294, r294, r271 ixor r268, r268, r271 ixor r268, r268, r261 iadd r290, r290, r292 iadd r268, r290, r268 iadd r268, r268, r291 iadd r268, r268, r280 ixor r290, r287, r284 ixor r290, r290, r259 bitalign r290, r290, r290, r277 bitalign r292, r268, r268, r262 iadd r261, r261, r292 iadd r261, r261, r294 iadd r261, r261, r290 iadd r261, r261, r280 bitalign r279, r279, r279, r260 ixor r292, r289, r285 ixor r292, r292, r275 ixor r267, r292, r267 bitalign r267, r267, r267, r277 bitalign r292, r261, r261, r262 iadd r271, r271, r292 ixor r292, r268, r279 ixor r292, r292, r293 iadd r271, r271, r292 iadd r271, r271, r267 iadd r271, r271, r280 
bitalign r268, r268, r268, r260 ixor r292, r291, r281 ixor r292, r292, r274 ixor r259, r292, r259 bitalign r259, r259, r259, r277 bitalign r292, r271, r271, r262 bitalign r294, r261, r261, r260 ixor r295, r271, r294 ixor r295, r295, r268 ixor r261, r261, r268 ixor r261, r261, r279 iadd r292, r293, r292 iadd r261, r292, r261 iadd r261, r261, r259 iadd r261, r261, r280 ixor r292, r290, r286 ixor r292, r292, r273 ixor r275, r292, r275 bitalign r275, r275, r275, r277 bitalign r292, r261, r261, r262 iadd r279, r279, r292 iadd r279, r279, r295 iadd r279, r279, r275 iadd r279, r279, r280 bitalign r271, r271, r271, r260 ixor r292, r267, r288 ixor r292, r292, r276 ixor r274, r292, r274 bitalign r274, r274, r274, r277 bitalign r292, r279, r279, r262 iadd r268, r268, r292 ixor r292, r261, r271 ixor r292, r292, r294 iadd r268, r268, r292 iadd r268, r268, r274 iadd r268, r268, r280 bitalign r261, r261, r261, r260 ixor r292, r259, r287 ixor r292, r292, r282 ixor r273, r292, r273 bitalign r273, r273, r273, r277 bitalign r292, r268, r268, r262 bitalign r293, r279, r279, r260 ixor r295, r268, r293 ixor r295, r295, r261 ixor r279, r279, r261 ixor r279, r279, r271 iadd r292, r294, r292 iadd r279, r292, r279 iadd r279, r279, r273 iadd r279, r279, r280 ixor r292, r275, r289 ixor r292, r292, r284 ixor r276, r292, r276 bitalign r276, r276, r276, r277 bitalign r292, r279, r279, r262 iadd r271, r271, r292 iadd r271, r271, r295 iadd r271, r271, r276 iadd r271, r271, r280 bitalign r268, r268, r268, r260 ixor r292, r274, r291 ixor r292, r292, r285 ixor r282, r292, r282 bitalign r282, r282, r282, r277 bitalign r292, r271, r271, r262 iadd r261, r261, r292 ixor r292, r279, r268 ixor r292, r292, r293 iadd r261, r261, r292 iadd r261, r261, r282 iadd r261, r261, r280 bitalign r279, r279, r279, r260 ixor r292, r273, r290 ixor r292, r292, r281 ixor r284, r292, r284 bitalign r284, r284, r284, r277 bitalign r292, r261, r261, r262 bitalign r294, r271, r271, r260 ixor r295, r261, r294 ixor r295, r295, 
r279 ixor r271, r271, r279 ixor r271, r271, r268 iadd r292, r293, r292 iadd r271, r292, r271 iadd r271, r271, r284 iadd r271, r271, r280 ixor r292, r276, r267 ixor r292, r292, r286 ixor r285, r292, r285 bitalign r285, r285, r285, r277 bitalign r292, r271, r271, r262 iadd r268, r268, r292 iadd r268, r268, r295 iadd r268, r268, r285 iadd r268, r268, r280 bitalign r261, r261, r261, r260 ixor r280, r282, r259 ixor r280, r280, r288 ixor r280, r280, r281 bitalign r280, r280, r280, r277 bitalign r281, r268, r268, r262 bitalign r292, r271, r271, r260 iand r293, r268, r292 ior r295, r268, r292 iand r295, r295, r261 ior r293, r293, r295 iand r295, r271, r261 ior r271, r271, r261 iand r271, r271, r294 ior r271, r295, r271 iadd r279, r279, r281 iadd r271, r279, r271 iadd r271, r271, r280 mov r279, l42.xxxx mov r281, r279.xxxx iadd r281, r281.xyz0, r279.000x iadd r281, r281.xy0w, r279.00x0 iadd r279, r281.x0zw, r279.0x00 iadd r271, r271, r279 ixor r281, r284, r275 ixor r281, r281, r287 ixor r281, r281, r286 bitalign r281, r281, r281, r277 bitalign r286, r271, r271, r262 iadd r286, r294, r286 iadd r286, r286, r293 iadd r286, r286, r281 iadd r286, r286, r279 bitalign r268, r268, r268, r260 ixor r293, r285, r274 ixor r293, r293, r289 ixor r288, r293, r288 bitalign r288, r288, r288, r277 bitalign r293, r286, r286, r262 bitalign r294, r271, r271, r260 iand r295, r286, r294 ior r296, r286, r294 iand r296, r296, r268 ior r295, r295, r296 iand r296, r271, r268 ior r271, r271, r268 iand r271, r271, r292 ior r271, r296, r271 iadd r261, r261, r293 iadd r271, r261, r271 iadd r271, r271, r288 iadd r271, r271, r279 ixor r261, r280, r273 ixor r261, r261, r291 ixor r261, r261, r287 bitalign r261, r261, r261, r277 bitalign r287, r271, r271, r262 iadd r287, r292, r287 iadd r287, r287, r295 iadd r287, r287, r261 iadd r287, r287, r279 bitalign r286, r286, r286, r260 ixor r292, r281, r276 ixor r292, r292, r290 ixor r289, r292, r289 bitalign r289, r289, r289, r277 bitalign r292, r287, r287, r262 
bitalign r293, r271, r271, r260 iand r295, r287, r293 ior r296, r287, r293 iand r296, r296, r286 ior r295, r295, r296 iand r296, r271, r286 ior r271, r271, r286 iand r271, r271, r294 ior r271, r296, r271 iadd r268, r268, r292 iadd r268, r268, r271 iadd r268, r268, r289 iadd r268, r268, r279 ixor r271, r288, r282 ixor r271, r271, r267 ixor r271, r271, r291 bitalign r271, r271, r271, r277 bitalign r291, r268, r268, r262 iadd r291, r294, r291 iadd r291, r291, r295 iadd r291, r291, r271 iadd r291, r291, r279 bitalign r287, r287, r287, r260 ixor r292, r261, r284 ixor r292, r292, r259 ixor r290, r292, r290 bitalign r290, r290, r290, r277 bitalign r292, r291, r291, r262 bitalign r294, r268, r268, r260 iand r295, r291, r294 ior r296, r291, r294 iand r296, r296, r287 ior r295, r295, r296 iand r296, r268, r287 ior r268, r268, r287 iand r268, r268, r293 ior r268, r296, r268 iadd r286, r286, r292 iadd r268, r286, r268 iadd r268, r268, r290 iadd r268, r268, r279 ixor r286, r289, r285 ixor r286, r286, r275 ixor r267, r286, r267 bitalign r267, r267, r267, r277 bitalign r286, r268, r268, r262 iadd r286, r293, r286 iadd r286, r286, r295 iadd r286, r286, r267 iadd r286, r286, r279 bitalign r291, r291, r291, r260 ixor r292, r271, r280 ixor r292, r292, r274 ixor r259, r292, r259 bitalign r259, r259, r259, r277 bitalign r292, r286, r286, r262 bitalign r293, r268, r268, r260 iand r295, r286, r293 ior r296, r286, r293 iand r296, r296, r291 ior r295, r295, r296 iand r296, r268, r291 ior r268, r268, r291 iand r268, r268, r294 ior r268, r296, r268 iadd r287, r287, r292 iadd r268, r287, r268 iadd r268, r268, r259 iadd r268, r268, r279 ixor r287, r290, r281 ixor r287, r287, r273 ixor r275, r287, r275 bitalign r275, r275, r275, r277 bitalign r287, r268, r268, r262 iadd r287, r294, r287 iadd r287, r287, r295 iadd r287, r287, r275 iadd r287, r287, r279 bitalign r286, r286, r286, r260 ixor r292, r267, r288 ixor r292, r292, r276 ixor r274, r292, r274 bitalign r274, r274, r274, r277 bitalign r292, 
r287, r287, r262 bitalign r294, r268, r268, r260 iand r295, r287, r294 ior r296, r287, r294 iand r296, r296, r286 ior r295, r295, r296 iand r296, r268, r286 ior r268, r268, r286 iand r268, r268, r293 ior r268, r296, r268 iadd r291, r291, r292 iadd r268, r291, r268 iadd r268, r268, r274 iadd r268, r268, r279 ixor r291, r259, r261 ixor r291, r291, r282 ixor r273, r291, r273 bitalign r273, r273, r273, r277 bitalign r291, r268, r268, r262 iadd r291, r293, r291 iadd r291, r291, r295 iadd r291, r291, r273 iadd r291, r291, r279 bitalign r287, r287, r287, r260 ixor r292, r275, r289 ixor r292, r292, r284 ixor r276, r292, r276 bitalign r276, r276, r276, r277 bitalign r292, r291, r291, r262 bitalign r293, r268, r268, r260 iand r295, r291, r293 ior r296, r291, r293 iand r296, r296, r287 ior r295, r295, r296 iand r296, r268, r287 ior r268, r268, r287 iand r268, r268, r294 ior r268, r296, r268 iadd r286, r286, r292 iadd r268, r286, r268 iadd r268, r268, r276 iadd r268, r268, r279 ixor r286, r274, r271 ixor r286, r286, r285 ixor r282, r286, r282 bitalign r282, r282, r282, r277 bitalign r286, r268, r268, r262 iadd r286, r294, r286 iadd r286, r286, r295 iadd r286, r286, r282 iadd r286, r286, r279 bitalign r291, r291, r291, r260 ixor r292, r273, r290 ixor r292, r292, r280 ixor r284, r292, r284 bitalign r284, r284, r284, r277 bitalign r292, r286, r286, r262 bitalign r294, r268, r268, r260 iand r295, r286, r294 ior r296, r286, r294 iand r296, r296, r291 ior r295, r295, r296 iand r296, r268, r291 ior r268, r268, r291 iand r268, r268, r293 ior r268, r296, r268 iadd r287, r287, r292 iadd r268, r287, r268 iadd r268, r268, r284 iadd r268, r268, r279 ixor r287, r276, r267 ixor r287, r287, r281 ixor r285, r287, r285 bitalign r285, r285, r285, r277 bitalign r287, r268, r268, r262 iadd r287, r293, r287 iadd r287, r287, r295 iadd r287, r287, r285 iadd r287, r287, r279 bitalign r286, r286, r286, r260 ixor r292, r282, r259 ixor r292, r292, r288 ixor r280, r292, r280 bitalign r280, r280, r280, 
r277 bitalign r292, r287, r287, r262 bitalign r293, r268, r268, r260 iand r295, r287, r293 ior r296, r287, r293 iand r296, r296, r286 ior r295, r295, r296 iand r296, r268, r286 ior r268, r268, r286 iand r268, r268, r294 ior r268, r296, r268 iadd r291, r291, r292 iadd r268, r291, r268 iadd r268, r268, r280 iadd r268, r268, r279 ixor r291, r284, r275 ixor r291, r291, r261 ixor r281, r291, r281 bitalign r281, r281, r281, r277 bitalign r291, r268, r268, r262 iadd r291, r294, r291 iadd r291, r291, r295 iadd r291, r291, r281 iadd r291, r291, r279 bitalign r287, r287, r287, r260 ixor r292, r285, r274 ixor r292, r292, r289 ixor r288, r292, r288 bitalign r288, r288, r288, r277 bitalign r292, r291, r291, r262 bitalign r294, r268, r268, r260 iand r295, r291, r294 ior r296, r291, r294 iand r296, r296, r287 ior r295, r295, r296 iand r296, r268, r287 ior r268, r268, r287 iand r268, r268, r293 ior r268, r296, r268 iadd r286, r286, r292 iadd r268, r286, r268 iadd r268, r268, r288 iadd r268, r268, r279 ixor r286, r280, r273 ixor r286, r286, r271 ixor r261, r286, r261 bitalign r261, r261, r261, r277 bitalign r286, r268, r268, r262 iadd r286, r293, r286 iadd r286, r286, r295 iadd r286, r286, r261 iadd r279, r286, r279 bitalign r286, r291, r291, r260 ixor r291, r281, r276 ixor r291, r291, r290 ixor r289, r291, r289 bitalign r289, r289, r289, r277 bitalign r291, r279, r279, r262 iadd r287, r287, r291 ixor r291, r268, r286 ixor r291, r291, r294 iadd r287, r287, r291 iadd r287, r287, r289 mov r291, l43.xxxx mov r292, r291.xxxx iadd r292, r292.xyz0, r291.000x iadd r292, r292.xy0w, r291.00x0 iadd r291, r292.x0zw, r291.0x00 iadd r287, r287, r291 bitalign r268, r268, r268, r260 ixor r292, r288, r282 ixor r292, r292, r267 ixor r271, r292, r271 bitalign r271, r271, r271, r277 bitalign r292, r287, r287, r262 bitalign r293, r279, r279, r260 ixor r295, r287, r293 ixor r295, r295, r268 ixor r279, r279, r268 ixor r279, r279, r286 iadd r292, r294, r292 iadd r279, r292, r279 iadd r279, r279, r271 
iadd r279, r279, r291 ixor r292, r261, r284 ixor r292, r292, r259 ixor r290, r292, r290 bitalign r290, r290, r290, r277 bitalign r292, r279, r279, r262 iadd r286, r286, r292 iadd r286, r286, r295 iadd r286, r286, r290 iadd r286, r286, r291 bitalign r287, r287, r287, r260 ixor r292, r289, r285 ixor r292, r292, r275 ixor r267, r292, r267 bitalign r267, r267, r267, r277 bitalign r292, r286, r286, r262 iadd r268, r268, r292 ixor r292, r279, r287 ixor r292, r292, r293 iadd r268, r268, r292 iadd r268, r268, r267 iadd r268, r268, r291 bitalign r279, r279, r279, r260 ixor r292, r271, r280 ixor r292, r292, r274 ixor r259, r292, r259 bitalign r259, r259, r259, r277 bitalign r292, r268, r268, r262 bitalign r294, r286, r286, r260 ixor r295, r268, r294 ixor r295, r295, r279 ixor r286, r286, r279 ixor r286, r286, r287 iadd r292, r293, r292 iadd r286, r292, r286 iadd r286, r286, r259 iadd r286, r286, r291 ixor r292, r290, r281 ixor r292, r292, r273 ixor r275, r292, r275 bitalign r275, r275, r275, r277 bitalign r292, r286, r286, r262 iadd r287, r287, r292 iadd r287, r287, r295 iadd r287, r287, r275 iadd r287, r287, r291 bitalign r268, r268, r268, r260 ixor r292, r267, r288 ixor r292, r292, r276 ixor r274, r292, r274 bitalign r274, r274, r274, r277 bitalign r292, r287, r287, r262 iadd r279, r279, r292 ixor r292, r286, r268 ixor r292, r292, r294 iadd r279, r279, r292 iadd r279, r279, r274 iadd r279, r279, r291 bitalign r286, r286, r286, r260 ixor r292, r259, r261 ixor r292, r292, r282 ixor r273, r292, r273 bitalign r273, r273, r273, r277 bitalign r292, r279, r279, r262 bitalign r293, r287, r287, r260 ixor r295, r279, r293 ixor r295, r295, r286 ixor r287, r287, r286 ixor r287, r287, r268 iadd r292, r294, r292 iadd r287, r292, r287 iadd r287, r287, r273 iadd r287, r287, r291 ixor r292, r275, r289 ixor r292, r292, r284 ixor r276, r292, r276 bitalign r276, r276, r276, r277 bitalign r292, r287, r287, r262 iadd r268, r268, r292 iadd r268, r268, r295 iadd r268, r268, r276 iadd r268, r268, 
r291 bitalign r279, r279, r279, r260 ixor r292, r274, r271 ixor r292, r292, r285 ixor r282, r292, r282 bitalign r282, r282, r282, r277 bitalign r292, r268, r268, r262 iadd r286, r286, r292 ixor r292, r287, r279 ixor r292, r292, r293 iadd r286, r286, r292 iadd r286, r286, r282 iadd r286, r286, r291 bitalign r287, r287, r287, r260 ixor r292, r273, r290 ixor r292, r292, r280 ixor r284, r292, r284 bitalign r284, r284, r284, r277 bitalign r292, r286, r286, r262 bitalign r294, r268, r268, r260 ixor r295, r286, r294 ixor r295, r295, r287 ixor r268, r268, r287 ixor r268, r268, r279 iadd r292, r293, r292 iadd r268, r292, r268 iadd r268, r268, r284 iadd r268, r268, r291 ixor r292, r276, r267 ixor r292, r292, r281 ixor r285, r292, r285 bitalign r285, r285, r285, r277 bitalign r292, r268, r268, r262 iadd r279, r279, r292 iadd r279, r279, r295 iadd r279, r279, r285 iadd r279, r279, r291 bitalign r286, r286, r286, r260 ixor r292, r282, r259 ixor r292, r292, r288 ixor r280, r292, r280 bitalign r280, r280, r280, r277 bitalign r292, r279, r279, r262 iadd r287, r287, r292 ixor r292, r268, r286 ixor r292, r292, r294 iadd r287, r287, r292 iadd r287, r287, r280 iadd r287, r287, r291 bitalign r268, r268, r268, r260 ixor r292, r284, r275 ixor r292, r292, r261 ixor r281, r292, r281 bitalign r281, r281, r281, r277 bitalign r292, r287, r287, r262 bitalign r293, r279, r279, r260 ixor r295, r287, r293 ixor r295, r295, r268 ixor r279, r279, r268 ixor r279, r279, r286 iadd r292, r294, r292 iadd r279, r292, r279 iadd r279, r279, r281 iadd r279, r279, r291 ixor r274, r285, r274 ixor r274, r274, r289 ixor r274, r274, r288 bitalign r274, r274, r274, r277 bitalign r288, r279, r279, r262 iadd r286, r286, r288 iadd r286, r286, r295 iadd r286, r286, r274 iadd r286, r286, r291 bitalign r287, r287, r287, r260 ixor r273, r280, r273 ixor r273, r273, r271 ixor r273, r273, r261 bitalign r273, r273, r273, r277 bitalign r261, r286, r286, r262 iadd r268, r268, r261 ixor r261, r279, r287 ixor r261, r261, r293 
iadd r268, r268, r261 iadd r268, r268, r273 iadd r268, r268, r291 bitalign r261, r279, r279, r260 ixor r276, r281, r276 ixor r276, r276, r290 ixor r276, r276, r289 bitalign r276, r276, r276, r277 bitalign r279, r268, r268, r262 bitalign r280, r286, r286, r260 ixor r281, r268, r280 ixor r281, r281, r261 ixor r286, r286, r261 ixor r286, r286, r287 iadd r279, r293, r279 iadd r279, r279, r286 iadd r279, r279, r276 iadd r279, r279, r291 ixor r274, r274, r282 ixor r274, r274, r267 ixor r271, r274, r271 bitalign r271, r271, r271, r277 bitalign r274, r279, r279, r262 iadd r274, r287, r274 iadd r274, r274, r281 iadd r271, r274, r271 iadd r271, r271, r291 bitalign r268, r268, r268, r260 ixor r273, r273, r284 ixor r259, r273, r259 ixor r259, r259, r290 bitalign r259, r259, r259, r277 bitalign r273, r271, r271, r262 bitalign r274, r279, r279, r260 ixor r281, r271, r274 ixor r281, r281, r268 iadd r273, r261, r273 ixor r261, r279, r268 ixor r261, r261, r280 iadd r273, r273, r261 iadd r259, r273, r259 iadd r273, r259, r291 ixor r261, r276, r285 ixor r275, r261, r275 ixor r267, r275, r267 bitalign r267, r267, r267, r277 bitalign r262, r273, r273, r262 iadd r262, r280, r262 iadd r262, r262, r281 iadd r262, r262, r267 mov r267, l44.xxxx mov r273, r267.xxxx iadd r273, r273.xyz0, r267.000x iadd r273, r273.xy0w, r267.00x0 iadd r267, r273.x0zw, r267.0x00 iadd r262, r262, r267 bitalign r267, r271, r271, r260 bitalign r271, r262, r262, r255 bitalign r262, r262, r262, r266 mov r273, l45.xxxx mov r275, r273.xxxx iadd r275, r275.xyz0, r273.000x iadd r275, r275.xy0w, r273.00x0 iadd r273, r275.x0zw, r273.0x00 iadd r259, r259, r273 bitalign r273, r259, r259, r255 iand r273, r273, r265 bitalign r259, r259, r259, r266 iand r259, r259, r264 ior r259, r273, r259 mov r273, r259.x000 iand r275.x___, r273.xxxx, r269.xxxx mov r260, l31.xxxx ishl r275.x___, r260.xxxx, r275.xxxx iand r261.x___, r275.xxxx, r270.xxxx mov r276, l46.xxxx ine r261.x___, r261.xxxx, r276.xxxx mov r261.x___, r261.xxxx iadd r267, 
r267, r278 bitalign r277, r267, r267, r255 bitalign r267, r267, r267, r266 iadd r274, r274, r272 bitalign r272, r274, r274, r255 iand r272, r272, r265 bitalign r274, r274, r274, r266 iand r274, r274, r264 ior r274, r272, r274 mov r272, r274.x000 iand r278.x___, r272.xxxx, r269.xxxx ishl r278.x___, r260.xxxx, r278.xxxx iand r279.x___, r278.xxxx, r283.xxxx ine r279.x___, r279.xxxx, r276.xxxx mov r279.x___, r279.xxxx iand r261.x___, r261.xxxx, r279.xxxx iand r277, r277, r265 iand r267, r267, r264 ior r267, r277, r267 mov r277, r267.x000 iand r279.x___, r277.xxxx, r269.xxxx ishl r279.x___, r260.xxxx, r279.xxxx iand r280.x___, r279.xxxx, r254.xxxx ine r280.x___, r280.xxxx, r276.xxxx mov r280.x___, r280.xxxx iand r261.x___, r261.xxxx, r280.xxxx iand r261.x___, r261.xxxx, r260.xxxx iand r271, r271, r265 iand r262, r262, r264 ior r262, r271, r262 mov r271, r262.x000 mov r280, l47.xxxx mov r281, r280.xxxx iadd r281, r281.xyz0, r280.000x iadd r281, r281.xy0w, r280.00x0 iadd r280, r281.x0zw, r280.0x00 iadd r268, r268, r280 bitalign r255, r268, r268, r255 bitalign r266, r268, r268, r266 mov r268, l46.xxxx if_logicalnz r261.xxxx iand r261.x___, r271.xxxx, r269.xxxx ishl r261.x___, r260.xxxx, r261.xxxx mov r280, l48.xxxx iand r268.x___, r271.xxxx, r280.xxxx mov r281, l23.xxxx ushr r268.x___, r268.xxxx, r281.xxxx iadd r268.x___, r257.xxxx, r268.xxxx mov r1010.x___, r268.xxxx uav_raw_load_id(11) r1011.x___, r1010.xxxx mov r268.x___, r1011.xxxx iand r261.x___, r268.xxxx, r261.xxxx mov r268, l46.xxxx ieq r261.x___, r261.xxxx, r268.xxxx if_logicalnz r261.xxxx else iand r261.x___, r273.xxxx, r280.xxxx ushr r261.x___, r261.xxxx, r281.xxxx iadd r261.x___, r261.xxxx, r257.xxxx mov r268, l49.xxxx iadd r261.x___, r261.xxxx, r268.xxxx mov r1010.x___, r261.xxxx uav_raw_load_id(11) r1011.x___, r1010.xxxx mov r261.x___, r1011.xxxx iand r261.x___, r261.xxxx, r275.xxxx mov r268, l46.xxxx ieq r261.x___, r261.xxxx, r268.xxxx if_logicalnz r261.xxxx else mov r261, l48.xxxx iand r268.x___, r277.xxxx, 
r261.xxxx mov r275, l23.xxxx ushr r268.x___, r268.xxxx, r275.xxxx iadd r268.x___, r268.xxxx, r257.xxxx mov r280, l50.xxxx iadd r268.x___, r268.xxxx, r280.xxxx mov r1010.x___, r268.xxxx uav_raw_load_id(11) r1011.x___, r1010.xxxx mov r268.x___, r1011.xxxx iand r280.x___, r268.xxxx, r279.xxxx mov r268, l46.xxxx ieq r280.x___, r280.xxxx, r268.xxxx if_logicalnz r280.xxxx else iand r261.x___, r272.xxxx, r261.xxxx ushr r261.x___, r261.xxxx, r275.xxxx iadd r261.x___, r261.xxxx, r257.xxxx mov r268, l51.xxxx iadd r261.x___, r261.xxxx, r268.xxxx mov r1010.x___, r261.xxxx uav_raw_load_id(11) r1011.x___, r1010.xxxx mov r261.x___, r1011.xxxx iand r261.x___, r261.xxxx, r278.xxxx mov r268, l46.xxxx ieq r261.x___, r261.xxxx, r268.xxxx if_logicalnz r261.xxxx else mov r268, l31.xxxx endif endif endif endif else endif mov r275, r274.y000 iand r278.x___, r275.xxxx, r269.xxxx ishl r278.x___, r260.xxxx, r278.xxxx iand r279.x___, r278.xxxx, r283.xxxx ine r279.x___, r279.xxxx, r276.xxxx mov r279.x___, r279.xxxx mov r261, r259.y000 iand r280.x___, r261.xxxx, r269.xxxx ishl r280.x___, r260.xxxx, r280.xxxx iand r281.x___, r280.xxxx, r270.xxxx ine r281.x___, r281.xxxx, r276.xxxx mov r281.x___, r281.xxxx iand r279.x___, r281.xxxx, r279.xxxx mov r281, r267.y000 iand r269.x___, r281.xxxx, r269.xxxx ishl r269.x___, r260.xxxx, r269.xxxx iand r282.x___, r269.xxxx, r254.xxxx ine r276.x___, r282.xxxx, r276.xxxx mov r276.x___, r276.xxxx iand r276.x___, r279.xxxx, r276.xxxx iand r260.x___, r276.xxxx, r260.xxxx mov r276, r262.y000 if_logicalnz r260.xxxx mov r260, l40.xxxx iand r260.x___, r276.xxxx, r260.xxxx mov r282, l31.xxxx ishl r260.x___, r282.xxxx, r260.xxxx mov r282, l48.xxxx iand r284.x___, r276.xxxx, r282.xxxx mov r285, l23.xxxx ushr r284.x___, r284.xxxx, r285.xxxx iadd r284.x___, r257.xxxx, r284.xxxx mov r1010.x___, r284.xxxx uav_raw_load_id(11) r1011.x___, r1010.xxxx mov r284.x___, r1011.xxxx iand r260.x___, r284.xxxx, r260.xxxx mov r284, l46.xxxx ieq r260.x___, r260.xxxx, r284.xxxx 
if_logicalnz r260.xxxx else iand r260.x___, r261.xxxx, r282.xxxx ushr r260.x___, r260.xxxx, r285.xxxx iadd r260.x___, r260.xxxx, r257.xxxx mov r282, l49.xxxx iadd r260.x___, r260.xxxx, r282.xxxx mov r1010.x___, r260.xxxx uav_raw_load_id(11) r1011.x___, r1010.xxxx mov r260.x___, r1011.xxxx iand r260.x___, r260.xxxx, r280.xxxx ieq r260.x___, r260.xxxx, r284.xxxx if_logicalnz r260.xxxx else mov r260, l48.xxxx iand r280.x___, r281.xxxx, r260.xxxx mov r282, l23.xxxx ushr r280.x___, r280.xxxx, r282.xxxx iadd r280.x___, r280.xxxx, r257.xxxx mov r284, l50.xxxx iadd r280.x___, r280.xxxx, r284.xxxx mov r1010.x___, r280.xxxx uav_raw_load_id(11) r1011.x___, r1010.xxxx mov r280.x___, r1011.xxxx iand r269.x___, r280.xxxx, r269.xxxx mov r280, l46.xxxx ieq r269.x___, r269.xxxx, r280.xxxx if_logicalnz r269.xxxx else iand r260.x___, r275.xxxx, r260.xxxx ushr r260.x___, r260.xxxx, r282.xxxx iadd r260.x___, r260.xxxx, r257.xxxx mov r269, l51.xxxx iadd r260.x___, r260.xxxx, r269.xxxx mov r1010.x___, r260.xxxx uav_raw_load_id(11) r1011.x___, r1010.xxxx mov r260.x___, r1011.xxxx iand r260.x___, r260.xxxx, r278.xxxx ieq r260.x___, r260.xxxx, r280.xxxx if_logicalnz r260.xxxx else mov r268, l31.xxxx endif endif endif endif else endif mov r269, r274.z000 mov r278, l40.xxxx iand r280.x___, r269.xxxx, r278.xxxx mov r260, l31.xxxx ishl r280.x___, r260.xxxx, r280.xxxx iand r279.x___, r280.xxxx, r283.xxxx mov r282, l46.xxxx ine r279.x___, r279.xxxx, r282.xxxx mov r279.x___, r279.xxxx mov r284, r259.z000 iand r285.x___, r284.xxxx, r278.xxxx ishl r285.x___, r260.xxxx, r285.xxxx iand r286.x___, r285.xxxx, r270.xxxx ine r286.x___, r286.xxxx, r282.xxxx mov r286.x___, r286.xxxx iand r279.x___, r286.xxxx, r279.xxxx mov r286, r267.z000 iand r287.x___, r286.xxxx, r278.xxxx ishl r287.x___, r260.xxxx, r287.xxxx iand r288.x___, r287.xxxx, r254.xxxx ine r288.x___, r288.xxxx, r282.xxxx mov r288.x___, r288.xxxx iand r279.x___, r279.xxxx, r288.xxxx iand r279.x___, r279.xxxx, r260.xxxx mov r288, r262.z000 
if_logicalnz r279.xxxx iand r279.x___, r288.xxxx, r278.xxxx ishl r279.x___, r260.xxxx, r279.xxxx mov r289, l48.xxxx iand r290.x___, r288.xxxx, r289.xxxx mov r291, l23.xxxx ushr r290.x___, r290.xxxx, r291.xxxx iadd r290.x___, r257.xxxx, r290.xxxx mov r1010.x___, r290.xxxx uav_raw_load_id(11) r1011.x___, r1010.xxxx mov r290.x___, r1011.xxxx iand r279.x___, r290.xxxx, r279.xxxx ieq r279.x___, r279.xxxx, r282.xxxx if_logicalnz r279.xxxx else iand r279.x___, r284.xxxx, r289.xxxx ushr r279.x___, r279.xxxx, r291.xxxx iadd r279.x___, r279.xxxx, r257.xxxx mov r289, l49.xxxx iadd r279.x___, r279.xxxx, r289.xxxx mov r1010.x___, r279.xxxx uav_raw_load_id(11) r1011.x___, r1010.xxxx mov r279.x___, r1011.xxxx iand r279.x___, r279.xxxx, r285.xxxx mov r285, l46.xxxx ieq r279.x___, r279.xxxx, r285.xxxx if_logicalnz r279.xxxx else mov r279, l48.xxxx iand r289.x___, r286.xxxx, r279.xxxx mov r290, l23.xxxx ushr r289.x___, r289.xxxx, r290.xxxx iadd r289.x___, r289.xxxx, r257.xxxx mov r291, l50.xxxx iadd r289.x___, r289.xxxx, r291.xxxx mov r1010.x___, r289.xxxx uav_raw_load_id(11) r1011.x___, r1010.xxxx mov r289.x___, r1011.xxxx iand r287.x___, r289.xxxx, r287.xxxx ieq r285.x___, r287.xxxx, r285.xxxx if_logicalnz r285.xxxx else iand r279.x___, r269.xxxx, r279.xxxx ushr r279.x___, r279.xxxx, r290.xxxx iadd r279.x___, r279.xxxx, r257.xxxx mov r285, l51.xxxx iadd r279.x___, r279.xxxx, r285.xxxx mov r1010.x___, r279.xxxx uav_raw_load_id(11) r1011.x___, r1010.xxxx mov r279.x___, r1011.xxxx iand r279.x___, r279.xxxx, r280.xxxx mov r280, l46.xxxx ieq r279.x___, r279.xxxx, r280.xxxx if_logicalnz r279.xxxx else mov r268, l31.xxxx endif endif endif endif else endif iand r255, r255, r265 iand r264, r266, r264 ior r255, r255, r264 mov r264, r274.w000 iand r265.x___, r264.xxxx, r278.xxxx ishl r265.x___, r260.xxxx, r265.xxxx iand r266.x___, r265.xxxx, r283.xxxx ine r266.x___, r266.xxxx, r282.xxxx mov r266.x___, r266.xxxx mov r259, r259.w000 iand r274.x___, r259.xxxx, r278.xxxx ishl r274.x___, 
r260.xxxx, r274.xxxx iand r270.x___, r274.xxxx, r270.xxxx ine r270.x___, r270.xxxx, r282.xxxx mov r270.x___, r270.xxxx iand r266.x___, r270.xxxx, r266.xxxx mov r267, r267.w000 iand r270.x___, r267.xxxx, r278.xxxx ishl r270.x___, r260.xxxx, r270.xxxx iand r254.x___, r270.xxxx, r254.xxxx ine r254.x___, r254.xxxx, r282.xxxx mov r254.x___, r254.xxxx iand r254.x___, r266.xxxx, r254.xxxx iand r254.x___, r254.xxxx, r260.xxxx mov r260, r262.w000 if_logicalnz r254.xxxx mov r254, l40.xxxx iand r254.x___, r260.xxxx, r254.xxxx mov r262, l31.xxxx ishl r254.x___, r262.xxxx, r254.xxxx mov r262, l48.xxxx iand r266.x___, r260.xxxx, r262.xxxx mov r278, l23.xxxx ushr r266.x___, r266.xxxx, r278.xxxx iadd r266.x___, r257.xxxx, r266.xxxx mov r1010.x___, r266.xxxx uav_raw_load_id(11) r1011.x___, r1010.xxxx mov r266.x___, r1011.xxxx iand r254.x___, r266.xxxx, r254.xxxx mov r266, l46.xxxx ieq r254.x___, r254.xxxx, r266.xxxx if_logicalnz r254.xxxx mov r3724, l31.xxxx else iand r254.x___, r259.xxxx, r262.xxxx ushr r254.x___, r254.xxxx, r278.xxxx iadd r254.x___, r254.xxxx, r257.xxxx mov r262, l49.xxxx iadd r254.x___, r254.xxxx, r262.xxxx mov r1010.x___, r254.xxxx uav_raw_load_id(11) r1011.x___, r1010.xxxx mov r254.x___, r1011.xxxx iand r254.x___, r254.xxxx, r274.xxxx ieq r254.x___, r254.xxxx, r266.xxxx if_logicalnz r254.xxxx mov r3724, l31.xxxx else mov r3724, l46.xxxx mov r254, l48.xxxx iand r262.x___, r267.xxxx, r254.xxxx mov r266, l23.xxxx ushr r262.x___, r262.xxxx, r266.xxxx iadd r262.x___, r262.xxxx, r257.xxxx mov r274, l50.xxxx iadd r262.x___, r262.xxxx, r274.xxxx mov r1010.x___, r262.xxxx uav_raw_load_id(11) r1011.x___, r1010.xxxx mov r262.x___, r1011.xxxx iand r262.x___, r262.xxxx, r270.xxxx mov r270, l46.xxxx ieq r262.x___, r262.xxxx, r270.xxxx if_logicalnz r262.xxxx mov r3724, l31.xxxx else iand r254.x___, r264.xxxx, r254.xxxx ushr r254.x___, r254.xxxx, r266.xxxx iadd r254.x___, r254.xxxx, r257.xxxx mov r257, l51.xxxx iadd r254.x___, r254.xxxx, r257.xxxx mov r1010.x___, r254.xxxx 
uav_raw_load_id(11) r1011.x___, r1010.xxxx mov r254.x___, r1011.xxxx iand r254.x___, r254.xxxx, r265.xxxx ieq r254.x___, r254.xxxx, r270.xxxx if_logicalnz r254.xxxx mov r257, l46.xxxx ieq r257.x___, r268.xxxx, r257.xxxx if_logicalnz r257.xxxx else mov r257, l31.xxxx ieq r257.x___, r268.xxxx, r257.xxxx if_logicalnz r257.xxxx mov r254, l31.xxxx mov r1011.x___, r254.xxxx mov r1010.x___, r256.xxxx uav_raw_store_id(11) mem.x___, r1010.xxxx, r1011.xxxx mov r256, r1021.xyz0 mov r256, r256.x000 mov r257, l12.xxxx ishl r256.x___, r256.xxxx, r257.xxxx iadd r256.x___, r258.xxxx, r256.xxxx mov r1011.x___, r254.xxxx mov r1010.x___, r256.xxxx uav_raw_store_id(11) mem.x___, r1010.xxxx, r1011.xxxx else endif mov r256, r271.xxxx iadd r256, r256.x0zw, r273.0x00 mov r258, r256.x000 mov r258, r258.xxxx iadd r258, r258.xy0w, r277.00x0 mov r256, r256.y000 iadd r256, r258.x0zw, r256.0x00 mov r258, r256.x000 mov r258, r258.xxxx iadd r258, r258.xyz0, r272.000x mov r271, r256.z000 iadd r258, r258.xy0w, r271.00x0 mov r256, r256.y000 iadd r256, r258.x0zw, r256.0x00 mov r258, r1021.xyz0 mov r258, r258.x000 mov r271, l21.xxxx imul r258.x___, r258.xxxx, r271.xxxx mov r272, l30.xxxx ishl r258.x___, r258.xxxx, r272.xxxx iadd r258.x___, r258.xxxx, r263.xxxx mov r273, l54.xxxx iadd r258.x___, r258.xxxx, r273.xxxx mov r1011, r256 mov r1010.x___, r258.xxxx uav_raw_store_id(11) mem, r1010.xxxx, r1011 mov r256, r255.x000 mov r256, r256.xxxx iadd r256, r256.x0zw, r276.0x00 mov r258, r256.x000 mov r258, r258.xxxx iadd r258, r258.xy0w, r261.00x0 mov r256, r256.y000 iadd r256, r258.x0zw, r256.0x00 mov r258, r256.x000 mov r258, r258.xxxx iadd r258, r258.xyz0, r281.000x mov r261, r256.z000 iadd r258, r258.xy0w, r261.00x0 mov r256, r256.y000 iadd r256, r258.x0zw, r256.0x00 mov r258, r1021.xyz0 mov r258, r258.x000 imul r258.x___, r258.xxxx, r271.xxxx mov r261, l31.xxxx ior r258.x___, r258.xxxx, r261.xxxx ishl r258.x___, r258.xxxx, r272.xxxx iadd r258.x___, r258.xxxx, r263.xxxx iadd r258.x___, r258.xxxx, 
r273.xxxx mov r1011, r256 mov r1010.x___, r258.xxxx uav_raw_store_id(11) mem, r1010.xxxx, r1011 mov r256, r255.y000 mov r258, r275.xxxx iadd r256, r258.x0zw, r256.0x00 mov r258, r256.x000 mov r258, r258.xxxx iadd r258, r258.xy0w, r288.00x0 mov r256, r256.y000 iadd r256, r258.x0zw, r256.0x00 mov r258, r256.x000 mov r258, r258.xxxx iadd r258, r258.xyz0, r284.000x mov r261, r256.z000 iadd r258, r258.xy0w, r261.00x0 mov r256, r256.y000 iadd r256, r258.x0zw, r256.0x00 mov r258, r1021.xyz0 mov r258, r258.x000 imul r258.x___, r258.xxxx, r271.xxxx ishl r258.x___, r258.xxxx, r272.xxxx iadd r258.x___, r258.xxxx, r263.xxxx mov r261, l55.xxxx iadd r258.x___, r258.xxxx, r261.xxxx mov r1011, r256 mov r1010.x___, r258.xxxx uav_raw_store_id(11) mem, r1010.xxxx, r1011 mov r256, r286.xxxx iadd r256, r256.x0zw, r269.0x00 mov r258, r256.x000 mov r258, r258.xxxx mov r261, r255.z000 iadd r258, r258.xy0w, r261.00x0 mov r256, r256.y000 iadd r256, r258.x0zw, r256.0x00 mov r258, r256.x000 mov r258, r258.xxxx iadd r258, r258.xyz0, r260.000x mov r260, r256.z000 iadd r258, r258.xy0w, r260.00x0 mov r256, r256.y000 iadd r256, r258.x0zw, r256.0x00 mov r258, r1021.xyz0 mov r258, r258.x000 imul r258.x___, r258.xxxx, r271.xxxx ishl r258.x___, r258.xxxx, r272.xxxx iadd r258.x___, r258.xxxx, r263.xxxx mov r260, l29.xxxx iadd r258.x___, r258.xxxx, r260.xxxx mov r1011, r256 mov r1010.x___, r258.xxxx uav_raw_store_id(11) mem, r1010.xxxx, r1011 mov r256, r259.xxxx iadd r256, r256.x0zw, r267.0x00 mov r258, r256.x000 mov r258, r258.xxxx iadd r258, r258.xy0w, r264.00x0 mov r256, r256.y000 iadd r256, r258.x0zw, r256.0x00 mov r258, r256.x000 mov r258, r258.xxxx mov r255, r255.w000 iadd r255, r258.xyz0, r255.000x mov r258, r256.z000 iadd r255, r255.xy0w, r258.00x0 mov r256, r256.y000 iadd r255, r255.x0zw, r256.0x00 mov r256, r1021.xyz0 mov r256, r256.x000 imul r256.x___, r256.xxxx, r271.xxxx ishl r256.x___, r256.xxxx, r272.xxxx iadd r256.x___, r256.xxxx, r263.xxxx mov r258, l56.xxxx iadd r256.x___, r256.xxxx, 
r258.xxxx mov r1011, r255 mov r1010.x___, r256.xxxx uav_raw_store_id(11) mem, r1010.xxxx, r1011 endif else mov r254, l31.xxxx mov r1011.x___, r254.xxxx mov r1010.x___, r256.xxxx uav_raw_store_id(11) mem.x___, r1010.xxxx, r1011.xxxx mov r256, r1021.xyz0 mov r256, r256.x000 mov r257, l12.xxxx ishl r256.x___, r256.xxxx, r257.xxxx iadd r256.x___, r258.xxxx, r256.xxxx mov r1011.x___, r254.xxxx mov r1010.x___, r256.xxxx uav_raw_store_id(11) mem.x___, r1010.xxxx, r1011.xxxx mov r256, r271.xxxx iadd r256, r256.x0zw, r273.0x00 mov r258, r256.x000 mov r258, r258.xxxx iadd r258, r258.xy0w, r277.00x0 mov r256, r256.y000 iadd r256, r258.x0zw, r256.0x00 mov r258, r256.x000 mov r258, r258.xxxx iadd r258, r258.xyz0, r272.000x mov r271, r256.z000 iadd r258, r258.xy0w, r271.00x0 mov r256, r256.y000 iadd r256, r258.x0zw, r256.0x00 mov r258, r1021.xyz0 mov r258, r258.x000 mov r271, l21.xxxx imul r258.x___, r258.xxxx, r271.xxxx mov r272, l30.xxxx ishl r258.x___, r258.xxxx, r272.xxxx iadd r258.x___, r258.xxxx, r263.xxxx mov r273, l54.xxxx iadd r258.x___, r258.xxxx, r273.xxxx mov r1011, r256 mov r1010.x___, r258.xxxx uav_raw_store_id(11) mem, r1010.xxxx, r1011 mov r256, r255.x000 mov r256, r256.xxxx iadd r256, r256.x0zw, r276.0x00 mov r258, r256.x000 mov r258, r258.xxxx iadd r258, r258.xy0w, r261.00x0 mov r256, r256.y000 iadd r256, r258.x0zw, r256.0x00 mov r258, r256.x000 mov r258, r258.xxxx iadd r258, r258.xyz0, r281.000x mov r261, r256.z000 iadd r258, r258.xy0w, r261.00x0 mov r256, r256.y000 iadd r256, r258.x0zw, r256.0x00 mov r258, r1021.xyz0 mov r258, r258.x000 imul r258.x___, r258.xxxx, r271.xxxx mov r261, l31.xxxx ior r258.x___, r258.xxxx, r261.xxxx ishl r258.x___, r258.xxxx, r272.xxxx iadd r258.x___, r258.xxxx, r263.xxxx iadd r258.x___, r258.xxxx, r273.xxxx mov r1011, r256 mov r1010.x___, r258.xxxx uav_raw_store_id(11) mem, r1010.xxxx, r1011 mov r256, r255.y000 mov r258, r275.xxxx iadd r256, r258.x0zw, r256.0x00 mov r258, r256.x000 mov r258, r258.xxxx iadd r258, r258.xy0w, 
r288.00x0 mov r256, r256.y000 iadd r256, r258.x0zw, r256.0x00 mov r258, r256.x000 mov r258, r258.xxxx iadd r258, r258.xyz0, r284.000x mov r261, r256.z000 iadd r258, r258.xy0w, r261.00x0 mov r256, r256.y000 iadd r256, r258.x0zw, r256.0x00 mov r258, r1021.xyz0 mov r258, r258.x000 imul r258.x___, r258.xxxx, r271.xxxx ishl r258.x___, r258.xxxx, r272.xxxx iadd r258.x___, r258.xxxx, r263.xxxx mov r261, l55.xxxx iadd r258.x___, r258.xxxx, r261.xxxx mov r1011, r256 mov r1010.x___, r258.xxxx uav_raw_store_id(11) mem, r1010.xxxx, r1011 mov r256, r286.xxxx iadd r256, r256.x0zw, r269.0x00 mov r258, r256.x000 mov r258, r258.xxxx mov r261, r255.z000 iadd r258, r258.xy0w, r261.00x0 mov r256, r256.y000 iadd r256, r258.x0zw, r256.0x00 mov r258, r256.x000 mov r258, r258.xxxx iadd r258, r258.xyz0, r260.000x mov r260, r256.z000 iadd r258, r258.xy0w, r260.00x0 mov r256, r256.y000 iadd r256, r258.x0zw, r256.0x00 mov r258, r1021.xyz0 mov r258, r258.x000 imul r258.x___, r258.xxxx, r271.xxxx ishl r258.x___, r258.xxxx, r272.xxxx iadd r258.x___, r258.xxxx, r263.xxxx mov r260, l29.xxxx iadd r258.x___, r258.xxxx, r260.xxxx mov r1011, r256 mov r1010.x___, r258.xxxx uav_raw_store_id(11) mem, r1010.xxxx, r1011 mov r256, r259.xxxx iadd r256, r256.x0zw, r267.0x00 mov r258, r256.x000 mov r258, r258.xxxx iadd r258, r258.xy0w, r264.00x0 mov r256, r256.y000 iadd r256, r258.x0zw, r256.0x00 mov r258, r256.x000 mov r258, r258.xxxx mov r255, r255.w000 iadd r255, r258.xyz0, r255.000x mov r258, r256.z000 iadd r255, r255.xy0w, r258.00x0 mov r256, r256.y000 iadd r255, r255.x0zw, r256.0x00 mov r256, r1021.xyz0 mov r256, r256.x000 imul r256.x___, r256.xxxx, r271.xxxx ishl r256.x___, r256.xxxx, r272.xxxx iadd r256.x___, r256.xxxx, r263.xxxx mov r258, l56.xxxx iadd r256.x___, r256.xxxx, r258.xxxx mov r1011, r255 mov r1010.x___, r256.xxxx uav_raw_store_id(11) mem, r1010.xxxx, r1011 endif endif endif endif else mov r3724, l31.xxxx endif if_logicalnz r3724.xxxx mov r257, l46.xxxx ieq r257.x___, r268.xxxx, r257.xxxx 
if_logicalnz r257.xxxx else mov r257, l31.xxxx ieq r257.x___, r268.xxxx, r257.xxxx if_logicalnz r257.xxxx mov r254, l31.xxxx mov r1011.x___, r254.xxxx mov r1010.x___, r256.xxxx uav_raw_store_id(11) mem.x___, r1010.xxxx, r1011.xxxx mov r256, r1021.xyz0 mov r256, r256.x000 mov r257, l12.xxxx ishl r256.x___, r256.xxxx, r257.xxxx iadd r256.x___, r258.xxxx, r256.xxxx mov r1011.x___, r254.xxxx mov r1010.x___, r256.xxxx uav_raw_store_id(11) mem.x___, r1010.xxxx, r1011.xxxx else endif mov r256, r271.xxxx iadd r256, r256.x0zw, r273.0x00 mov r258, r256.x000 mov r258, r258.xxxx iadd r258, r258.xy0w, r277.00x0 mov r256, r256.y000 iadd r256, r258.x0zw, r256.0x00 mov r258, r256.x000 mov r258, r258.xxxx iadd r258, r258.xyz0, r272.000x mov r271, r256.z000 iadd r258, r258.xy0w, r271.00x0 mov r256, r256.y000 iadd r256, r258.x0zw, r256.0x00 mov r258, r1021.xyz0 mov r258, r258.x000 mov r271, l21.xxxx imul r258.x___, r258.xxxx, r271.xxxx mov r272, l30.xxxx ishl r258.x___, r258.xxxx, r272.xxxx iadd r258.x___, r258.xxxx, r263.xxxx mov r273, l54.xxxx iadd r258.x___, r258.xxxx, r273.xxxx mov r1011, r256 mov r1010.x___, r258.xxxx uav_raw_store_id(11) mem, r1010.xxxx, r1011 mov r256, r255.x000 mov r256, r256.xxxx iadd r256, r256.x0zw, r276.0x00 mov r258, r256.x000 mov r258, r258.xxxx iadd r258, r258.xy0w, r261.00x0 mov r256, r256.y000 iadd r256, r258.x0zw, r256.0x00 mov r258, r256.x000 mov r258, r258.xxxx iadd r258, r258.xyz0, r281.000x mov r261, r256.z000 iadd r258, r258.xy0w, r261.00x0 mov r256, r256.y000 iadd r256, r258.x0zw, r256.0x00 mov r258, r1021.xyz0 mov r258, r258.x000 imul r258.x___, r258.xxxx, r271.xxxx mov r261, l31.xxxx ior r258.x___, r258.xxxx, r261.xxxx ishl r258.x___, r258.xxxx, r272.xxxx iadd r258.x___, r258.xxxx, r263.xxxx iadd r258.x___, r258.xxxx, r273.xxxx mov r1011, r256 mov r1010.x___, r258.xxxx uav_raw_store_id(11) mem, r1010.xxxx, r1011 mov r256, r255.y000 mov r258, r275.xxxx iadd r256, r258.x0zw, r256.0x00 mov r258, r256.x000 mov r258, r258.xxxx iadd r258, 
r258.xy0w, r288.00x0 mov r256, r256.y000 iadd r256, r258.x0zw, r256.0x00 mov r258, r256.x000 mov r258, r258.xxxx iadd r258, r258.xyz0, r284.000x mov r261, r256.z000 iadd r258, r258.xy0w, r261.00x0 mov r256, r256.y000 iadd r256, r258.x0zw, r256.0x00 mov r258, r1021.xyz0 mov r258, r258.x000 imul r258.x___, r258.xxxx, r271.xxxx ishl r258.x___, r258.xxxx, r272.xxxx iadd r258.x___, r258.xxxx, r263.xxxx mov r261, l55.xxxx iadd r258.x___, r258.xxxx, r261.xxxx mov r1011, r256 mov r1010.x___, r258.xxxx uav_raw_store_id(11) mem, r1010.xxxx, r1011 mov r256, r286.xxxx iadd r256, r256.x0zw, r269.0x00 mov r258, r256.x000 mov r258, r258.xxxx mov r261, r255.z000 iadd r258, r258.xy0w, r261.00x0 mov r256, r256.y000 iadd r256, r258.x0zw, r256.0x00 mov r258, r256.x000 mov r258, r258.xxxx iadd r258, r258.xyz0, r260.000x mov r260, r256.z000 iadd r258, r258.xy0w, r260.00x0 mov r256, r256.y000 iadd r256, r258.x0zw, r256.0x00 mov r258, r1021.xyz0 mov r258, r258.x000 imul r258.x___, r258.xxxx, r271.xxxx ishl r258.x___, r258.xxxx, r272.xxxx iadd r258.x___, r258.xxxx, r263.xxxx mov r260, l29.xxxx iadd r258.x___, r258.xxxx, r260.xxxx mov r1011, r256 mov r1010.x___, r258.xxxx uav_raw_store_id(11) mem, r1010.xxxx, r1011 mov r256, r259.xxxx iadd r256, r256.x0zw, r267.0x00 mov r258, r256.x000 mov r258, r258.xxxx iadd r258, r258.xy0w, r264.00x0 mov r256, r256.y000 iadd r256, r258.x0zw, r256.0x00 mov r258, r256.x000 mov r258, r258.xxxx mov r255, r255.w000 iadd r255, r258.xyz0, r255.000x mov r258, r256.z000 iadd r255, r255.xy0w, r258.00x0 mov r256, r256.y000 iadd r255, r255.x0zw, r256.0x00 mov r256, r1021.xyz0 mov r256, r256.x000 imul r256.x___, r256.xxxx, r271.xxxx ishl r256.x___, r256.xxxx, r272.xxxx iadd r256.x___, r256.xxxx, r263.xxxx mov r258, l56.xxxx iadd r256.x___, r256.xxxx, r258.xxxx mov r1011, r255 mov r1010.x___, r256.xxxx uav_raw_store_id(11) mem, r1010.xxxx, r1011 endif else endif ret endfunc ; sha1_short ;ARGSTART:sha1_short ;uniqueid:1028 ;memory:hwregion:0 ;memory:hwlocal:0 
;ARGEND:sha1_short end

                                                                                                      • Can I use BFI_INT directly from IL ?
                                                                                                        gat3way

                                                                                                        That's the corresponding ISA (I cut everything after the first BFI_INT occurrences, since pasting that long text here somehow crashes my browser lol)

                                                                                                         

                                                                                                        P.S. In case you are wondering, that's Catalyst 11.9 / SDK 2.5 running on Linux, x86.

                                                                                                        TargetChip = c ; ------------- SC_SRCSHADER Dump ------------------ SC_SHADERSTATE: u32NumIntVSConst = 0 SC_SHADERSTATE: u32NumIntPSConst = 0 SC_SHADERSTATE: u32NumIntGSConst = 0 SC_SHADERSTATE: u32NumBoolVSConst = 0 SC_SHADERSTATE: u32NumBoolPSConst = 0 SC_SHADERSTATE: u32NumBoolGSConst = 0 SC_SHADERSTATE: u32NumFloatVSConst = 0 SC_SHADERSTATE: u32NumFloatPSConst = 0 SC_SHADERSTATE: u32NumFloatGSConst = 0 fConstantsAvailable = 0 iConstantsAvailable = 0 bConstantsAvailable = 0 u32SCOptions[0] = 0x01A00000 SCOption_IGNORE_SAMPLE_L_BUG SCOption_FLOAT_DO_NOT_DIST SCOption_FLOAT_DO_NOT_REASSOC u32SCOptions[1] = 0x00202000 SCOption_R600_ERROR_ON_DOUBLE_MEMEXP SCOption_SET_VPM_FOR_SCATTER u32SCOptions[2] = 0x00000045 SCOption_R800_UAV_NONARRAY_FIXUP SCOption_R8XX_CF_ALU_STACK_ENTRY_WORKAROUND SCOption_R800_UAV_NONUAV_SYNC_WORKAROUND_BUG216513_1 ; -------- Disassembly -------------------- 00 ALU: ADDR(544) CNT(37) KCACHE0(CB1:0-15) KCACHE1(CB0:0-15) 0 x: MOV R25.x, KC0[7].x y: LSHL T0.y, KC0[2].x, 3 w: SETGT_INT R24.w, KC0[2].x, 5 t: MOV R25.y, 0.0f 1 x: MOV R26.x, KC0[5].x y: MOV R2.y, KC0[5].y z: MOV R2.z, KC0[5].z w: MOV R2.w, KC0[5].w t: LSHR R27.x, PV0.x, 2 2 y: MOV R2.y, KC0[7].y z: MOV R2.z, KC0[7].z w: MOV R2.w, KC0[7].w t: ADD_INT R23.z, T0.y, 32 3 x: MOV R28.x, KC0[1].w y: MOV R29.y, KC0[1].z z: MOV R29.z, KC0[1].z w: MOV R29.w, KC0[1].z t: MOV R29.x, KC0[1].z 4 x: MOV R30.x, KC0[1].y y: MOV R30.y, KC0[1].y z: MOV R30.z, KC0[1].y w: MOV R30.w, KC0[1].y t: MOV R31.x, KC0[1].x 5 y: MOV R31.y, KC0[1].x z: MOV R31.z, KC0[1].x w: MOV R31.w, KC0[1].x t: MULLO_INT ____, R1.x, KC1[1].x 6 y: ADD_INT ____, R0.x, PS5 7 w: ADD_INT R27.w, PV6.y, KC1[6].x 8 z: LSHL R27.z, PV7.w, 2 9 y: ADD_INT ____, KC0[8].x, PV8.z 10 z: LSHR R0.z, PV9.y, 2 01 TEX: ADDR(10800) CNT(1) 11 VFETCH R32.x___, R0.z, fc153 MEGA(4) FETCH_TYPE(NO_INDEX_OFFSET) 02 ALU_PUSH_BEFORE: ADDR(581) CNT(10) 12 
x: LSHR R33.x, R32.x, 24 y: BFE_UINT R26.y, R32.x, 0x00000008, 0x00000008 z: LSHR R24.z, R32.x, 8 w: BFE_UINT R25.w, R32.x, 0x00000010, 0x00000008 t: LSHR R27.y, R32.x, 16 13 w: AND_INT R26.w, R32.x, 0x000000FF 14 x: PREDNE_INT ____, R24.w, 0.0f UPDATE_EXEC_MASK UPDATE_PRED 03 JUMP ADDR(42) 04 ALU_PUSH_BEFORE: ADDR(591) CNT(2) KCACHE0(CB1:0-15) 15 x: PREDGT_INT ____, KC0[2].x, 8 UPDATE_EXEC_MASK UPDATE_PRED 05 JUMP ADDR(23) 06 ALU_PUSH_BEFORE: ADDR(593) CNT(2) KCACHE0(CB1:0-15) 16 x: PREDE_INT ____, KC0[2].x, 9 UPDATE_EXEC_MASK UPDATE_PRED 07 JUMP ADDR(9) 08 ALU: ADDR(595) CNT(20) KCACHE0(CB1:0-15) 17 x: LSHL T0.x, R26.y, 16 z: LSHL ____, R32.x, 8 w: LSHL ____, R26.w, 8 t: OR_INT R4.x, R33.x, 0x00008000 18 x: OR_INT ____, KC0[3].z, PV17.w y: OR_INT ____, KC0[3].y, PV17.w z: OR_INT ____, KC0[3].x, PV17.w w: OR_INT ____, KC0[3].w, PV17.w t: AND_INT T0.y, PV17.z, 0xFF000000 19 x: OR_INT ____, T0.x, PV18.x y: OR_INT ____, T0.x, PV18.y z: OR_INT ____, T0.x, PV18.z w: OR_INT ____, T0.x, PV18.w 20 x: OR_INT R2.x, PV19.z, T0.y y: OR_INT R2.y, PV19.y, T0.y z: OR_INT R2.z, PV19.x, T0.y w: OR_INT R2.w, PV19.w, T0.y 09 ELSE POP_CNT(1) ADDR(22) 10 PUSH ADDR(22) POP_CNT(1) 11 ALU: ADDR(615) CNT(2) KCACHE0(CB1:0-15) 21 x: PREDE_INT ____, KC0[2].x, 10 UPDATE_EXEC_MASK UPDATE_PRED 12 JUMP ADDR(14) 13 ALU: ADDR(617) CNT(23) KCACHE0(CB1:0-15) 22 x: LSHL ____, KC0[3].x, 8 y: LSHL T0.y, R26.w, 16 z: LSHL ____, KC0[3].y, 8 w: LSHL ____, R32.x, 16 t: LSHL ____, KC0[3].z, 8 23 x: OR_INT ____, PV22.y, PV22.x y: LSHL ____, KC0[3].w, 8 z: AND_INT T0.z, PV22.w, 0xFF000000 w: OR_INT ____, PV22.y, PV22.z t: OR_INT ____, PV22.y, PS22 24 x: OR_INT ____, PV23.z, PV23.x y: OR_INT ____, T0.y, PV23.y z: OR_INT ____, PV23.z, PS23 w: OR_INT ____, PV23.z, PV23.w t: OR_INT R4.x, R27.y, 0x00800000 25 x: OR_INT R2.x, KC0[1].z, PV24.x y: OR_INT ____, T0.z, PV24.y z: OR_INT R2.z, KC0[1].z, PV24.z t: OR_INT R2.y, KC0[1].z, PV24.w 26 w: OR_INT R2.w, KC0[1].z, PV25.y 14 ELSE ADDR(21) 15 PUSH ADDR(22) POP_CNT(2) 
16 ALU: ADDR(640) CNT(2) KCACHE0(CB1:0-15) 27 x: PREDE_INT ____, KC0[2].x, 11 UPDATE_EXEC_MASK UPDATE_PRED 17 JUMP ADDR(19) 18 ALU: ADDR(642) CNT(21) KCACHE0(CB1:0-15) 28 x: LSHL T0.x, R32.x, 24 y: LSHL ____, KC0[3].x, 16 z: AND_INT ____, R24.z, 0x00FF00FF w: LSHL ____, R25.w, 8 t: LSHL ____, KC0[3].y, 16 29 x: LSHL ____, KC0[3].w, 16 y: LSHL ____, KC0[3].z, 16 z: OR_INT ____, PV28.w, PV28.z w: OR_INT T0.w, PV28.x, PV28.y t: OR_INT T0.z, PV28.x, PS28 30 x: OR_INT ____, T0.x, PV29.x y: OR_INT ____, T0.x, PV29.y t: OR_INT R4.x, PV29.z, 0x80000000 31 x: OR_INT R2.x, KC0[1].z, T0.w y: OR_INT R2.y, KC0[1].z, T0.z z: OR_INT R2.z, KC0[1].z, PV30.y w: OR_INT R2.w, KC0[1].z, PV30.x 19 ELSE POP_CNT(1) ADDR(21) 20 ALU_POP_AFTER: ADDR(663) CNT(5) 32 x: MOV R4.x, R28.x 33 x: MOV R2.x, R29.x y: MOV R2.y, R29.y z: MOV R2.z, R29.z w: MOV R2.w, R29.w 21 POP (2) ADDR(22) 22 ALU: ADDR(668) CNT(4) 34 x: MOV R0.x, R30.x y: MOV R0.y, R30.y z: MOV R0.z, R30.z w: MOV R0.w, R30.w 23 ELSE POP_CNT(1) ADDR(41) 24 ALU_PUSH_BEFORE: ADDR(672) CNT(2) KCACHE0(CB1:0-15) 35 x: PREDE_INT ____, KC0[2].x, 6 UPDATE_EXEC_MASK UPDATE_PRED 25 JUMP ADDR(27) 26 ALU: ADDR(674) CNT(28) KCACHE0(CB1:0-15) 36 x: LSHL ____, KC0[3].x, 8 y: LSHL T0.y, R26.w, 16 z: LSHL ____, KC0[3].y, 8 w: LSHL ____, R32.x, 16 t: LSHL ____, KC0[3].z, 8 37 x: OR_INT ____, PV36.y, PV36.x y: LSHL ____, KC0[3].w, 8 z: AND_INT T0.z, PV36.w, 0xFF000000 w: OR_INT ____, PV36.y, PV36.z t: OR_INT ____, PV36.y, PS36 38 x: OR_INT ____, PV37.z, PV37.x y: OR_INT ____, T0.y, PV37.y z: OR_INT ____, PV37.z, PS37 w: OR_INT ____, PV37.z, PV37.w t: OR_INT T0.x, R27.y, 0x00800000 39 x: OR_INT R0.x, KC0[1].y, PV38.x y: OR_INT ____, T0.z, PV38.y z: OR_INT R0.z, KC0[1].y, PV38.z t: OR_INT R0.y, KC0[1].y, PV38.w 40 x: MOV R2.x, T0.x y: MOV R2.y, T0.x z: MOV R2.z, T0.x w: OR_INT R0.w, KC0[1].y, PV39.y t: MOV R2.w, T0.x 41 x: MOV R4.x, R28.x 27 ELSE POP_CNT(1) ADDR(40) 28 PUSH ADDR(41) POP_CNT(2) 29 ALU: ADDR(702) CNT(2) KCACHE0(CB1:0-15) 42 x: PREDE_INT 
____, KC0[2].x, 7 UPDATE_EXEC_MASK UPDATE_PRED 30 JUMP ADDR(32) 31 ALU: ADDR(704) CNT(26) KCACHE0(CB1:0-15) 43 x: LSHL ____, R25.w, 8 y: LSHL T0.y, R32.x, 24 z: LSHL ____, KC0[3].x, 16 w: AND_INT ____, R24.z, 0x00FF00FF t: LSHL ____, KC0[3].y, 16 44 x: LSHL ____, KC0[3].w, 16 y: LSHL ____, KC0[3].z, 16 z: OR_INT ____, PV43.x, PV43.w w: OR_INT ____, PV43.y, PV43.z t: OR_INT T0.z, PV43.y, PS43 45 x: OR_INT T0.x, T0.y, PV44.y y: OR_INT ____, PV44.z, 0x80000000 z: OR_INT T1.z, T0.y, PV44.x t: OR_INT R0.x, KC0[1].y, PV44.w 46 x: MOV R2.x, PV45.y y: MOV R2.y, PV45.y z: MOV R2.z, PV45.y w: MOV R2.w, PV45.y t: OR_INT R0.y, KC0[1].y, T0.z 47 z: OR_INT R0.z, KC0[1].y, T0.x w: OR_INT R0.w, KC0[1].y, T1.z 48 x: MOV R4.x, R28.x 32 ELSE ADDR(39) 33 PUSH ADDR(41) POP_CNT(3) 34 ALU: ADDR(730) CNT(2) KCACHE0(CB1:0-15) 49 x: PREDE_INT ____, KC0[2].x, 8 UPDATE_EXEC_MASK UPDATE_PRED 35 JUMP ADDR(37) 36 ALU: ADDR(732) CNT(21) KCACHE0(CB1:0-15) 50 x: AND_INT ____, R32.x, 0xFF0000FF y: LSHL ____, R26.y, 8 z: LSHL T0.z, R25.w, 16 w: LSHL T0.w, KC0[3].x, 24 t: LSHL T1.z, KC0[3].y, 24 51 x: LSHL ____, KC0[3].w, 24 y: LSHL ____, KC0[3].z, 24 w: OR_INT ____, PV50.y, PV50.x t: MOV R4.x, (0x00000080, 1.793662034e-43f).y 52 x: OR_INT R0.x, KC0[1].y, T0.w y: OR_INT ____, PV51.w, T0.z VEC_021 z: OR_INT R0.z, KC0[1].y, PV51.y w: OR_INT R0.w, KC0[1].y, PV51.x t: OR_INT R0.y, KC0[1].y, T1.z 53 x: MOV R2.x, PV52.y y: MOV R2.y, PV52.y z: MOV R2.z, PV52.y w: MOV R2.w, PV52.y 37 ELSE POP_CNT(1) ADDR(39) 38 ALU_POP_AFTER: ADDR(753) CNT(9) 54 x: MOV R4.x, R28.x 55 x: MOV R2.x, R29.x y: MOV R2.y, R29.y z: MOV R2.z, R29.z w: MOV R2.w, R29.w 56 x: MOV R0.x, R30.x y: MOV R0.y, R30.y z: MOV R0.z, R30.z w: MOV R0.w, R30.w 39 POP (2) ADDR(40) 40 POP (1) ADDR(41) 41 ALU: ADDR(762) CNT(4) 57 x: MOV R1.x, R31.x y: MOV R1.y, R31.y z: MOV R1.z, R31.z w: MOV R1.w, R31.w 42 ELSE POP_CNT(1) ADDR(76) 43 ALU_PUSH_BEFORE: ADDR(766) CNT(2) KCACHE0(CB1:0-15) 58 x: PREDGT_INT ____, KC0[2].x, 2 UPDATE_EXEC_MASK UPDATE_PRED 44 
JUMP ADDR(62) 45 ALU_PUSH_BEFORE: ADDR(768) CNT(2) KCACHE0(CB1:0-15) 59 x: PREDE_INT ____, KC0[2].x, 3 UPDATE_EXEC_MASK UPDATE_PRED 46 JUMP ADDR(48) 47 ALU: ADDR(770) CNT(29) KCACHE0(CB1:0-15) 60 x: LSHL ____, R25.w, 8 y: LSHL T0.y, R32.x, 24 z: LSHL ____, KC0[3].x, 16 w: AND_INT ____, R24.z, 0x00FF00FF t: LSHL ____, KC0[3].y, 16 61 x: LSHL ____, KC0[3].w, 16 y: LSHL ____, KC0[3].z, 16 z: OR_INT ____, PV60.x, PV60.w w: OR_INT ____, PV60.y, PV60.z t: OR_INT T0.z, PV60.y, PS60 62 x: OR_INT T0.x, T0.y, PV61.y y: OR_INT ____, PV61.z, 0x80000000 z: OR_INT T1.z, T0.y, PV61.x t: OR_INT R1.x, KC0[1].x, PV61.w 63 x: MOV R0.x, PV62.y y: MOV R0.y, PV62.y z: MOV R0.z, PV62.y w: MOV R0.w, PV62.y t: OR_INT R1.y, KC0[1].x, T0.z 64 z: OR_INT R1.z, KC0[1].x, T0.x w: OR_INT R1.w, KC0[1].x, T1.z 65 x: MOV R2.x, R29.x y: MOV R2.y, R29.y z: MOV R2.z, R29.z w: MOV R2.w, R29.w 48 ELSE POP_CNT(1) ADDR(62) 49 PUSH ADDR(62) POP_CNT(1) 50 ALU: ADDR(799) CNT(2) KCACHE0(CB1:0-15) 66 x: PREDE_INT ____, KC0[2].x, 4 UPDATE_EXEC_MASK UPDATE_PRED 51 JUMP ADDR(53) 52 ALU: ADDR(801) CNT(25) KCACHE0(CB1:0-15) 67 x: LSHL ____, R26.y, 8 y: LSHL T0.y, R25.w, 16 z: LSHL T0.z, KC0[3].x, 24 w: AND_INT ____, R32.x, 0xFF0000FF t: LSHL T1.z, KC0[3].y, 24 68 x: LSHL T0.x, KC0[3].w, 24 y: LSHL T1.y, KC0[3].z, 24 z: OR_INT ____, PV67.x, PV67.w t: MOV R2.x, (0x00000080, 1.793662034e-43f).y 69 x: OR_INT ____, PV68.z, T0.y y: MOV R2.y, (0x00000080, 1.793662034e-43f).x z: MOV R2.z, (0x00000080, 1.793662034e-43f).x w: MOV R2.w, (0x00000080, 1.793662034e-43f).x t: OR_INT R1.x, KC0[1].x, T0.z 70 x: MOV R0.x, PV69.x y: MOV R0.y, PV69.x z: MOV R0.z, PV69.x w: MOV R0.w, PV69.x t: OR_INT R1.y, KC0[1].x, T1.z 71 z: OR_INT R1.z, KC0[1].x, T1.y w: OR_INT R1.w, KC0[1].x, T0.x 53 ELSE POP_CNT(1) ADDR(61) 54 PUSH ADDR(60) 55 ALU: ADDR(826) CNT(2) KCACHE0(CB1:0-15) 72 x: PREDE_INT ____, KC0[2].x, 5 UPDATE_EXEC_MASK UPDATE_PRED 56 JUMP ADDR(58) 57 ALU: ADDR(828) CNT(24) KCACHE0(CB1:0-15) 73 x: LSHL T0.x, R26.y, 16 y: OR_INT T1.y, 
R33.x, 0x00008000 z: LSHL ____, R32.x, 8 VEC_120 w: LSHL ____, R26.w, 8 74 x: OR_INT ____, KC0[3].z, PV73.w y: OR_INT ____, KC0[3].y, PV73.w z: OR_INT ____, KC0[3].x, PV73.w w: OR_INT ____, KC0[3].w, PV73.w t: AND_INT T0.y, PV73.z, 0xFF000000 75 x: OR_INT ____, T0.x, PV74.x y: OR_INT ____, T0.x, PV74.y z: OR_INT ____, T0.x, PV74.z w: OR_INT ____, T0.x, PV74.w t: MOV R2.x, T1.y 76 x: OR_INT R0.x, PV75.z, T0.y y: OR_INT R0.y, PV75.y, T0.y z: OR_INT R0.z, PV75.x, T0.y w: OR_INT R0.w, PV75.w, T0.y t: MOV R2.y, T1.y 77 z: MOV R2.z, T1.y w: MOV R2.w, T1.y 58 ELSE POP_CNT(1) ADDR(60) 59 ALU_POP_AFTER: ADDR(852) CNT(8) 78 x: MOV R2.x, R29.x y: MOV R2.y, R29.y z: MOV R2.z, R29.z w: MOV R2.w, R29.w 79 x: MOV R0.x, R30.x y: MOV R0.y, R30.y z: MOV R0.z, R30.z w: MOV R0.w, R30.w 60 ALU_POP_AFTER: ADDR(860) CNT(4) 80 x: MOV R1.x, R31.x y: MOV R1.y, R31.y z: MOV R1.z, R31.z w: MOV R1.w, R31.w 61 POP (1) ADDR(62) 62 ELSE POP_CNT(1) ADDR(75) 63 ALU_PUSH_BEFORE: ADDR(864) CNT(1) KCACHE0(CB1:0-15) 81 x: PREDE_INT ____, KC0[2].x, 1 UPDATE_EXEC_MASK UPDATE_PRED 64 JUMP ADDR(66) 65 ALU: ADDR(865) CNT(24) KCACHE0(CB1:0-15) 82 x: LSHL T0.x, R26.y, 16 y: OR_INT T1.y, R33.x, 0x00008000 z: LSHL ____, R32.x, 8 VEC_120 w: LSHL ____, R26.w, 8 83 x: OR_INT ____, KC0[3].z, PV82.w y: OR_INT ____, KC0[3].y, PV82.w z: OR_INT ____, KC0[3].x, PV82.w w: OR_INT ____, KC0[3].w, PV82.w t: AND_INT T0.y, PV82.z, 0xFF000000 84 x: OR_INT ____, T0.x, PV83.x y: OR_INT ____, T0.x, PV83.y z: OR_INT ____, T0.x, PV83.z w: OR_INT ____, T0.x, PV83.w t: MOV R0.x, T1.y 85 x: OR_INT R1.x, PV84.z, T0.y y: OR_INT R1.y, PV84.y, T0.y z: OR_INT R1.z, PV84.x, T0.y w: OR_INT R1.w, PV84.w, T0.y t: MOV R0.y, T1.y 86 z: MOV R0.z, T1.y w: MOV R0.w, T1.y 66 ELSE POP_CNT(1) ADDR(74) 67 PUSH ADDR(74) POP_CNT(1) 68 ALU: ADDR(889) CNT(2) KCACHE0(CB1:0-15) 87 x: PREDE_INT ____, KC0[2].x, 2 UPDATE_EXEC_MASK UPDATE_PRED 69 JUMP ADDR(71) 70 ALU: ADDR(891) CNT(27) KCACHE0(CB1:0-15) 88 x: LSHL ____, KC0[3].x, 8 y: LSHL T0.y, R26.w, 16 z: 
LSHL ____, KC0[3].y, 8 w: LSHL ____, R32.x, 16 t: LSHL ____, KC0[3].z, 8 89 x: OR_INT ____, PV88.y, PV88.x y: LSHL ____, KC0[3].w, 8 z: AND_INT T0.z, PV88.w, 0xFF000000 w: OR_INT ____, PV88.y, PV88.z t: OR_INT ____, PV88.y, PS88 90 x: OR_INT ____, PV89.z, PV89.x y: OR_INT ____, T0.y, PV89.y z: OR_INT ____, PV89.z, PS89 w: OR_INT ____, PV89.z, PV89.w t: OR_INT T0.x, R27.y, 0x00800000 91 x: OR_INT R1.x, KC0[1].x, PV90.x y: OR_INT ____, T0.z, PV90.y z: OR_INT R1.z, KC0[1].x, PV90.z t: OR_INT R1.y, KC0[1].x, PV90.w 92 x: MOV R0.x, T0.x y: MOV R0.y, T0.x z: MOV R0.z, T0.x w: OR_INT R1.w, KC0[1].x, PV91.y t: MOV R0.w, T0.x 71 ELSE POP_CNT(1) ADDR(73) 72 ALU_POP_AFTER: ADDR(918) CNT(8) 93 x: MOV R0.x, R30.x y: MOV R0.y, R30.y z: MOV R0.z, R30.z w: MOV R0.w, R30.w 94 x: MOV R1.x, R31.x y: MOV R1.y, R31.y z: MOV R1.z, R31.z w: MOV R1.w, R31.w 73 POP (1) ADDR(74) 74 ALU_POP_AFTER: ADDR(926) CNT(4) 95 x: MOV R2.x, R29.x y: MOV R2.y, R29.y z: MOV R2.z, R29.z w: MOV R2.w, R29.w 75 ALU_POP_AFTER: ADDR(930) CNT(1) 96 x: MOV R4.x, R28.x 76 ALU: ADDR(931) CNT(120) 97 x: BIT_ALIGN_INT T0.x, R1.x, R1.x, 0x00000018 y: BIT_ALIGN_INT T0.y, R1.w, R1.w, 0x00000018 z: BIT_ALIGN_INT T0.z, R1.z, R1.z, 0x00000018 w: BIT_ALIGN_INT T0.w, R1.y, R1.y, 0x00000018 98 x: BIT_ALIGN_INT R123.x, R1.x, R1.x, 0x00000008 y: BIT_ALIGN_INT R123.y, R1.w, R1.w, 0x00000008 z: BIT_ALIGN_INT R123.z, R1.z, R1.z, 0x00000008 w: BIT_ALIGN_INT R123.w, R1.y, R1.y, 0x00000008 99 x: BFI_INT R7.x, 0x00FF00FF, T0.x, PV98.x y: BFI_INT R8.y, 0x00FF00FF, T0.y, PV98.y z: BFI_INT R8.z, 0x00FF00FF, T0.z, PV98.z w: BFI_INT R7.w, 0x00FF00FF, T0.w, PV98.w 100 y: BIT_ALIGN_INT R28.y, 0x67452301, 0x67452301, 0x00000002 z: BIT_ALIGN_INT R25.z, 0xEFCDAB89, 0xEFCDAB89, 0x00000002 w: BIT_ALIGN_INT R123.w, 0x67452301, 0x67452301, 0x0000001B 101 x: BIT_ALIGN_INT T0.x, R0.x, R0.x, 0x00000018 y: ADD_INT R32.y, PV100.w, 1552793326 z: BIT_ALIGN_INT T0.z, R0.y, R0.y, 0x00000018 w: BIT_ALIGN_INT T0.w, R0.z, R0.z, 0x00000018 t: XOR_INT R26.z, 
PV100.z, PV100.y 102 x: ADD_INT ____, R8.z, PV101.y y: ADD_INT ____, R7.x, PV101.y z: ADD_INT ____, R7.w, PV101.y w: ADD_INT ____, R8.y, PV101.y t: OR_INT R28.w, R25.z, 0x98BADCFE 103 x: ADD_INT T2.x, PV102.x, 1518500249 y: ADD_INT T2.y, PV102.y, 1518500249 z: ADD_INT T2.z, PV102.z, 1518500249 w: ADD_INT T3.w, PV102.w, 1518500249 104 x: BIT_ALIGN_INT R123.x, R0.y, R0.y, 0x00000008 y: BIT_ALIGN_INT R123.y, R0.x, R0.x, 0x00000008 z: BIT_ALIGN_INT T1.z, R0.w, R0.w, 0x00000018 w: BIT_ALIGN_INT R123.w, R0.z, R0.z, 0x00000008 t: AND_INT T0.y, PV103.y, R26.z 105 x: BFI_INT R8.x, 0x00FF00FF, T0.z, PV104.x VEC_102 y: BFI_INT R9.y, 0x00FF00FF, T0.x, PV104.y z: BIT_ALIGN_INT R123.z, R0.w, R0.w, 0x00000008 w: BFI_INT R8.w, 0x00FF00FF, T0.w, PV104.w t: AND_INT T0.x, T2.z, R26.z 106 x: BIT_ALIGN_INT T1.x, T2.z, T2.z, 0x0000001B y: BIT_ALIGN_INT T1.y, T2.y, T2.y, 0x0000001B z: BFI_INT R9.z, 0x00FF00FF, T1.z, PV105.z VEC_021 w: BIT_ALIGN_INT T0.w, T2.x, T2.x, 0x0000001B t: AND_INT ____, T2.x, R26.z 107 x: AND_INT ____, T3.w, R26.z y: XOR_INT T0.y, R25.z, T0.y z: BIT_ALIGN_INT R123.z, T3.w, T3.w, 0x0000001B w: XOR_INT T1.w, R25.z, T0.x t: XOR_INT T2.w, R25.z, PS106 108 x: ADD_INT ____, T1.x, R28.w y: ADD_INT ____, T1.y, R28.w z: XOR_INT T0.z, R25.z, PV107.x w: ADD_INT ____, T0.w, R28.w t: ADD_INT ____, PV107.z, R28.w 109 x: ADD_INT ____, R8.x, PV108.x y: ADD_INT ____, R9.y, PV108.y z: ADD_INT ____, R9.z, PS108 w: ADD_INT ____, R8.w, PV108.w 110 x: ADD_INT R1.x, PV109.x, 1790234127 y: ADD_INT R1.y, PV109.y, 1790234127 z: ADD_INT R1.z, PV109.z, 1790234127 w: ADD_INT R0.w, PV109.w, 1790234127 111 x: BIT_ALIGN_INT T1.x, R2.y, R2.y, 0x00000018 y: BIT_ALIGN_INT T1.y, R2.x, R2.x, 0x00000018 z: BIT_ALIGN_INT T1.z, R2.w, R2.w, 0x00000018 w: BIT_ALIGN_INT T0.w, R2.z, R2.z, 0x00000018 112 x: BIT_ALIGN_INT R123.x, R2.y, R2.y, 0x00000008 y: BIT_ALIGN_INT R123.y, R2.x, R2.x, 0x00000008 z: BIT_ALIGN_INT R123.z, R2.w, R2.w, 0x00000008 w: BIT_ALIGN_INT R123.w, R2.z, R2.z, 0x00000008 113 x: BFI_INT 
R9.x, 0x00FF00FF, T1.x, PV112.x y: BFI_INT R10.y, 0x00FF00FF, T1.y, PV112.y z: BFI_INT R10.z, 0x00FF00FF, T1.z, PV112.z w: BFI_INT R9.w, 0x00FF00FF, T0.w, PV112.w 114 x: BIT_ALIGN_INT R123.x, R1.x, R1.x, 0x0000001B y: BIT_ALIGN_INT R123.y, R1.y, R1.y, 0x0000001B z: BIT_ALIGN_INT R123.z, R1.z, R1.z, 0x0000001B w: BIT_ALIGN_INT R123.w, R0.w, R0.w, 0x0000001B 115 x: ADD_INT ____, T1.w, PV114.x y: ADD_INT ____, T0.y, PV114.y z: ADD_INT ____, T0.z, PV114.z w: ADD_INT ____, T2.w, PV114.w VEC_120 116 x: ADD_INT ____, PV115.x, R9.x y: ADD_INT ____, PV115.y, R10.y z: ADD_INT ____, PV115.z, R10.z w: ADD_INT ____, PV115.w, R9.w 117 x: ADD_INT R2.x, PV116.x, -214083945 y: ADD_INT R2.y, PV116.y, -214083945 z: ADD_INT R2.z, PV116.z, -214083945 w: ADD_INT R3.w, PV116.w, -214083945 118 x: BIT_ALIGN_INT R3.x, T2.z, T2.z, 0x00000002 y: BIT_ALIGN_INT R3.y, T2.y, T2.y, 0x00000002 z: BIT_ALIGN_INT R3.z, T3.w, T3.w, 0x00000002 w: BIT_ALIGN_INT R4.w, T2.x, T2.x, 0x00000002 119 x: BIT_ALIGN_INT R0.x, R4.x, R4.x, 0x00000008 y: BIT_ALIGN_INT R0.y, R4.x, R4.x, 0x00000018 z: BIT_ALIGN_INT R0.z, R2.x, R2.x, 0x0000001B VEC_120 w: BIT_ALIGN_INT R2.w, R2.y, R2.y, 0x0000001B t: XOR_INT R1.w, R28.y, PV118.y
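A note on what the BFI_INT lines in the dump above appear to be doing: each one takes the mask 0x00FF00FF plus two BIT_ALIGN_INT results computed from the same register, with shifts of 0x18 and 0x08. Read with the BFI_INT definition quoted at the top of the thread, that pattern works out to a 32-bit byte swap, which would fit SHA-1's big-endian message words. The Python model below is only a sketch to check that reading. The function names are made up for illustration, and `bit_align` assumes BIT_ALIGN_INT behaves as a funnel shift of the 64-bit pair src0:src1 (with both sources equal, as in the dump, that is simply a rotate right).

```python
MASK32 = 0xFFFFFFFF


def bit_align(src0, src1, shift):
    """Assumed model of BIT_ALIGN_INT: low 32 bits of (src0:src1) >> shift."""
    return (((src0 << 32) | src1) >> (shift & 31)) & MASK32


def bfi(mask, a, b):
    """BFI_INT as quoted in the thread: (a & mask) | (b & ~mask)."""
    return ((a & mask) | (b & ~mask)) & MASK32


def bswap32_via_bfi(x):
    """Mirror the three-instruction pattern seen in the ISA dump."""
    t = bit_align(x, x, 0x18)  # rotate right by 24 (same as rotate left by 8)
    p = bit_align(x, x, 0x08)  # rotate right by 8
    return bfi(0x00FF00FF, t, p)


def bswap32_reference(x):
    """Plain byte swap, used only to cross-check the model."""
    return int.from_bytes(x.to_bytes(4, "little"), "big")


if __name__ == "__main__":
    for word in (0xAABBCCDD, 0x67452301, 0x00000080):
        print(hex(word), "->", hex(bswap32_via_bfi(word)))
```

If the model is right, the byte swap costs two rotates plus one BFI_INT per word instead of the usual chain of shifts, ANDs, and ORs.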

                                                                                              • Can I use BFI_INT directly from IL ?
                                                                                                MicahVillmow
                                                                                                hazeman,
                                                                                                I understand the frustration, but the release cycle of catalyst drivers is about three months. So, what you guys are getting soon is what we added 2-3 months ago.
                                                                                                  • Can I use BFI_INT directly from IL ?
                                                                                                    corry

                                                                                                     

                                                                                                    Originally posted by: MicahVillmow hazeman, I understand the frustration, but the release cycle of catalyst drivers is about three months. So, what you guys are getting soon is what we added 2-3 months ago.


                                                                                                     

                                                                                                    Wait wait wait,

                                                                                                     

                                                                                                    Originally posted by: MicahVillmow In 11.10 release, the CAL compiler will expose bfi to the OpenCL compiler and in SDK 2.6, the OpenCL compiler will take advantage of/generate the bfi instruction


                                                                                                    Please don't ruin my day; tell me the first statement you made in this thread is true :)

                                                                                                    Edit:  The first statement chronologically, not how I ordered it here :)  x86 ASM, and GPU IL feeds my dyslexia ok!? :)

                                                                                                    • Can I use BFI_INT directly from IL ?
                                                                                                      hazeman

                                                                                                       

                                                                                                      Originally posted by: MicahVillmow hazeman, I understand the frustration, but the release cycle of catalyst drivers is about three months. So, what you guys are getting soon is what we added 2-3 months ago.


                                                                                                      I don't really have a problem with a 3-month dev cycle. But I do have a problem with the fact that this thread started 12 (!!!) months ago. Now you say that bfi should be available in 11.9, and it isn't. So I have a right to expect that, due to some bug, bfi isn't available at all. And what you are now saying is that I should wait 3 months before you start looking for this bug, and then wait another 3 months before I can recheck whether the bug is really fixed. So instead of 3 months I have to wait half a year. Sorry, but this is a joke to me.

                                                                                                      So please actually test (run some IL kernel with bfi) whether the bfi instruction works with the internal driver version, and cut the time we have to wait for bfi by 3 (or more) months. I think AMD customers deserve as much! Especially since we have already had to wait a year!

                                                                                                       

                                                                                                       

                                                                                                       

                                                                                                    • Can I use BFI_INT directly from IL ?
                                                                                                      MicahVillmow
                                                                                                      hazeman,
                                                                                                      I understand; I reported the issue at the time this thread started. The problem was that the underlying compiler (which OpenCL doesn't control directly) did not expose the instruction until July of this year, even though they added some simple optimizations before that. OpenCL added support for it in August once we got access to the compiler, and then 3 months later it should appear in Catalyst.

                                                                                                      I know this took way too long to get public, but it is coming soon and it has been thoroughly tested.