  • OpenCL: "AMD HSA Code Object loading failed" in clBuildProgram on AMD Radeon RX 5700 XT

    Hi,   Developer of PTGui here (www.ptgui.com).   One of the users of my software has reported an error while building OpenCL kernels on a Radeon RX 5700 XT on Windows 10. clBuildProgram returns "AMD HSA Co...
    joostn
    last modified by joostn
  • Bug in OpenCL compiler

    I found a bug in the OpenCL compiler in the latest drivers. It is present in at least Adrenalin 19.5.2 and 19.8.1. A minimal reproducing example is included in the file. It just multiplies several complex numbers in a lo...
    melirius
    last modified by melirius
  • Strange printf behaviour on Vega

    Tested on the latest 19.10.1 drivers, Windows 10 x64 1903. I attached a .cl file and a C++ program that launches this simple addVec kernel. OpenCL code: #pragma OPENCL EXTENSION cl_amd_printf : enable __attribute__((req...
    ___
    last modified by ___
  • Offline compilation for gfx1010 crashes

    When I try to compile any OpenCL source for gfx1010, the test application crashes in one of the AMD driver DLLs. Tested with Adrenalin 19.7.1 and 19.7.3 on Windows 10 and Windows 7.   That's a crash report I am...
    timchist
    last modified by timchist
  • OpenCL occupancy-performance nightmare

    Recently I tried to squeeze some performance out of a memory-intensive OCL kernel and went for GCN assembly. Saved a few registers here, a few instructions there, got a nice occupancy and thought I had a perfect kerne...
    kbala
    last modified by kbala
  • How to compile offline with LLVM-8 for AMDPAL

    Dear community, I am planning to ship software using binary kernels with inline asm. Therefore I decided to go with LLVM-based offline compilation, since the built-in PAL compiler cannot handle this. Hereby it i...
    lolliedieb
    last modified by lolliedieb
  • I am trying to test out how well atomicity performs on an APU, but my sample program hangs the system

    I am trying to test out how well atomicity performs on an APU. But my sample program does not update the variable properly, hence the whole system hangs as I check for the updated value on either side (CPU and GPU) in while ...
    avinashkrc
    last modified by avinashkrc
  • OpenCL compiler SIGSEGV'ing

    Hello! So I've been writing my OpenCL project until I ran into a strange problem: a call to clBuildProgram results in a segmentation fault. I'm providing sample code here (see attached archive) that...
    be_dos
    last modified by be_dos
  • Strategies on reducing VGPR usage - and, where do they come from?

    Aside from a detrimental memory latency issue I reported in this thread, I also noticed that my OpenCL code on AMD GPUs suffered from large VGPR usage.   For the voxel-based Monte Carlo simulator, MCXCL (https:/...
    FangQ
    last modified by FangQ
  • OpenCL driver bug

    EDIT: reformat  EDIT 2: correct driver version   Found a weird behavior in AMD's OpenCL compiler. Code taken straight from Boost library:   __kernel void serial_adjacent_find(const uint size, __global...
    rosenrodt
    last modified by rosenrodt
  • OpenCL compilation hangs forever

    Hi all, I am trying to compile this project for an AMD GPU: GitHub - webmaster128/lisk-vanity: A tool to generate short Lisk addresses with GPU support. The .cl files are in lisk-vanity/src/opencl at m...
    webmaster128
    last modified by webmaster128
  • SPIR support in new drivers lost

    I already asked this question in the Drivers & Software section, but nobody answered. ...
    ipse
    last modified by ipse
  • OpenCL amdgpu-pro generated code performance - please convert 'select' to cndmask

    Hi, I don't know if this is the best place to report OpenCL compiler performance issues, but I didn't find a better one. SUMMARY: Please AMD devs, when an OpenCL dev takes the time to expl...
    mannerov
    last modified by mannerov
  • Store array in regs?

    If I declare an array like uint[128], the driver will spill it even if there are enough registers to hold it. Is there any way to make the compiler keep a big array in registers? Maybe some compile option?
    fancyix
    last modified by fancyix
  • List of neural network/machine learning/GPU computing apps that support OpenCL acceleration on AMD FX HW?

    Hi, I have a few questions. I hope you can help me. I am trying to learn neural nets/ML on my older, FX-based hardware. I very much prefer the OpenCL development model. As discussed elsewhere, people ...
    devlista
    last modified by devlista
  • Any way to avoid using too many VGPRs?

    Is there anything like CUDA's "register" keyword, hinting the compiler to store the value of one variable in one register instead of using many registers for its temporary values? I tried "volatile" but sometim...
    fancyix
    last modified by fancyix
  • Please add new extension for refined reduce in wavefront

    Hi,   According to https://gpuopen.com/amd-gcn-assembly-cross-lane-operations/  the hardware is able to do refined reduce operations.   By 'refined', I have in mind doing an add/min/max among neighbor...
    mannerov
    last modified by mannerov
  • Strange behavior of a kernel, need fresh ideas

    I lost two days debugging, or rather trying to debug, my kernel. Basically the kernel looks like this (part of dagger-hashimoto initialization): 1. copy from global to private 2. do private 3. copy from ...
    kbala
    last modified by kbala
  • Missing OpenCL CPU support under Windows

    On my system, what I think is the most recent version of the AMD drivers (18.8.1, Windows 10 x64) no longer returns the CPU (FX-8350) as a valid OpenCL device. Is this intended behavior or just a bug in my specific ins...
    pangea
    last modified by pangea
  • How to tune the performance of ROCm(llvm) compiler?

    I modified LLVM (roc-1.6.x) a bit to generate code that can run on the AMDGPU-PRO driver. It runs, but the performance is over 10% slower than with AMDGPU's online compiler for the same OpenCL code. I wonder if ther...
    fancyix
    last modified by fancyix