  • Missing lock step behaviour of Navi GPUs?

    Hey there, I have code that needs to share data among threads in blocks of 4, so thread i needs to access values from threads (i & 0xFC) + 0 ... (i & 0xFC) + 3. When writing such code in GCN...
    lolliedieb
    last modified by lolliedieb
  • The OpenCL General Tuning Issue

    All OpenCL versions from all vendors have this issue. It is a wrong computation. Please take a look at my blog describing it in detail. Can this be fixed in AMD OpenCL somehow? https://iblog.isowa.io/2020/01/04...
    sowson
    last modified by sowson
  • Is there a way to combine the OpenCL engine from an older driver pack with a newer one?

    As I found out, the newer OpenCL compiler in the Adrenalin drivers for Win10 x64 has an implementation bug that prevents my code from working correctly on Tahiti cards. I then determined that the old 16.4.2 driver pack was without ...
    melirius
    last modified by melirius
  • Wrong OpenCL calculation result on AMD 5700 XT

    Good day! Our company uses the OpenCL framework to work with AMD GPUs. Unfortunately, the OpenCL driver for the AMD 5700 XT GPU gives wrong calculation results. This applies to all GPU drivers I have tested so fa...
    Neverhood
    last modified by Neverhood
  • Strange printf behaviour on Vega

    Tested on the latest 19.10.1 drivers, Windows 10 x64 1903. I attached a cl file and a cpp program that launches this simple addVec kernel. OpenCL code: #pragma OPENCL EXTENSION cl_amd_printf : enable __attribute__((req...
    ___
    last modified by ___
  • Bug in OpenCL compiler

    I finally made a minimal reproducing example of a bug in the OpenCL compiler for Tahiti in the Adrenalin Win10 x64 drivers (tested on two workstations with the 19.12.2, 20.1.1 and 20.5.1 drivers with -O0 and -O5). Kernel is atta...
    melirius
    last modified by melirius
  • OpenCL 2.0 Compiler Bug?

    Hello, in my OpenCL kernel I'm using the "async_work_group_copy" function to copy data from global to local memory. However, whenever I use the "wait_group_events" function in the kernel, and I compile with OpenCL 2....
    jadr
    last modified by jadr
  • Is there an elegant way to force recalculation (of values or addresses)

    Well, the question in the title already says it. I have a rather simple kernel, which uses 20 VGPRs and the complete 32 kByte of shared memory. So all is fine for running 2x 1024 threads per work group. So far so good. Bu...
    lolliedieb
    last modified by lolliedieb
  • Pull Request I made for the clBLAS

    Hello, I have a question about the pull request I made for clBLAS... I have been waiting quite a long time for it to be accepted... and I wonder if someone can check it? It is at... I would really appreciate that. ...
    sowson
    created by sowson
  • clBuildProgram prints warnings when compiling for RDNA

    I am using a Radeon Pro W5700 to run kernels produced by the clfft library. When clfft compiles its kernels, it seems that calling clBuildProgram prints unspecified warnings to the console output: "1 warning...
    elad
    last modified by elad
  • OpenCL 2.0 compiler bug? (device side enqueue)

    A similar issue is reported here. I compile a kernel (kernel1) that performs device-side enqueue to another kernel (kernel2). When kernel2 is empty, or contains little code, there is no problem. ...
    elad
    last modified by elad
  • Khronos Group releases OpenCL 3.0: will AMD implement it?

    You can find more information at "Khronos Group Releases OpenCL 3.0 - The Khronos Group Inc". Thanks!
    sowson
    created by sowson
  • Why is the EC calculation NOT correct?

    Dear all, I am trying to port OpenCL source code to an older AMD GPU card: RX 570 (4 GB). The source code works correctly on Nvidia cards, but it fails on the RX 570 card. ...
    block.lee
    last modified by block.lee
  • Parameter passing for pipes in nested loop (deviceEnqueue)

    I'm trying to implement G-DBSCAN on the Qualcomm mobile platform (845/865). For the BFS part I just referenced the DeviceEnqueueBFS sample code in OpenCL SDK 3.0. At first: on the Qualcomm mobile GPU platform (84...
    youngerliu
    last modified by youngerliu
  • OpenCL Shader compiler had memory allocation problem

    I'm trying to compile a rather large kernel and am given the following error after 20-40 sec of kernel compile time, both in the runtime and under CodeXL: Shader compiler had memory allocation problem...
    glupescu
    last modified by glupescu
  • OpenCL compiler crash

    Hi everyone. When I try to compile an OpenCL kernel on my notebook with a Radeon HD 7340 GPU, I get a segfault inside the clBuildProgram function. If I comment out the line b += (val << r[i]) | (val >> (...
    eltio
    last modified by eltio
  • How to access more than 32k byte shared memory on Vega & Navi using Windows?

    Hi all. Well, the title already describes it. I have code using 64k LDS on a Radeon VII and an RX 5700. Work group size is 1024. It's working fine on Ubuntu 16.04 and 18.04 using amdgpu-pro 18.50, 19.30 and ROCm ...
    lolliedieb
    last modified by lolliedieb
  • Offline compilation for gfx1010 crashes

    When I try to compile any OpenCL source for gfx1010, the test application crashes in one of the AMD driver DLLs. Tested with Adrenalin 19.7.1 and 19.7.3 on Windows 10 and Windows 7. That's a crash report I am...
    timchist
    last modified by timchist
  • OpenCL: "AMD HSA Code Object loading failed" in clBuildProgram on AMD Radeon RX 5700 XT

    Hi, Developer of PTGui here (www.ptgui.com). One of the users of my software has reported an error while building OpenCL kernels on a Radeon RX 5700 XT on Windows 10. clBuildProgram returns "AMD HSA Co...
    joostn
    last modified by joostn
  • Bug in OpenCL compiler

    I found a bug in the OpenCL compiler in the latest drivers. It is present at least in Adrenalin 19.5.2 and 19.8.1. A minimal reproducing example is included in the file. It just multiplies several complex numbers in a lo...
    melirius
    last modified by melirius