  • Modern GPU: book-length tutorial and open-source GPGPU library

    Fastest GPU radix sort and scan-centric tutorial. Hi all. I've been putting together a big book-length online GPU computing tutorial. It's at http://www.moderngpu.com/ The content is very scan/reduction-centri...
    BarnacleJunior
    last modified by BarnacleJunior
  • Please add new extension for refined reduce in wavefront

    Hi,   According to https://gpuopen.com/amd-gcn-assembly-cross-lane-operations/  the hardware is able to do refined reduce operations.   By 'refined', I have in mind doing an add/min/max among neighbor...
    mannerov
    last modified by mannerov
  • Strange behavior of a kernel, need fresh ideas

    I lost two days debugging, or rather trying to debug, my kernel. Basically the kernel looks like this (part of dagger-hashimoto initialization):   1. copy from global to private 2. do private 3. copy from ...
    kbala
    last modified by kbala
  • Missing OpenCL CPU support under Windows

    On my system the (I think) most recent version of the AMD drivers (18.8.1, Windows 10 x64) no longer returns the CPU (FX-8350) as a valid OpenCL device.  Is this intended behavior or just a bug in my specific ins...
    pangea
    last modified by pangea
  • How to tune the performance of the ROCm (LLVM) compiler?

    I modified LLVM (roc-1.6.x) a bit to generate code that can run on the AMDGPU-PRO driver. It runs, but performance is over 10% slower than with AMDGPU's online compiler for the same OpenCL code.  I wonder if ther...
    fancyix
    last modified by fancyix
  • Disappointing OpenCL half-precision performance on Vega - any advice?

    I bought a Vega 64 recently. According to the specs, it has 23 TFLOPS of fp16 throughput compared to 12 TFLOPS of fp32, so I converted a portion of my Monte Carlo code to half, expecting a noticeable speedup. Disappointing...
    FangQ
    last modified by FangQ
  • How to control the OpenCL kernel configuration of assembly code generated by the LLVM Clang AMDGPU backend?

    LLVM Clang can compile an OpenCL file into assembly. A common format is HSA. There are certain configurations in an HSA assembly file, such as enable_sgpr_dispatch_ptr and enable_sgpr_queue_ptr. When I compile my opencl fil...
    fancyix
    last modified by fancyix
  • How to use LLVM to offline-compile a .cl file into a binary that can run with the AMDGPU-PRO driver?

    I am trying to build an OpenCL kernel binary with LLVM. I successfully compiled the .cl file into assembly, but cannot figure out how to compile that assembly format into a binary that can run with the AMDGPU-PRO driver. That ...
    fancyix
    last modified by fancyix
  • Can OpenCL build a program from .cl source code together with a pre-built binary kernel?

    My program has several kernels. I'd like to use an offline compiler to compile one kernel into a binary. How can I build my program from the other kernels and that one pre-built kernel binary?
    fancyix
    created by fancyix
  • How to compile .cl file that contains inline assembly for GCN cards?

    There are some examples of inline assembly inside .cl file: LLVM-AMDGPU-Assembler-Extra/s_memrealtime_inline.cl at master · ROCm-Developer-Tools/LLVM-AMDGPU-Assembler-Extra · GitHu… gatelessgate/equ...
    fancyix
    last modified by fancyix
  • OpenCL linker hangs & terminates application on R9 200

    After shipping our application, some users with AMD R9 200 series cards report that the application hangs and then quits. After studying log files and minidumps, it seems the issue is with the OpenCL linker on those syst...
    george72
    last modified by george72
  • floating point precision

    Hi, In my current kernel I use floating point values, but I think I have precision problems... for example, I use an 'epsilon' defined as #define EPSILON 1e-4 and work with values like 500.f etc.... I'm not sure how...
    spectral
    last modified by spectral
  • Three cheers for the anonymous AMD engineer who fixed an OpenCL driver bug

    My app makes very heavy use of OpenCL events. For the longest time, the app was unstable - seemingly there was a race condition where event callbacks sometimes would not get called, causing my app to stop working. ...
    boxerab
    last modified by boxerab
  • Legacy OpenCL analysis not working with v18 drivers on Ubuntu

    Hi,   With any v18 amdgpu-pro driver, it is not possible to analyze kernels for non-Vega ASICs.   The error is always "Error: failed to disassemble binary output and produce textual ISA code for Hawaii (ke...
    greenstheorem
    last modified by greenstheorem
  • OpenCL compiler bug with big switches

    Hi. Recently I found an issue with the AMD OpenCL compiler. I have OpenCL code that is generated on the CPU and results in a big switch-case construct (about 4k cases). Attempting to compile such a kernel with the AMD compiler leads to...
    kvalme
    last modified by kvalme
  • When will the AMDGPU-PRO driver support the HD7900 series?

    My card is an HD7990. The latest amdgpu-pro driver I have, 17.50 on Linux, doesn't support it and reports a lot of errors for all TensorFlow tests. Also, the famous hello world OpenCL program doesn't work.   To ...
    karlcauchy
    last modified by karlcauchy
  • OpenCL & Linux on Raven Ridge

    Hello, This is more of a driver feature request, but it is OpenCL specific. A month ago I purchased a Raven Ridge APU (Ryzen 2400G) system and I'm eager to use OpenCL on its integrated Vega GPU. Unfortunately, it s...
    ekondis
    last modified by ekondis
  • OpenCL SDK for AMD EPYC and Hawaii architecture GPUs

    I am trying to set up an OpenCL environment for the following hardware and operating system. CPU : AMD EPYC 7551P GPU : AMD FirePro S9150 (Hawaii) OS : CentOS 7.4   I've successfully installed AMDGPU-PRO for th...
    ep-98d
    last modified by ep-98d
  • Various CL faults with Vega on Windows...

    I have a 290X and a Vega 64 in the same system. The Vega 64 is the primary card; however, in the context of OpenCL it's device '1', with the 290X being '0'. This is all running Windows 10 Pro 64 with the latest 18.3.4 drivers...
    paul17041993
    last modified by paul17041993
  • Bug in AMD driver

    Hello community,   First, about me: I'm a student from Germany. I am studying Computational Engineering and I write GPU-accelerated numerical software.   With the current AMD Radeon 18.2.1 drivers, there ...
    robin.christ@gmx.de
    last modified by robin.christ@gmx.de