  • Issues with the RX 5700 XT

    Hello, I bought a new RX 5700 XT GPU last week (Sapphire Nitro+ RX 5700 XT SE) and it seems like OpenCL is broken. It started with persistent crashes when using Blender, and then I discovered that there is some ...
    avielw
    last modified by avielw
  • How to read GDS using PM4 packets in Vega 10?

    Hi, I recently did some experiments with the global data share (GDS). I wrote a kernel that simply writes some contents into GDS starting from address 0 using ds_write instructions. The question is how to read the cont...
    harry1
    last modified by harry1
  • Wrong OpenCL calculation result on AMD 5700 XT

    Good day! Our company uses the OpenCL framework to work with AMD GPUs. Unfortunately, the OpenCL driver for the AMD 5700 XT GPU gives wrong calculation results. This applies to all GPU drivers I have tested so fa...
    Neverhood
    last modified by Neverhood
  • Kernel crashes if loop cycles too high in too many items

    My items run a variable number of loop cycles, from 1 to X, and when X is relatively low, in the ballpark of 16k, the entire work completes successfully. When X grows to 65k, and the percentage of items that reach this...
    infovel
    last modified by infovel
  • How to enable cl_ext_atomic_counters_32 on Vega 56

    Hi, I wanted to use cl_ext_atomic_counters_32 on Vega 56. I found that it needs the legacy driver: "Is There a Way to Access Global Data Share (GDS) on Ellesmere (RX 480)?" I have tried on Ubuntu 18.04 using...
    harry1
    last modified by harry1
  • Radeon HD5850 vs 7770, 7850, 6530(APU), or 6450

    A while ago, I developed a kernel that runs fine on the 5850 card. When I had an opportunity to test it on the newer 7770 and 7850 cards, it did not work. The result was wrong; it messed up the Mandelbrot set, even t...
    infovel
    last modified by infovel
  • Can scalar and vector operations run in parallel in GCN?

    Well, the question already tells most of what I would like to know. Since the vector and scalar ALUs are different pieces of hardware, I wondered if the two ALUs can be active in parallel when the instructions running have n...
    lolliedieb
    last modified by lolliedieb
  • How to force placement of memory buffer in Windows?

    Hello, 2nd question today, but more or less unrelated to the first one. I have an application that needs to use almost all of the memory that the GPU offers (up to 5 MByte). The driver reports (via CL_DEVIC...
    lolliedieb
    last modified by lolliedieb
  • CNN DarkNet on OpenCL

    Hello, I recently made DarkNet run on OpenCL, a technology I am really passionate about, and I recently started PhD studies in the AI field. I am using a few different GPUs, recently 2 x AMD Radeon VII, which work very well...
    sowson
    last modified by sowson
  • OpenCL compiler bug

    I've been working on adding OpenCL support to our code generator (GitHub - genn-team/genn: GeNN is a GPU-enhanced Neuronal Network simulation environment based on code generation for Nvi… ) and the ge...
    neworderofjamie@gmail.com
    last modified by neworderofjamie@gmail.com
  • V1000 OpenCL

    I have tried about 10 ways to get OpenCL working on a V1000 APU (V1404i to be exact) and can't get anything working. Worse, the documents that come with the AMD drivers on their website haven't changed as the kernel and ...
    tsjorda
    last modified by tsjorda
  • AMDGPU-Pro OpenCL image3d_t bug

    Hey, when reading from an image3d_t (base type being short) with any sampler (any combination of allowed values), accessing the position using the overloaded read_imagei(image3d_t, sampler_t, float4), odd thing...
    somefoo
    last modified by somefoo
  • My application gets killed silently when using OpenCL on RX5700XT and RX5500XT

    Hi,   I'm the developer of PTGui (www.ptgui.com). In the past week I've received 3 reports from users that my application randomly quits without any notice (Windows does not show a crash notification). All ...
    joostn
    last modified by joostn
  • OpenCL runtime bug when sharing texture with D3D11

    I have a program that initializes 2 Direct3D11 textures. It then extracts 2 cl_mem image objects from each texture and downloads its content to RAM. When using a DirectX11 device created via the new D3D11On12 API ...
    elad
    last modified by elad
  • OpenCL 2.0 compiler bug? (device side enqueue)

    A similar issue is reported here.   I compile a kernel (kernel1) that performs device-side enqueue to another kernel (kernel2). When kernel2 is empty, or contains little code, there is no problem.    ...
    elad
    last modified by elad
  • Looking for OpenCL Linux driver for Threadripper 3990X

    My Threadripper 3990X-based servers (10 nodes) have arrived and I am looking for the OpenCL driver to run some benchmarks. The servers run Ubuntu 20.04 with NVIDIA GPU drivers. I googled and could not find a...
    FangQ
    last modified by FangQ
  • Bug in OpenCL runtime

    OpenCL runtimes for Windows x64, at least from the 15.7.1 drivers onward, return garbage when queried via clGetKernelWorkGroupInfo with CL_KERNEL_PRIVATE_MEM_SIZE. If there are some spilled registers, then it returns their size i...
    melirius
    last modified by melirius
  • Newcomer - Can I Get Whitelisted for OpenCL Forum?

    Hello AMD!   I'm having a problem where my new Radeon VII is not being detected by clinfo for OpenCL/compute jobs, while my RX 580 still is.   A helpful user replied and let me know I should probably ask t...
    makeitwork
    last modified by makeitwork
  • Replacement of VOP2 versions of v_add / v_addc instructions on Navi

    Hello, currently I am trying to improve my kernels by inserting assembly code: for Vega GPUs by using clrxasm or testing them on ROCm with inline assembly, and for Navi I am testing my code by just inline asm...
    lolliedieb
    last modified by lolliedieb
  • AMD's OCL global work size with 2d work dimensions limit

    I have found a problem when executing a kernel: the second dimension of the work units, get_global_id(1), gets limited to around 120, while the first dimension can execute every unit until the max set on the first di...
    pontiacgtx
    last modified by pontiacgtx