  • OpenCL development documentation on AMD GPUs

    Is there a publicly available list of all AMD GPUs supporting OpenCL which includes: product name ('AMD Radeon RX Vega 64'), internal name ('gfx900', can be obtained as CL_DEVICE_NAME), architecture ('GCN gen 5'), ar...
    timchist
    last modified by timchist
  • Heterogeneous toolchain for Windows?

    Good day,   I am currently running Windows with OpenCL kernels across CPU (2990WX) and AMD GPU with C++17.   Since AMD has stopped supporting OpenCL on the CPU, how can I adapt my toolchain to still leverage the CPU ...
    genestoltz
    last modified by genestoltz
  • AMD GPU OpenCL gives wrong results while Nvidia is correct

    Recently, I translated CPU code into OpenCL, and it has been debugged and tested (using a GTX1060). The computation is iterative. The results are presented in the form of re...
    huzhiyuan1994
    last modified by huzhiyuan1994
  • clGetDeviceIDs suddenly very slow

    We are currently developing an OpenCL application on Windows 10 (Visual Studio 2017) but have noticed that the OpenCL performance has recently degraded, with the call to clGetDeviceIDs now taking around 10 second...
    andyste1
    last modified by andyste1
  • How to abort clEnqueueWaitSignalAmd?

    We're developing software that uses a PCI data acquisition card to read blocks of data (records) from an external instrument. These records are transferred to a Radeon Pro WX7100 using "DirectGma", where a kernel proc...
    andyste1
    last modified by andyste1
  • OpenCL occupancy-performance nightmare

    These days I tried to squeeze some performance from a memory-intensive OCL kernel and went for GCN assembly. Saved a few registers here, few instructions there, got a nice occupancy and thought to have a perfect kerne...
    kbala
    last modified by kbala
  • Optimize LC0 - Leela Chess Zero - for AMD GPUs

    Heyho AMD community,   we are all aware of the neural network hype on GPUs, and most have noticed that Nvidia simply has the upper hand with its cuDNN framework.   Personally I am convinced that AMD mak...
    smato2018
    created by smato2018
  • Radeon vii and fft

    Hello, is there by any chance a recommended OpenCL FFT package for the Radeon VII? clFFT was written for previous generations of cards.
    dns.on.gpu
    last modified by dns.on.gpu
  • GPUs: pick-n-mix

    Hello.   Is it possible to use OpenCL with 2 or more different GPUs under Linux? I am interested in mixing two Radeon VII cards with two 280X and even one or two 7950.
    dns.on.gpu
    last modified by dns.on.gpu
  • What's the best or the recommended way to copy the data from scalar registers to GDS?

    Perhaps, there's something that I'm not seeing in the docs, so I apologize in advance.   I've got 16 dwords in scalar registers s16-s31. I need to copy that data from the scalar registers to GDS at the GDS base ...
    sp314
    last modified by sp314
  • Getting stuck in a loop: is a local variable not visible to other work-items in a work-group?

    This is my kernel code: __kernel void test(__global int *input_vector, __global atomic_int *mem_flag) { local int d[32]; if (get_local_id(0) == 0) { ...
    avinashkrc
    last modified by avinashkrc
  • clEnqueueAcquireD3D11ObjectsKHR blocks for a long time

    In my application, I have a processing thread that enqueues an OpenCL kernel that writes to an ID3D11Texture2D object.   Everything works fine in terms of correctness. I can successfully acquire the shared O...
    elad
    last modified by elad
  • I am trying to test out how well atomicity performs on an APU, but my sample program hangs the system

    I am trying to test out how well atomicity performs on an APU, but my sample program does not update the variable properly, so the whole system hangs as I check for the updated value on either side (CPU and GPU) in while ...
    avinashkrc
    last modified by avinashkrc
  • OpenCL 64-bit atomics under Vega 8 Integrated Graphics on Win10?

    I am working to compile an OpenCL program which needs 64-bit atomics (atomic_xchg and atomic_add, with long datatype). I have added "#pragma OPENCL EXTENSION cl_khr_int64_base_atomics : enable" and the ...
    glupescu
    last modified by glupescu
  • Need tips to hide memory latency - 20x speed-loss when writing to memory

    I have an OpenCL code that does Monte Carlo photon transport simulations in a voxelated space (https://github.com/fangq/mcxcl). The code involves simulating a large number of random photon trajectories, each in a thre...
    FangQ
    last modified by FangQ
  • host-device latencies?

    I have recently been running some benchmarks and wonder whether my host-device latencies are due to my older hardware or are similar on newer systems.   OS: Ubuntu 18.04 x86-64 Device: AMD Radeon HD 7750   OpenCL gpu kerne...
    smato2018
    last modified by smato2018
  • Error code -2 (Device not available) when running clCreateContextFromType

    Hello Everyone,   I'm currently retesting some OpenCL code and I recently had a problem with my code. When I'm trying to get the device list on my computer with the C++ Wrapper function ... I get an error...
    fyfy
    last modified by fyfy
  • Running OpenCL Work Groups with >256 Elements

    Hi all,   I am currently re-writing some OpenCL code of mine and would like to split the work of the group across more waves in order to have more waves in flight. The code is OpenCL 1.2 code (because it needs to ...
    lolliedieb
    last modified by lolliedieb
  • OpenCL: Delay in inter-kernel execution when requesting callbacks

    Hi, I have a problem with delays in kernel execution when I request callbacks from OpenCL. In my application, I need to execute kernels at a "very" high rate (around 300Hz), and I need a callback to my host applicati...
    nfogh
    last modified by nfogh
  • Kernel runs slower for local workgroup size greater than 64

    Hi bros, I'm a CS undergraduate student and I recently wrote a GPU path tracer using OpenCL. If you don't know what path tracing is, it's basically a method to generate photorealistic images by shooting rays through every...
    gallickgunner
    last modified by gallickgunner