• Experimental OpenCL driver for Ubuntu 16.04

    On Ubuntu 16.04, fglrx support was dropped, and the open-source AMDGPU driver is still at an early stage and supports only a limited set of devices. I attempted to get the OpenCL part of fglrx working and made a ...
    victzhang
    last modified by victzhang
  • Bug in OpenCL compiler

    I found a bug in the OpenCL compiler in the latest drivers. It is present in at least Adrenalin 19.5.2 and 19.8.1. A minimal reproducing example is included in the file. It just multiplies several complex numbers in a lo...
    melirius
    last modified by melirius
  • Strange printf behaviour on Vega

    Tested on the latest 19.10.1 drivers, Windows 10 x64 1903. I attached a .cl file and a C++ program that launches this simple addVec kernel. OpenCL code: #pragma OPENCL EXTENSION cl_amd_printf : enable __attribute__((req...
    ___
    last modified by ___
  • OpenCL 8 GPU DGEMM (5.1 TFlop/s double precision). Heterogeneous HPL (High Performance Linpack from Top500).

    Pavel Bogdanov, Institute of System Research Russian Academy of Sciences (NIISI), bogdanov@niisi.msk.ru   INTRO   Nowadays heterogeneous computing is becoming more and more popular. In November 2011 three of to...
    antonyef
    last modified by antonyef
  • OpenCL occupancy-performance nightmare

    Lately I have been trying to squeeze some performance out of a memory-intensive OCL kernel and went for GCN assembly. Saved a few registers here, a few instructions there, got nice occupancy, and thought I had a perfect kerne...
    kbala
    last modified by kbala
  • Kernel execution time discrepancy

    I have a kernel that executes a few times per second. There are 2 anomalies that I can't figure out.   1 (less important). Every 3-4 seconds the gap between the end of the kernel execution and the start of the n...
    kbala
    last modified by kbala
  • Optimize LC0 - Leela Chess Zero - for AMD GPUs

    Heyho AMD community,   we are all aware of the neural network hype on GPUs, and most have noticed that Nvidia simply has the upper hand with its cuDNN framework.   Personally I am convinced that AMD mak...
    smato2018
    created by smato2018
  • Newbie introduction and seeking guidance.

    Hello, my name is Ernst. I have a passion for binary-level data encoding. I work on a workstation running an AMD 9590, and I recently decided to upgrade my NVIDIA GPU to twin Radeon Pro WX 5100s. I am...
    ernst0
    last modified by ernst0
  • Need tips to hide memory latency - 20x speed-loss when writing to memory

    I have an OpenCL code that does Monte Carlo photon transport simulations in a voxelated space (https://github.com/fangq/mcxcl). The code involves simulating a large number of random photon trajectories, each in a thre...
    FangQ
    last modified by FangQ
  • Running OpenCL Work Groups with >256 Elements

    Hi all,   I am currently rewriting some OpenCL code of mine and would like to split the work of the group across more waves in order to have more waves in flight. The code is OpenCL 1.2 code (because it needs to ...
    lolliedieb
    last modified by lolliedieb
  • OpenGL interop - wrong device chosen but works

    Hello. I have quite old hardware in my laptop with switchable graphics - Intel HD 4000 and Radeon HD 7670M. Switchable graphics works but behaves unexpectedly. I have the following code to choose the OpenCL device for textu...
    omega_doom
    last modified by omega_doom
  • OpenCL compilation hangs forever

    Hi all,   I am trying to compile this project for an AMD GPU: GitHub - webmaster128/lisk-vanity: A tool to generate short Lisk addresses with GPU support   The .cl files are in lisk-vanity/src/opencl at m...
    webmaster128
    last modified by webmaster128
  • Kernel runs slower for local workgroup size greater than 64

    Hi bros, I'm a CS undergraduate student and I recently wrote a GPU path tracer using OpenCL. If you don't know what path tracing is, it's basically a method to generate photorealistic images by shooting rays through every...
    gallickgunner
    last modified by gallickgunner
  • Processing two buffers using an out of order queue

    I have a PCI data acquisition card that supports P2P. It will be capturing records one after the other at a very rapid rate, and the plan is to write each record to the GPU using DirectGMA, where a kernel will process...
    andyste1
    last modified by andyste1
  • Real-time ray tracing with OpenCL

    Is it possible to achieve real-time ray tracing like RTX with OpenCL?
    cyseal
    created by cyseal
  • List of neural network/machine learning/GPU computing apps that support OpenCL acceleration on AMD FX hardware?

    Hi, I have a few questions. I hope you can help me.   I am trying to learn neural nets/ML on my older, FX-based hardware.   I very much prefer the OpenCL development model. As discussed elsewhere, people ...
    devlista
    last modified by devlista
  • Modern GPU: book-length tutorial and open-source GPGPU library

    Fastest GPU radix sort and scan-centric tutorial. Hi all. I've been putting together a big book-length online GPU computing tutorial. It's at http://www.moderngpu.com/ The content is very scan/reduction-centri...
    BarnacleJunior
    last modified by BarnacleJunior
  • Please add new extension for refined reduce in wavefront

    Hi,   According to https://gpuopen.com/amd-gcn-assembly-cross-lane-operations/  the hardware is able to do refined reduce operations.   By 'refined', I have in mind doing an add/min/max among neighbor...
    mannerov
    last modified by mannerov
  • Strange behavior of a kernel, need fresh ideas

    I lost two days debugging, or rather trying to debug, my kernel. Basically the kernel looks like this (part of dagger-hashimoto initialization):   1. copy from global to private 2. do private 3. copy from ...
    kbala
    last modified by kbala
  • Missing OpenCL CPU support under Windows

    On my system, what I think is the most recent version of the AMD drivers (18.8.1, Windows 10 x64) no longer returns the CPU (FX-8350) as a valid OpenCL device.  Is this intended behavior or just a bug in my specific ins...
    pangea
    last modified by pangea