  • AMD APP SDK v3.0 Windows x64 Install Fails

    I've had zero success installing this version of the SDK on any system I've tried. Two different Win 7 machines and a Win 10 box all produce the same error. I referenced this post and have tried both the F...
    jjones12
    last modified by jjones12
  • Benchmarking float64 matrix multiplication performance

    My primary interest in GPUs is for "scientific computing", or more precisely, float64 general matrix multiplication, also known as DGEMM. This is the speed-determining factor in my applications - if DGEMM ru...
    drnil
    created by drnil
  • OpenCL CL_DEVICE_SIMD_PER_COMPUTE_UNIT_AMD returns 0 in amdgpu-pro 17.70

    My OpenCL code needs the stream processor count to estimate a default workgroup/workitem configuration. Using CL_DEVICE_MAX_COMPUTE_UNITS and CL_DEVICE_SIMD_PER_COMPUTE_UNIT_AMD, I was able to estimate the co...
    FangQ
    last modified by FangQ
  • amdgpu-pro DRM (DKMS modules) driver internal questions for debug purpose

    Hello, First, about me: I'm an electronics engineer who is currently interested in software programming, especially OpenCL. The open-source driver unfortunately doesn't support OpenCL, so I need the proprietary one. But...
    krjdev
    last modified by krjdev
  • Floating-point atomic add, is this supported by OpenCL now?

    Floating-point atomic operations are not supported, at least in OpenCL 1.2 or earlier, so I have to use the atomic_xchg hack other people proposed. See https://github.com/fangq/mcxcl/blob/master/src/mcx_core.cl#L...
    FangQ
    last modified by FangQ
  • How to run OpenCL on old (and newer) AMD GPUs under Ubuntu Linux

    I'm writing this because I believe there is general interest in this subject. I'm not an expert, so errors and mistakes in this post are highly likely. Corrections and improvements are gratefully accepted. Credits for...
    drnil
    created by drnil
  • Beginning OpenCL Developer with APP SDK Installation Issue

    Hi everyone. I just started working with OpenCL a few weeks ago using pyopencl on a laptop at work with an Intel CPU and Nvidia GPU, and am now trying to get it to work on my home machine with an AMD FX 8150 CPU an...
    bmbachman
    last modified by bmbachman
  • clBuildProgram failure

    clBuildProgram fails to compile some SPIR binaries using the following option: " -x spir -spir-std=1.2". The Intel OpenCL platform runs this binary flawlessly. Old AMD drivers (prior to v17.7 or so) worked fine f...
    ivan
    last modified by ivan
  • Does Vega expose its graphics caches (L1) as __constant or __local for OpenCL? How can it be used for higher L1 capacity (and bandwidth)?

    I don't have a Vega, but knowing this would help me optimize my OpenCL programs for it (for Vega users). Thank you for your time.
    tugrul_512bit
    last modified by tugrul_512bit
  • OpenCL -O2 makes program incorrect: Help!

    Hello AMD Forums,   This is my first post and I'm also new to OpenCL. Hopefully my mistake here isn't too "n00b". At the core, my program functions perfectly well at -O0 optimization level, but as soon as I go t...
    dragontamer
    last modified by dragontamer
  • OpenCL Multi GPUs

    Dear all, I'm looking for an example of multi-GPU handling. I see the SimpleMultiDevice example showing how to create multiple command queues on a single context and where the input data is split into two halves with ...
    peter_cz
    last modified by peter_cz
  • Happy Holidays to RTG team !!

    Thanks for all of your hard work this year, and looking forward to an even more awesome 2017.   Aaron
    boxerab
    created by boxerab
  • OpenCL is ~twice as fast as the SPIR version

    Hi! My problem is the same as Slow SPIR. But I think that my problem is with VGPR usage. Under SPIR they are used much more. How can I investigate to work around this behaviour? I can provide an executable with O...
    polarnick
    last modified by polarnick
  • Erroneous GPU behavior for atomic_add and atomic_sub

    While running some OpenCL kernels on R7-240 and RX-550, I see this behavior:   float w_rnd(__global unsigned int * intSeeds,int id)             ...
    tugrul_512bit
    last modified by tugrul_512bit
  • OpenCL max number of contexts

    Hi, On NVidia, there is the following limit: you can create a maximum of 32 processes where each one creates an OpenCL context. What is the limit for AMD GPUs? Regards, Tomer Gal, CTO at OpTeamizer
    tomer_gal
    last modified by tomer_gal
  • Driver crash on OpenCL compiling for old GPUs

    Some time ago our users encountered driver crashes in amdocl12cl64.dll on OpenCL compilations for some old GPUs like these: - HD 7570 (Turks) - HD 7500G (Devastator) - ATI FirePro V7800 (FireGL) (Cypress) - AMD Ra...
    polarnick
    last modified by polarnick
  • clReleaseCommandQueue hang in Windows driver (no events)

    Some of my users are seeing hangs in the AMD OpenCL drivers, for example driver version 17.7.2 with an AMD RX 480. Platform version: OpenCL 2.0 AMD-APP (2442.8) Platform profile: FULL_PROFILE Platform name: ...
    skuto
    last modified by skuto
  • SPIR binary linkage with AMD APP

    Hi,   Can AMD APP link SPIR binaries? The APP User Guide mentions clCreateProgramWithBinary and clBuildProgram but not clCompileProgram and clLinkProgram in section G.1. I can't tell if these will work.   ...
    lewissall@hotmail.com
    last modified by lewissall@hotmail.com
  • Avoid L1 cache pollution on GCN

    I have a kernel that writes results to a global buffer; these results are never read back into the kernel (they are processed by another kernel at a later time). So, I don't want this data sitting in the L1 cache if ...
    boxerab
    last modified by boxerab
  • Optimizing GPU occupancy and resource usage

    Fantastic article by Sebastian Aaltonen on optimizing VGPR usage on GCN: https://gpuopen.com/optimizing-gpu-occupancy-resource-usage-large-thread-groups/ Does anyone have any other tricks to add here?
    boxerab
    created by boxerab