

Adept III

GCN Assembler for Linux is available!

Hi. I have released a complete and feature-rich GCN assembler for Linux/Unix (and Windows).

It is here: CLRadeonExtender

Documentation (ugly, but it exists): ClrxToc – CLRadeonExtender

Downloads: ClrxDownloads – CLRadeonExtender

This is an early alpha version, so it may be buggy and the documentation is rough.

This package provides:

- a complete GCN assembler (support GCN 1.0/1.1/1.2, full Fiji support!)

- a complete GCN disassembler (the same support as the assembler)

- CLRXWrapper (embeds the assembler into the AMD OpenCL implementation), just try it!

- documentation

- doxygen documentation

- no samples (not yet)

The package can be installed on Linux/Unix and Windows systems.

The assembler and disassembler support two binary formats:

* the AMD Catalyst driver OpenCL binary format (OpenCL 1.2)

* the GalliumCompute binary format (yes, the open-source drivers: Mesa3D, LLVM, and Gallium!); this is the first GCN assembler that supports the open-source drivers

This lets you write programs for the AMD OpenCL implementation and for the open-source radeon drivers (we proudly support the open-source world!)

Rich features:

* compatible with GNU as

* macros, nested macros

* includes (nested even in macros)

* symbol assignment, assigning expressions to symbols (.eqv pseudo-op)

* conditional compilation (.if/.elseif/.else/.endif)

* repetitions (.rept, .irp, .irpc)

* support for multiple kernels

* GPU/binary format/bitness detection (.ifgpu/.ifngpu)

* AMD OpenCL driver detection (.get_driver_version).

* standalone binary generator for AMD Catalyst (works fine with latest drivers!) and GalliumCompute

* easy-to-use kernel configuration (.config) for two binary formats

* supports all encodings for GCN 1.0/1.1/1.2 (Fiji support)

* floating-point support (literals, .float/.double), including half-precision floats

12 Replies
Adept II

Gz! Great work! I'm not sure if I'm crazy skilled enough to use assembler in OpenCL kernels. It would be great to see some examples!


GZ! What will be the first project in it?

@haahh -> I don't have many working example projects either. But the best source of up-to-date examples is the OpenCL compiler itself.


Good job! Thank you. Nice to have the tools for direct access to the GPU, when needed 🙂

Builds fine in Ubuntu 14.04. Just can't find OpenCL during cmake. Looks for path /opt/AMDAPP, whereas it should look for /opt/AMDAPPSDK-*.

It's a small problem; you can fix it by updating OPENCL_DIST_DIR in CMakeCache.txt.

Also in sudo make install it cannot find build/doc to install docs.

CMakeLists.txt:149 install path for docs should be changed from ${PROJECT_BINARY_DIR}/doc/ -> ${PROJECT_BINARY_DIR}/clrxdoc/
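For anyone hitting the same OpenCL lookup problem, here is a minimal sketch of setting OPENCL_DIST_DIR at configure time instead of editing CMakeCache.txt afterwards. The helper names `find_sdk_dir` and `cmake_args` are made up for illustration, and the `/opt` root and `AMDAPPSDK-*` pattern are simply the ones mentioned above; adjust them for your install.

```python
import glob
import os


def find_sdk_dir(root="/opt"):
    """Return the AMDAPPSDK-* directory under root that sorts last, or None."""
    matches = sorted(p for p in glob.glob(os.path.join(root, "AMDAPPSDK-*"))
                     if os.path.isdir(p))
    return matches[-1] if matches else None


def cmake_args(sdk_dir, source_dir=".."):
    """Build a cmake command line that presets the OPENCL_DIST_DIR cache entry."""
    # -D<var>=<value> is the standard cmake way to set a cache variable.
    return ["cmake", f"-DOPENCL_DIST_DIR={sdk_dir}", source_dir]


if __name__ == "__main__":
    sdk = find_sdk_dir()
    if sdk is None:
        print("no AMDAPPSDK-* directory found under /opt")
    else:
        # Print the command rather than running it, so it can be checked first.
        print(" ".join(cmake_args(sdk)))
```

Run it from the build directory and paste the printed command; it only prints the command and does not invoke cmake itself.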

-> clrxdisasm -a fft_Pitcairn > out

-> clrxasm out

-> diff a.out fft_Pitcairn
Binary files a.out and fft_Pitcairn differ

Actually there are a lot of differences between those 2 files. However, they work just the same with OpenCL 1.2 :-)
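Since the raw binaries differ, one way to narrow down where is to compare disassembly listings instead of bytes. The sketch below assumes only the `clrxdisasm -a <file>` invocation used above; the function names `disassemble` and `listing_diff` are made up for illustration. Matching listings would only show that what `clrxdisasm -a` prints is the same, not that the files are equivalent in every respect.

```python
import difflib
import subprocess


def disassemble(path, clrxdisasm="clrxdisasm"):
    """Return the text that `clrxdisasm -a <path>` prints to stdout."""
    result = subprocess.run([clrxdisasm, "-a", path], check=True,
                            capture_output=True, text=True)
    return result.stdout


def listing_diff(before, after):
    """Unified diff between two listings; an empty list means they match."""
    return list(difflib.unified_diff(before.splitlines(), after.splitlines(),
                                     "original", "reassembled", lineterm=""))


if __name__ == "__main__":
    # File names follow the commands above: the original image and clrxasm's a.out.
    for line in listing_diff(disassemble("fft_Pitcairn"), disassemble("a.out")):
        print(line)
```

If the listings match while the binaries differ, the differences are presumably somewhere the listing does not show, which would fit with both files running the same.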

Using Ubuntu 14.04 x64, with SDK3.0 and Radeon R9 270.

I have some concern about people developing commercial projects and shipping out those binary images...


Thank you for testing. This is still an alpha version and it can have many bugs. I don't have access to a GCN 1.1 device, so I didn't test assembled/generated files for the GCN 1.1 architecture.


These were not bugs really, just some easy configuration issues to help people move along.

Overall quite polished and easy to run.

Now, anyone with a GCN1.1 card for some easy testing?

Adept III

I wrote one simple sample: vectorAdd:

Kernel setup is pretty simple (for AMD Catalyst you just set the dimensions, arguments, and user data, only a few lines).

I added new features: register symbols and indexing a symbol by expressions (like: v), and I fixed a few silly bugs.

The next release (0.1.1) will be published in the next one or two weeks.


A couple of unknown instructions and lots of unknown modifiers, unterminated strings and garbage at the end. Probably the new features you added.

We have to split assembly from host code.

About samples/examples: I thought that they are easy to generate. Everyone can generate them from their binary images. That might be more instructive, too, since they are from our own programs that we have worked with and know them. In fact, best is to compare source code with assembly side by side.

The assembly annotation is nice to have, but without the source code, only marginally helpful...:(


CLRX is still at an early stage (alpha). But can you explain your problems with the software? Or was it an attempt to compile incorrect code?

The current version (from trunk) passes all tests (GCN instruction encoding/decoding), so the assembler should work pretty well.


I'm not sure what you mean. I have no problems working with my binary image. But I got a lot of errors assembling your linked code (assembly portion only). I thought you already knew about it... I'd better wait for your updated version before testing again. Not much sense in fixing the older version.

Adept III

CLRadeonExtender now compiles under MSVC 2015 on Windows. The main tests pass. Try the sources from svn trunk.

Adept III

Version 0.1.1 is now available on my site. New features are:

* support for Windows

* register ranges, symbols with register ranges (you can name registers)

* fixed Gallium and AMD Catalyst binary generator

* fixed clrxasm

* GCN ISA documentation

* many fixes

Support for AMD OpenCL 2.0 kernel binaries will come in the next version (0.1.2). The new binary format differs significantly and requires quite a lot of work.

I have already integrated this GCN assembler into my tripcode generator, and it works great! Now I can dynamically generate GCN assembler sources at run-time and build binaries on the fly with this assembler, with a 10% increase in performance compared to regular OpenCL kernels. I cannot wait for the next version with OpenCL 2.0 support. I am more than happy to help you with testing with two 7990s (GCN 1.0) and two 290Xs (GCN 1.1).
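For readers wondering what run-time generation could look like, here is a rough sketch, not the actual tripcode generator. It assumes GNU as-style `name = value` symbol assignments (which the feature list suggests are accepted) and that `clrxasm <file>` writes `a.out` into the working directory, as in the round-trip test earlier in the thread. The names `render_source` and `assemble`, and the file name `kernel.s`, are made up for illustration.

```python
import subprocess
import tempfile
from pathlib import Path


def render_source(template, constants):
    """Prepend one symbol assignment per constant to an assembly template."""
    header = "".join(f"{name} = {value}\n" for name, value in constants.items())
    return header + template


def assemble(source, clrxasm="clrxasm"):
    """Assemble source text with clrxasm and return the binary image as bytes."""
    with tempfile.TemporaryDirectory() as workdir:
        src = Path(workdir) / "kernel.s"  # hypothetical file name
        src.write_text(source)
        # Assumption: with no output option, clrxasm writes a.out in the
        # current directory, as seen earlier in this thread.
        subprocess.run([clrxasm, src.name], cwd=workdir, check=True)
        return (Path(workdir) / "a.out").read_bytes()
```

The idea is that the host program fills in per-run constants with `render_source`, calls `assemble`, and hands the returned bytes to the OpenCL runtime as a program binary. The template itself still has to be a complete kernel source for the chosen binary format.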