OpenCL

timchist
Elite

What is 'xnack' feature and how to get more information about it

Recent AMD drivers have changed the way certain GPUs are reported. For example, the RX 5500 XT is now detected as 'gfx1012:xnack-' (it used to be 'gfx1012'). Newer drivers no longer list 'gfx1012' as a target available for offline compilation, but they do list 'gfx1012:xnack-' and 'gfx1012:xnack+'.

Could you please clarify what xnack is, what changed in the recent drivers, and whether the binaries generated for xnack-/xnack+ are compatible with older drivers (which recognized the device as 'gfx1012')? Which binary should be used with older drivers: the + or the - one? Also, is a binary generated with an older driver (for 'gfx1012') compatible with devices recognized as either 'gfx1012:xnack-' or 'gfx1012:xnack+' by newer drivers?

What is the xnack feature, and where can I get more information about it? Is it possible to enable or disable it for a given GPU and driver?
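
For readers following along, the device names above appear to follow a "processor:feature+/-" pattern. Below is a minimal sketch of splitting such a name, assuming that pattern holds; `parse_target_id` is a hypothetical helper, not part of OpenCL or any AMD API.

```python
from typing import Dict, Tuple


def parse_target_id(name: str) -> Tuple[str, Dict[str, bool]]:
    """Split a device name such as 'gfx1012:xnack-' into its parts.

    Assumption: the name is 'processor' optionally followed by
    ':feature+' or ':feature-' segments, as seen in this thread.
    """
    processor, *features = name.strip().split(":")
    settings: Dict[str, bool] = {}
    for feature in features:
        # Each feature must end in '+' (enabled) or '-' (disabled).
        if len(feature) < 2 or feature[-1] not in "+-":
            raise ValueError(f"malformed target feature: {feature!r}")
        settings[feature[:-1]] = feature.endswith("+")
    return processor, settings


print(parse_target_id("gfx1012:xnack-"))
print(parse_target_id("gfx1012"))
```

A name with no feature suffix (as reported by the older driver) yields an empty settings dictionary, which makes the "not specified" case easy to tell apart from an explicit xnack- or xnack+.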


12 Replies
dipak
Big Boss

Here are a couple of links that provide some useful information about the "xnack" feature.

https://llvm.org/docs/AMDGPUUsage.html#target-features

https://rocmdocs.amd.com/en/latest/ROCm_Compiler_SDK/ROCm-Native-ISA.html#target-features

"xnack" target feature:  It is used to enable/disable generating code that has memory clauses that are compatible with having XNACK replay enabled. This is used for demand paging and page migration. If XNACK replay is enabled in the device, then if a page fault occurs the code may execute incorrectly if the xnack feature is not enabled. Executing code that has the feature enabled on a device that does not have XNACK replay enabled will execute correctly, but may be less performant than code with the feature disabled.

 

Regarding the compatibility related query, can you please provide more details about the driver and OS? I'll check with the OpenCL team if I can get any information on this.
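
To make the quoted "xnack" description easier to scan, here is a small sketch that encodes its two statements as a lookup over the four code/device combinations. The function name and the wording of the returned strings are illustrative assumptions; only the outcomes themselves come from the quoted text.

```python
def xnack_outcome(code_has_xnack: bool, device_replay_enabled: bool) -> str:
    """Return what the quoted description predicts for one combination."""
    if device_replay_enabled and not code_has_xnack:
        # Code built without xnack on a device with XNACK replay enabled.
        return "may execute incorrectly if a page fault occurs"
    if code_has_xnack and not device_replay_enabled:
        # Code built with xnack on a device without XNACK replay.
        return "executes correctly, may be less performant"
    # Code and device settings agree.
    return "executes correctly"


for code in (True, False):
    for device in (True, False):
        label = f"code xnack{'+' if code else '-'} on device xnack{'+' if device else '-'}"
        print(f"{label}: {xnack_outcome(code, device)}")
```

This only restates the documentation; it says nothing about whether a given runtime will agree to load a binary whose setting differs from the device's.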

 

Thanks.

0 Likes

Thanks for the links, dipak.

Are there any documents describing Xnack in more detail and listing the GPUs where this feature is enabled? Can it be configured (e.g. via Control Panel or driver settings) for a given GPU, or is it an inherent characteristic of a certain model that the user cannot change?

> Regarding the compatibility related query, can you please provide more details about the driver and OS? I'll check with the OpenCL team if I can get any information on this.

Windows 10. Drivers 21.2.1 (old) and 21.4.1 (new). Can we use binaries for gfx1010 generated offline by the old driver with the new driver which detects the GPU as 'gfx1010:xnack-'? What about 'gfx1010:xnack+'? Can we use the 'gfx1010:xnack-' or 'gfx1010:xnack+' binary generated with 21.4.1 on a system running 21.2.1?

0 Likes

Thank you for providing the above information. I'll check with the OpenCL team regarding the "xnack" related queries and let you know if I get any information on this.

 

> Are there any documents describing Xnack in more detail and listing the GPUs where this feature is enabled?

It looks like this AMDGPU Processor table provides that kind of information: it lists the GPUs on which the "xnack" target feature is supported and, where it is, its default value. I'm not sure, though, whether this support differs on other platforms or drivers.

Thanks.

0 Likes

> It looks like this AMDGPU Processor table provides that kind of information: it lists the GPUs on which the "xnack" target feature is supported and, where it is, its default value.

Is there a way to change the default value? If so, how does one do it?

0 Likes

>Is there a way to change the default value? If so, how does one do it?

As I've come to know, on some targets there is a way to set XNACK through the Linux boot/BIOS configuration. On Windows, however, the XNACK configuration appears to be fixed per target, and changing it is currently not supported.

Now, on to the compatibility question. From the feedback I received, it looks like "code object V4" is going to be the default code object version generated by the compilers (https://llvm.org/docs/AMDGPUUsage.html#elf-code-object). The ROCm 4.1 release already supports it [roc-4.1.x#targetid-for-multiple-configurations]. On Windows, the PAL implementation is also being updated to include this support.

So, I guess using two different driver versions may cause a compatibility issue in some cases, particularly when you generate the offline binary with a newer driver and try to run it with an older driver that doesn't support the new binary format. Below is the related feedback:

"We have worked hard to make the ROCr runtime and PAL runtime be able to load all earlier versions of code objects. But we want to move the compilers to generate the V4 version by default. That means runtimes older than ROCm 4.1 and an upcoming PAL version will not be able to run them. "

By the way, code object V4 supports "ANY" as the default "XNACK" value. So, if no "XNACK" target feature is specified, the generated code can be loaded and executed in a process with either setting of XNACK replay (but may be less performant), as mentioned here:
"If a target feature is not specified, it defaults to a new concept of "any". The compiler, then, produces code, which executes on a target configured for either value of the setting impacting the overall performance."

 

Thanks.

0 Likes

Thanks for the detailed reply, Dipak. This is consistent with my own findings.

> By the way, code object V4 supports "ANY" as the default "XNACK" value. So, if no "XNACK" target feature is specified, the generated code can be loaded and executed in a process with either setting of XNACK replay (but may be less performant), as mentioned here:

Unfortunately there is no way to specify ANY during offline compilation. For example, for gfx1010 the choice of devices includes 'gfx1010:xnack+' and 'gfx1010:xnack-', and there seems to be no way to specify or override additional options. Can one pass something in the options to clBuildProgram in order to compile with a different xnack setting than the one included in the device name?

Also, the following is not yet fully clear:

1. I compile a binary for 'gfx1010' with an older driver, in offline mode. Windows 10.

2. I try to use this binary on a gfx1010:xnack- device running a newer driver. Will this work? It does seem to work correctly in our direct tests.

3. I try to use the same binary on a gfx1010:xnack+ device running a newer driver. Will this work? We don't have any xnack+ devices on hand thus cannot test directly. Could you confirm?

0 Likes

>Unfortunately there is no way to specify ANY during offline compilation.

As mentioned above, if a target feature is not explicitly specified, the compiler treats its value as "any" and generates a code object that can run on a target configured for either value of that feature. For example, if the target ID is specified as "gfx1010", the produced code is expected to run on both targets: gfx1010:xnack+ and gfx1010:xnack-.

>3. I try to use the same binary on a gfx1010:xnack+ device running a newer driver. Will this work? 

Yes, it is expected to work.

For more information, you may refer to the "xnack" related description here: https://llvm.org/docs/AMDGPUUsage.html#target-features
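
The "any" rule described in this reply can be sketched roughly as follows. This is only an illustration under stated assumptions: `split_target` and `can_run` are hypothetical helpers, and the strict rule that an explicitly specified xnack setting must equal the device's setting is my reading of the target ID description, not a statement about what a particular driver actually does.

```python
from typing import Optional, Tuple


def split_target(name: str) -> Tuple[str, Optional[bool]]:
    """Return (processor, xnack); xnack is None when not specified ("any")."""
    processor, *features = name.split(":")
    xnack: Optional[bool] = None
    for feature in features:
        if feature in ("xnack+", "xnack-"):
            xnack = feature.endswith("+")
    return processor, xnack


def can_run(code_target: str, device_target: str) -> bool:
    """Model whether code built for code_target is expected to run on device_target."""
    code_proc, code_xnack = split_target(code_target)
    dev_proc, dev_xnack = split_target(device_target)
    if code_proc != dev_proc:
        return False
    # Unspecified xnack in the code means "any": it matches either setting.
    # Assumption: an explicit setting must match the device exactly.
    return code_xnack is None or code_xnack == dev_xnack


print(can_run("gfx1010", "gfx1010:xnack+"))
print(can_run("gfx1010", "gfx1010:xnack-"))
print(can_run("gfx1010:xnack+", "gfx1010:xnack-"))
```

Under this model, a binary built for plain "gfx1010" matches both device variants, which is the case discussed above. Whether a mismatched explicit setting is really rejected in practice would need to be tested on the driver in question.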

 

Thanks.

 

0 Likes

> As mentioned above, if a target feature is not explicitly specified, the compiler treats its value as "any" and generates a code object that can run on a target configured for either value of that feature. For example, if the target ID is specified as "gfx1010", the produced code is expected to run on both targets: gfx1010:xnack+ and gfx1010:xnack-.

There seems to be no way to specify features when compiling via clBuildProgram. It accepts the list of devices to compile for as cl_device_id*. The list of available cl_device_ids is originally returned by clGetProgramInfo. For the newest drivers this list includes 'gfx1010:xnack+' and 'gfx1010:xnack-' (as reported by clGetDeviceInfo(...CL_DEVICE_NAME...)), but does not include 'gfx1010'. So I cannot figure out how to 'not explicitly specify a feature'. In fact, I see no way to specify any features during compilation via the OpenCL API, unless you can advise a way to do it, for example by passing something to clBuildProgram.

0 Likes

The "target feature" setting referred to earlier relates to the AMDGPU backend of the LLVM compiler. Usually, information about the target, including the XNACK setting, is obtained from the runtime (PAL on Windows). That information is passed to the LLVM compiler, recorded in the code object, and used by the loader.

Currently it seems like the OpenCL offline compilation via clBuildProgram doesn't support this option on Windows.

Below are my suggestions regarding the OpenCL offline binaries.

1. If you are not sure about the target's "xnack" setting, ship two different offline binaries (e.g. gfx1010:xnack+ and gfx1010:xnack-) for a target device (e.g. gfx1010) and load the binary that best matches the target's setting (for example, the one with a matching CL_DEVICE_NAME).

2. Try running the gfx1010:xnack+ offline binary on a gfx1010:xnack- device. If it runs successfully, compare its performance with that of the gfx1010:xnack- binary. If the difference is acceptable, then I think the gfx1010:xnack+ binary can be used on a gfx1010 device with any "xnack" setting, since the gfx1010:xnack+ binary is expected to work fine on a gfx1010:xnack+ device anyway.
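
The two suggestions above could be combined along these lines. This is a rough sketch only: `select_binary` is a hypothetical helper, the dictionary of shipped binaries uses placeholder bytes, and the fallback order (bare processor name first, then the xnack+ build) is my assumption. In a real application the device name would come from clGetDeviceInfo with CL_DEVICE_NAME.

```python
from typing import Dict, Optional


def select_binary(device_name: str, binaries: Dict[str, bytes]) -> Optional[bytes]:
    """Pick the shipped offline binary that best matches the reported device name."""
    # Suggestion 1: prefer a binary whose name matches CL_DEVICE_NAME exactly.
    if device_name in binaries:
        return binaries[device_name]
    processor = device_name.split(":")[0]
    # Assumed fallbacks: a binary built without an explicit xnack setting,
    # then the xnack+ build (suggestion 2, which still needs testing).
    for candidate in (processor, f"{processor}:xnack+"):
        if candidate in binaries:
            return binaries[candidate]
    return None


shipped = {"gfx1010:xnack+": b"plus-build", "gfx1010:xnack-": b"minus-build"}
print(select_binary("gfx1010:xnack-", shipped))
print(select_binary("gfx1010", shipped))
print(select_binary("gfx1030:xnack-", shipped))
```

A fallback choice is only a candidate: the caller should still be prepared for the runtime to reject it (for example, an older driver that does not understand the newer binary format) and fall back to building from source if that is an option.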

 

Thanks.

0 Likes

> 1. If you are not sure about the target's "xnack" setting, ship two different offline binaries (e.g. gfx1010:xnack+ and gfx1010:xnack-) for a target device (e.g. gfx1010) and load the binary that best matches the target's setting (for example, the one with a matching CL_DEVICE_NAME).

> 2. Try running the gfx1010:xnack+ offline binary on a gfx1010:xnack- device. If it runs successfully, compare its performance with that of the gfx1010:xnack- binary. If the difference is acceptable, then I think the gfx1010:xnack+ binary can be used on a gfx1010 device with any "xnack" setting, since the gfx1010:xnack+ binary is expected to work fine on a gfx1010:xnack+ device anyway.

Thanks. Both of the above might work indeed.

Are there any Radeon GPUs on the market that will be detected on Windows as 'gfx1010:xnack+' or 'gfx1012:xnack+'? So far we have only seen 'xnack-'.

 

0 Likes

As per my understanding from this old discussion and this GPU table, the "xnack" feature is mainly enabled on APU devices. As far as I know, most Navi-based GPUs are currently dGPUs. That might be the reason why you have only seen gfx101x:xnack- devices.

 

Thanks.

 

Now that^ was really helpful. Thank you.

0 Likes