
Janissaire
Adept I

ROCm 6.0.0 TensorFlow not working on RX 7900 XT

Dear All,

I am trying to run tensorflow-rocm (version 2.13.0.570) in Jupyter on Ubuntu 22.04.

When I check whether TensorFlow "sees" the GPU using:

import tensorflow as tf
print(tf.config.list_physical_devices('GPU'))

I get the following message:

2024-01-06 12:19:43.135928: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2024-01-06 12:19:43.215412: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2024-01-06 12:19:43.215471: I tensorflow/compiler/xla/stream_executor/rocm/rocm_gpu_executor.cc:838] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2024-01-06 12:19:43.215488: I tensorflow/core/common_runtime/gpu/gpu_device.cc:2015] Ignoring visible gpu device (device: 0, name: Radeon RX 7900 XT, pci bus id: 0000:0b:00.0) with AMDGPU version : gfx1100. The supported AMDGPU versions are gfx1030, gfx900, gfx906, gfx908, gfx90a, gfx940, gfx941, gfx942.

As I understand it, this build of TensorFlow does not support my GPU: gfx1100 is not in the list of supported AMDGPU versions.

Is tensorflow-rocm 2.13.0.570 compatible with the RX 7900 XT? And if so, what do I need to do to make it work?

I have tried uninstalling and reinstalling ROCm, tensorflow-rocm, ...
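For anyone comparing their own log against this one, here is a minimal Python sketch that pulls the detected architecture and the supported list out of the "Ignoring visible gpu device" message. The function name parse_ignored_gpu is hypothetical, and the regular expressions assume the exact log wording quoted in this post.

```python
import re

def parse_ignored_gpu(line):
    """Return (detected_arch, supported_archs) from an 'Ignoring visible gpu device' line.

    Hypothetical helper: assumes the log wording quoted in this thread.
    """
    detected = re.search(r"AMDGPU version\s*:\s*(gfx\w+)", line)
    supported = re.search(r"supported AMDGPU versions are (.+?)\.?\s*$", line)
    if not (detected and supported):
        return None
    archs = [a.strip() for a in supported.group(1).split(",")]
    return detected.group(1), archs

log = ("Ignoring visible gpu device (device: 0, name: Radeon RX 7900 XT, "
       "pci bus id: 0000:0b:00.0) with AMDGPU version : gfx1100. "
       "The supported AMDGPU versions are gfx1030, gfx900, gfx906, gfx908, "
       "gfx90a, gfx940, gfx941, gfx942.")

arch, supported = parse_ignored_gpu(log)
print(arch, arch in supported)
```

If the detected architecture is not an exact entry in the supported list, TensorFlow skips the device, which matches the message above.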

10 Replies
BlueFlow
Journeyman III

I have the same issue with the next release they posted, tensorflow-rocm 2.14 for ROCm 6.0. In this case there seems to be a typo, as the error reads:

Radeon RX 7900 XTX, pci bus id: 0000:03:00.0) with AMDGPU version : gfx1100. The supported AMDGPU versions are gfx1030gfx1100, gfx900, gfx906, gfx908, gfx90a, gfx940, gfx941, gfx942.

It seems they forgot to include a comma between gfx1030 and gfx1100. Please, AMD, fix this problem.
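A likely mechanism, assuming the supported list is written as string literals in the C++ source: when the comma between two adjacent string literals is missing, the compiler joins them into one string. Python follows the same rule, so the effect can be reproduced in a small sketch. The lists below are illustrative only and are not copied from the TensorFlow source.

```python
# Illustrative lists only; not copied from the TensorFlow source.
# Without a comma, adjacent string literals are joined into a single string.
broken = ["gfx1030" "gfx1100", "gfx900", "gfx906"]
fixed = ["gfx1030", "gfx1100", "gfx900", "gfx906"]

print(broken)
print("gfx1100" in broken)
print("gfx1100" in fixed)
```

Under this assumption neither gfx1030 nor gfx1100 is matched as a separate entry, so cards of either architecture would be ignored.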

0 Likes

I installed version 2.14.0.600 and I have the same issue as you:

Ignoring visible gpu device (device: 0, name: Radeon RX 7900 XT, pci bus id: 0000:0b:00.0) with AMDGPU version : gfx1030. The supported AMDGPU versions are gfx1030gfx1100, gfx900, gfx906, gfx908, gfx90a, gfx940, gfx941, gfx942.

AMD: please add the missing comma.

I wonder whether we can recompile tensorflow-rocm from source to correct this typo manually. I am searching for a guide, so if you see anything, please let me know. Thanks!

Today I wrote to AMD about the problems I found. I tested the fix for this problem on two versions of TensorFlow, both compiled from source, and everything went well. The fix is in item 2 of the letter below.

Here is my letter to AMD:
Good afternoon.
There are many problems with ROCm and its documentation.
These are some of the issues I have identified, and they have a huge impact!
1) On the page https://github.com/ROCm/tensorflow-upstream/blob/develop-upstream/rocm_docs/tensorflow-rocm-release....
the link leads to a non-existent version:
https://pypi.org/project/tensorflow-rocm/2.14.0.602

2) There is a problem with a typo in the source code (a comma is missing), which can be solved like this:
sed -i 's/"gfx1030" /"gfx1030",/g' tensorflow/compiler/xla/stream_executor/device_description.h
Otherwise, there will be an error when using ROCm with a 7900 XTX video card.
Here is a link to the fix:
https://gist.github.com/briansp2020/1e8c3e5735087398ebfd9514f26a0007

This problem is being discussed on your forum and no one has been able to solve it since January!
https://community.amd.com/t5/discussions/rocm-6-0-0-tensorflow-not-working-on-rx-7900-xt/m-p/657519
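To make the sed substitution in item 2 easier to follow, here is a rough Python equivalent applied to a sample string. The helper name add_missing_comma and the sample line are hypothetical; the real layout of device_description.h may differ.

```python
def add_missing_comma(source_text):
    """Mirror sed 's/"gfx1030" /"gfx1030",/g' on a string.

    Every occurrence of '"gfx1030" ' (quote-terminated, followed by a space)
    is replaced with '"gfx1030",'.
    """
    return source_text.replace('"gfx1030" ', '"gfx1030",')

# Hypothetical line for illustration; the real header may be laid out differently.
before = '    "gfx1030" "gfx1100", "gfx900", "gfx906",'
after = add_missing_comma(before)
print(after)
```

The substitution edits C++ source, so it only takes effect once TensorFlow is rebuilt from that source; an already installed tensorflow-rocm package is not changed by it.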

0 Likes

Hello, I ran into a similar problem. The error is:

I tensorflow/core/common_runtime/gpu/gpu_device.cc:2266] Ignoring visible gpu device (device: 0, name: AMD Radeon RX 6800 XT, pci bus id: 0000:0a:00.0) with AMDGPU version : gfx1030. The supported AMDGPU versions are gfx1030gfx1100, gfx900, gfx906, gfx908, gfx90a, gfx940, gfx941, gfx942.

I used the command you recommended:

sed -i 's/"gfx1030" /"gfx1030",/g' tensorflow/compiler/xla/stream_executor/device_description.h

I ran the code again after rebooting the system and still encountered errors. Do I need to reinstall or recompile, or is some other fix required?

0 Likes
Janissaire
Adept I

I was thinking the same thing. 🙂

0 Likes
Janissaire
Adept I

Any news from AMD?

0 Likes
Duchy
Adept I

It's a bummer this still hasn't been fixed...

0 Likes
Janissaire
Adept I

Yep, I tried it yesterday, and uninstalled and reinstalled it just to make sure.

Same thing...

I thought AI was a big deal...

0 Likes
qarakhan
Journeyman III

Any news from AMD yet?

Or tell me how I can add the missing comma, and I will try it.

Was buying the 7900xtx the wrong choice?

0 Likes