AI Discussions

yamsyamsyams
Journeyman III

Ryzen 9 5900x - Machine Learning Training

Hi,

I'm not sure where to post this or whether anyone else experiences it, so please let me know what other information I need to provide.

I work as a software developer and we train ML models. Training with scikit-learn's MLP models and with TensorFlow v2.10 worked fine. Whenever I try a newer version of TF, or PyTorch, for example, it trains for a few seconds and then completely crashes my computer. I suspect this has something to do with my CPU: usage goes to 100% as soon as training starts with either library, and the crash follows a few seconds later.

Any help would be appreciated.

1 Reply
joseph5u
Journeyman III

It sounds like the crash is tied to CPU load, since your system goes down shortly after usage hits 100% during training. A few things to consider. Under that much sustained load the CPU could be overheating, which triggers an automatic shutdown to prevent damage. I'd recommend running a monitoring tool like HWMonitor to watch CPU temperature and usage in real time while training. You might also try smaller models or batch sizes to see whether the crashes persist. If you have a dedicated GPU and aren't already using it, offloading training from the CPU to the GPU could significantly reduce the strain on your system. Lastly, make sure your system's cooling is adequate: dust buildup or poor ventilation can sometimes be the hidden culprit in these cases. Let me know if you need more detailed troubleshooting steps!
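
Alongside smaller models or batches, another way to lower the load for a test run is to cap how many CPU threads the training libraries may use. This is my own suggestion rather than a known fix for this crash, and the helper names `capped_thread_count` and `limit_cpu_threads` are made up for illustration. A minimal sketch, assuming the libraries read the usual OpenMP/MKL/OpenBLAS environment variables when they load:

```python
import os


def capped_thread_count(fraction=0.5):
    """Return `fraction` of the logical CPUs, but never fewer than 1."""
    logical = os.cpu_count() or 1  # os.cpu_count() can return None
    return max(1, int(logical * fraction))


def limit_cpu_threads(n_threads):
    """Set common thread-pool environment variables to n_threads.

    Call this BEFORE importing numpy, torch, or tensorflow: the
    underlying math libraries typically read these values at load time.
    """
    for var in ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS"):
        os.environ[var] = str(n_threads)
    return n_threads


# Hypothetical test setting: use half of the logical CPUs.
n = limit_cpu_threads(capped_thread_count(0.5))
print(f"Training will use at most {n} CPU threads")

# The frameworks also have their own in-process settings, applied
# after import and before training starts:
#   torch.set_num_threads(n)
#   tf.config.threading.set_intra_op_parallelism_threads(n)
```

On a 5900X (24 logical CPUs) this would allow 12 threads. If training is stable with the cap and crashes without it, that points toward a heat or power problem under full load rather than a bug in the library itself.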
