Journeyman III

Memory leak in fglrx.ko (regression in Catalyst 11.2)

While upgrading a cluster of 64-bit Linux machines from Catalyst 10.12 to 11.3, all running my CAL (not OpenCL) whitepixel application, I found a regression that I later traced to the intermediate Catalyst 11.2 release: fglrx.ko leaks memory at a rate of about 10-30 MB per minute, as reported by free(1). I know it is the fglrx.ko module because the resident memory used by X11 and my app stays essentially constant in top(1).
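For anyone who wants to check for the same symptom, here is a rough sketch of the comparison described above: system-wide used memory (what free(1) shows, read here from /proc/meminfo) against the resident size of the suspect processes (what top(1) shows, read from /proc/<pid>/status). The function names and the 60-second interval are my own illustrative choices, not part of any driver or tool. If "used" keeps climbing while the RSS total stays flat, the growth is not in user space, which points to the kernel side.

```python
import time


def parse_kb_fields(text):
    """Parse 'Name:   1234 kB' style lines (as in /proc/meminfo and
    /proc/<pid>/status) into a dict {name: integer value}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def used_kb(meminfo):
    """Memory in use, excluding buffers and page cache (in kB)."""
    return (meminfo["MemTotal"] - meminfo["MemFree"]
            - meminfo.get("Buffers", 0) - meminfo.get("Cached", 0))


def growth_mb_per_min(samples):
    """Average growth between the first and last (seconds, used_kb) sample."""
    (t0, u0), (t1, u1) = samples[0], samples[-1]
    if t1 == t0:
        return 0.0
    return (u1 - u0) / 1024.0 / ((t1 - t0) / 60.0)


def rss_kb(pid):
    """Resident set size of one process in kB (0 if it has no VmRSS line)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_kb_fields(f.read()).get("VmRSS", 0)


def monitor(pids, interval_s=60, rounds=10):
    """Hypothetical helper: print used memory vs. summed RSS of `pids`."""
    samples = []
    for _ in range(rounds):
        with open("/proc/meminfo") as f:
            samples.append((time.time(), used_kb(parse_kb_fields(f.read()))))
        total_rss = sum(rss_kb(p) for p in pids)
        print(f"used={samples[-1][1]} kB  rss={total_rss} kB  "
              f"growth={growth_mb_per_min(samples):.1f} MB/min")
        time.sleep(interval_s)
```

To use it, call monitor() with the PIDs of the X server and the compute application. This only shows that the growth is outside those processes; it does not by itself prove which kernel module is responsible.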

My machines have 2 GB of RAM. The leak causes the memory used by fglrx.ko to grow to about 1.5 GB after a few hours. The behavior is then erratic: the application either crashes with SIGSEGV or hangs or, more surprisingly, on some machines appears to continue running fine after plateauing at 1.5 GB (but the compute shaders could be behaving incorrectly: threads returning prematurely, etc. I need to check that).

AMD engineers: it should be simple for you to reproduce. whitepixel is an open source app. Download it and run it as per the README. The app compiles the CAL kernel once, and then calls calCtxRunProgramGrid() and calIsEventDone() indefinitely in a loop to process work items. Please advise. In the meantime I have reverted to 11.1, which was the last non-leaky driver.

3 Replies

Thank you for reporting this. I have let the CAL team know about this issue.

If you could answer some questions, it would help a lot.

1) Do I just test with any test string?
I used
* a simple one, which was decoded before I could observe any memory leak
* a complex one which gave "Exhausted search space"
* default - just ./whitepixel which gave "Exhausted search space"

2) Why/when does it output "Exhausted search space", and what test do I run to
get a reasonable running time?

3) I did use valgrind to check for memory leaks, and tried to watch the memory
consumption of the program grow in top. But I didn't see the 10-30 MB per min growth
that you described in your post. Is there something specific I must
do/test to reach the point you did?

4) Was your program running on a single desktop with a single GPU/dual GPU? Can
you please describe the hardware on your system, including what GPU model you

I misspoke when reporting this bug: it affects another one of my applications, not whitepixel.

I have replied privately to one of the AMD employees who emailed me (Prateeksha).