How to compile large GPU Programs

Discussion created by WTrei on May 25, 2009
Latest reply on May 26, 2009 by Jawed

In my program - a factoring tool for large integers - I have to multiply two 500-bit unsigned integers (each emulated by 16 x 32-bit unsigned integers).
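For reference, the multiplication described above looks roughly like this when written with loops in plain C on the CPU. This is only a sketch of the schoolbook method, not the Brook kernel itself: the names `LIMBS` and `mul_limbs` are made up for illustration, and it assumes the limbs are stored least significant first.

```c
#include <stdint.h>

#define LIMBS 16  /* 16 x 32 bits = 512 bits of storage per operand */

/* Schoolbook multiplication: r = a * b.
 * a and b hold LIMBS limbs each, r holds 2 * LIMBS limbs.
 * Assumption: limbs are stored least significant first. */
static void mul_limbs(const uint32_t a[LIMBS],
                      const uint32_t b[LIMBS],
                      uint32_t r[2 * LIMBS])
{
    for (int i = 0; i < 2 * LIMBS; i++)
        r[i] = 0;

    for (int i = 0; i < LIMBS; i++) {
        uint64_t carry = 0;
        for (int j = 0; j < LIMBS; j++) {
            /* 32x32 -> 64-bit product plus the partial result and carry;
             * the sum cannot overflow 64 bits. */
            uint64_t t = (uint64_t)a[i] * b[j] + r[i + j] + carry;
            r[i + j] = (uint32_t)t;   /* low 32 bits stay in place */
            carry = t >> 32;          /* high 32 bits move to the next limb */
        }
        r[i + LIMBS] = (uint32_t)carry;
    }
}
```

The two nested loops give 16 x 16 = 256 partial products per multiplication, and each of them becomes its own block of statements once the loops are unrolled by hand.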

Because Brook doesn't support local arrays, and an inlined kernel function cannot return more than a 128-bit value, I had to write everything manually - unrolling every single loop! This results in a .br file with about 500,000 lines of code!

The problem is: running "brcc -P cal" produces one single .h file of about 380 MB, and my gcc is not able to compile it ("virtual memory exhausted"). I am using a 3 GHz Intel Core 2 Duo, 4 GB of system memory and a Radeon 4850 on Ubuntu 9.04 (32-bit). gcc never uses more than about 2.4 GB before failing, so a lack of physical memory doesn't seem to be the problem.

Is there a way to split the .h file so that I can compile it?