
OpenCL

Adept I

Marking OpenCL function as __noinline__


Hey there,

Is there a way to make a real function call using GCN 1.2 and OpenCL?

What I have is an OpenCL app with heavy register spilling (let's just assume that splitting it into multiple kernels is not an option). Is there an option to mark a function as noinline, so that when the function is called, all the registers used by the calling function are stored to global memory and the called function has all the registers to itself? That way, if the function is not called, there is no drawback from having it: it does not add to the register spilling in the function that calls it. We do similar things on other platforms, where this seems to help us improve performance.

If there is a way, what is the OpenCL keyword to do so?

Thanks.

P.S. If there is a way to do a function call, what about recursion?
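For readers unfamiliar with the kind of annotation being asked about, here is a minimal host-side sketch in plain C, assuming a GCC or Clang compiler where `__attribute__((noinline))` is available. The `NOINLINE` macro and the functions `heavy_helper` and `sum_to` are hypothetical names made up for illustration. Nothing in this thread establishes that AMD's OpenCL compiler honors such an attribute, and the OpenCL C specification lists recursion as unsupported, so the recursive helper only shows the concept on a CPU.

```c
/* Assumption: GCC or Clang on the host; this is not OpenCL kernel code. */
#define NOINLINE __attribute__((noinline))

/* Hypothetical rarely-used helper with several live temporaries.
 * Marked noinline so its register needs stay out of the caller's body. */
NOINLINE int heavy_helper(int x)
{
    int a = x * 3;
    int b = x + 7;
    int c = x ^ 0x55;
    return a + b + c;
}

/* Hypothetical recursive function: recursion needs real calls,
 * because a function cannot be fully inlined into itself. */
NOINLINE int sum_to(int n)
{
    return n <= 0 ? 0 : n + sum_to(n - 1);
}
```

With these definitions, `heavy_helper(2)` returns 102 and `sum_to(4)` returns 10; the point is only that the compiler is asked to emit actual call instructions rather than merging the bodies into the caller.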


Accepted Solutions
Miniboss

Re: Marking OpenCL function as __noinline__


Once a function has been inlined, it's kinda impossible to isolate it in the binary. Its code is already mixed and interleaved with the surrounding code, so you can't just cut it out.

All you can do is choose a 'platform' that supports function calls.

- I don't know much about HSA, but I think I've read that it supports function pointers, so there should be callable functions too.

- The other way, where function calls work 100%, is GCN ASM. You can query and set the program counter, so you can call any address in GPU memory. You can even write self-modifying programs, just like on an unprotected x86. The downside is that you must rewrite the whole thing from scratch and reverse engineer how the current driver passes the parameters to the kernel.


5 Replies
Elite

Re: Marking OpenCL function as __noinline__


There's a fairly weak connection between inlining and overspilling.

Anyway, I agree this should exist, mostly as a way to control ISA size (plus: there are options for unrolling loops and for not unrolling them).

Issue: should this be considered per-function or per-call?
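To make the per-function versus per-call distinction concrete, here is a rough sketch in host C, again assuming GCC or Clang attribute syntax. All function names are hypothetical. A per-function attribute affects every call site, while per-call control can be approximated by keeping the body inlinable and routing selected call sites through a noinline wrapper. This is an illustration of the design question, not something known to work in AMD's OpenCL compiler.

```c
/* Per-function: every call site of this function becomes a real call. */
__attribute__((noinline)) int scale_never_inlined(int x)
{
    return x * 5 + 1;
}

/* Per-call (emulated): the body itself stays eligible for inlining... */
static inline int scale(int x)
{
    return x * 5 + 1;
}

/* ...and a noinline wrapper is used only where a real call is wanted. */
__attribute__((noinline)) int scale_as_call(int x)
{
    return scale(x);
}

/* This call site may be inlined by the compiler. */
int hot_path(int x)
{
    return scale(x);
}

/* This call site goes through the wrapper, so it stays a call. */
int cold_path(int x)
{
    return scale_as_call(x);
}
```

Both paths compute the same value; only the code the compiler emits at the call site differs.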

Adept I

Re: Marking OpenCL function as __noinline__


I think it should be per function.

Still, any ideas on how this could be done (if it is possible at all) would be greatly appreciated.

Adept I

Re: Marking OpenCL function as __noinline__


I am bumping this a bit.

I really want to do a real function call.

Is there any way to do it? Modifying intermediate representations, or other hacks?

It doesn't matter how complicated it is. Is there any way to do it?

Staff

Re: Marking OpenCL function as __noinline__


AFAIK, on AMD platforms, all function calls in OpenCL are inlined by default, and currently there is no way to change that.

Regards,
