OpenCL

kaatish
Journeyman III

Indirect memory access read on GPU

Hi,

I want to use a small lookup table (a 256-byte array) that is the same for every work-item. The table would be read very frequently. Which entry a work-item reads depends on the data it loads, so the access pattern is random and the index cannot be known at compile time.

What is the most efficient way of doing this? Would it help to put the array in texture memory so that it is cached?

Regards.

Accepted Solutions
pesh
Adept I

Re: Indirect memory access read on GPU

You can also try __constant memory, which is the right memory type for your needs. It is cacheable, and since you only use 256 bytes, you will most likely get good cache hit rates and good performance. There is also no bank conflict when work-items read from the same address.

__local memory is a good choice as well, but you need to initialize it in each work-group, which can cost some time.

I think you should implement both variants and choose the one that performs better.
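For illustration, the two variants could look roughly like the sketch below. The kernel names (lut_constant, lut_local), the argument layout, and the one-byte-in, one-byte-out data format are assumptions, not something stated in the thread. The shim at the top is only there so the same source also builds as plain C for a quick host-side check; it emulates a work-group of size 1 and says nothing about performance on a real device.

```c
/* Host-side shims (hypothetical, for sanity checks only). An OpenCL compiler
   defines __OPENCL_VERSION__ and skips this part entirely. */
#ifndef __OPENCL_VERSION__
#include <stddef.h>
typedef unsigned char uchar;
#define __kernel
#define __global
#define __constant const
#define __local
#define CLK_LOCAL_MEM_FENCE 1
static size_t emu_gid;  /* index of the work-item currently being emulated */
static size_t get_global_id(unsigned d)  { (void)d; return emu_gid; }
static size_t get_local_id(unsigned d)   { (void)d; return 0; }
static size_t get_local_size(unsigned d) { (void)d; return 1; }
static void barrier(int flags) { (void)flags; }
#endif

/* Variant 1: the 256-byte table lives in __constant memory.
   Each work-item reads one data byte and uses it as the table index. */
__kernel void lut_constant(__global const uchar *in,
                           __constant uchar *table,
                           __global uchar *out)
{
    size_t gid = get_global_id(0);
    out[gid] = table[in[gid]];          /* data-dependent (indirect) read */
}

/* Variant 2: each work-group first copies the table into __local memory.
   Byte stores assume OpenCL 1.1+ or the cl_khr_byte_addressable_store
   extension. */
__kernel void lut_local(__global const uchar *in,
                        __global const uchar *table,
                        __global uchar *out)
{
    __local uchar lut[256];

    /* Cooperative copy: work-items of the group share the 256 loads. */
    for (size_t i = get_local_id(0); i < 256; i += get_local_size(0))
        lut[i] = table[i];

    /* Make the whole table visible to the group before anyone reads it. */
    barrier(CLK_LOCAL_MEM_FENCE);

    size_t gid = get_global_id(0);
    out[gid] = lut[in[gid]];
}
```

The per-group copy and barrier in the second kernel are the initialization cost mentioned above. Which variant wins depends on the device and the work-group size, so timing both on the target hardware is the only reliable way to decide.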

4 Replies
Wenju
Elite

Re: Indirect memory access read on GPU

Hi kaatish

I think using local memory is the best way.

kaatish
Journeyman III

Re: Indirect memory access read on GPU

Hi Wenju,

I think the problem with local memory is that bank conflicts are quite likely when two work-items access the same word. Since the access pattern is unpredictable, I would not be able to avoid those conflicts.

Does texture memory give good performance with random access patterns?

Wenju
Elite

Re: Indirect memory access read on GPU

Hi kaatish,

Don't worry about local memory reads. Bank conflicts occur only on writes.
