Hi,

Could anyone help me? I have a matrix-vector multiplication routine written in OpenCL/C, which I call from Fortran to compute Ax = b. The matrix A does not change, but x is updated on successive calls.

How can I reuse A on successive calls without reinitialising it and copying it to the GPU every time? That setup and copy take up most of the execution time.
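One common pattern for this is to split the routine into an init call that uploads A once, an apply call that only moves x and b, and a free call at the end, keeping the handles in static variables between calls. Below is a minimal sketch of that structure in plain C. To keep it runnable without a GPU, the "device buffer" is simulated with host memory, and the comments mark where the real OpenCL calls (clCreateBuffer, clEnqueueWriteBuffer, clEnqueueNDRangeKernel, clEnqueueReadBuffer) would go. The function names, the trailing underscore, and the column-major layout are assumptions for illustration, not something taken from the original program.

```c
#include <stdlib.h>
#include <string.h>

/* State that survives between calls. In a real OpenCL build this would hold
   the cl_context, cl_command_queue, cl_kernel and the cl_mem for A. */
static double *dev_A = NULL;   /* stand-in for the device buffer holding A */
static int dev_rows = 0;
static int dev_cols = 0;
static int upload_count = 0;   /* how many times A was "copied to the GPU" */

/* Call once. Pointer arguments and the trailing underscore follow a common
   Fortran calling convention (assumption: this is compiler dependent). */
void matvec_init_(const double *A, const int *rows, const int *cols)
{
    size_t n = (size_t)(*rows) * (size_t)(*cols);

    free(dev_A);
    dev_A = malloc(n * sizeof *dev_A);
    if (dev_A == NULL) {
        dev_rows = dev_cols = 0;
        return;
    }
    /* Real code: clCreateBuffer(...) then clEnqueueWriteBuffer(...) for A. */
    memcpy(dev_A, A, n * sizeof *dev_A);
    dev_rows = *rows;
    dev_cols = *cols;
    upload_count++;
}

/* Call as often as needed. Only x goes in and b comes out; A is reused.
   A is read in column-major order, as Fortran stores it. */
void matvec_apply_(const double *x, double *b)
{
    /* Real code: write x, set kernel args, clEnqueueNDRangeKernel(...),
       then clEnqueueReadBuffer(...) for b. */
    for (int i = 0; i < dev_rows; i++) {
        double sum = 0.0;
        for (int j = 0; j < dev_cols; j++)
            sum += dev_A[(size_t)j * dev_rows + i] * x[j];
        b[i] = sum;
    }
}

/* Call once at the end. Real code: clReleaseMemObject(...) and friends. */
void matvec_free_(void)
{
    free(dev_A);
    dev_A = NULL;
    dev_rows = dev_cols = 0;
}

/* Helper for checking that A was uploaded only once (illustrative only). */
int matvec_upload_count(void)
{
    return upload_count;
}
```

From Fortran this would be used roughly as "call matvec_init(A, n, m)" once before the loop, "call matvec_apply(x, b)" inside it, and "call matvec_free()" afterwards. The exact name mangling depends on the compiler, so declaring the interface with ISO_C_BINDING and bind(C) is likely the more portable route.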

salomonamd, I did matrix multiplication using local memory with this code.

A and B are the input matrices and C is the output.

All the matrices have the same dimensions, i.e. widthA = widthB = widthC.
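As a rough illustration of the tiling idea behind a local-memory kernel (this is not the code referred to above), here is a CPU-side sketch in plain C. The small tileA and tileB arrays play the role that __local arrays would play inside an OpenCL work-group. The function name, the TILE value, the row-major layout, and the requirement that the width be a multiple of TILE are all assumptions made for the example.

```c
#include <stddef.h>

#define TILE 2  /* hypothetical tile edge; a GPU work-group might use 16 */

/* C = A * B for square, row-major, width x width matrices.
   Assumes width is a multiple of TILE. */
void matmul_tiled(const float *A, const float *B, float *C, int width)
{
    /* These stand in for __local arrays shared by one work-group. */
    float tileA[TILE][TILE];
    float tileB[TILE][TILE];

    for (int by = 0; by < width; by += TILE) {
        for (int bx = 0; bx < width; bx += TILE) {
            float acc[TILE][TILE] = {{0}};

            /* Walk along the shared dimension one tile at a time. */
            for (int t = 0; t < width; t += TILE) {
                /* Load one tile of A and one tile of B. */
                for (int y = 0; y < TILE; y++) {
                    for (int x = 0; x < TILE; x++) {
                        tileA[y][x] = A[(size_t)(by + y) * width + (t + x)];
                        tileB[y][x] = B[(size_t)(t + y) * width + (bx + x)];
                    }
                }
                /* On the GPU a barrier(CLK_LOCAL_MEM_FENCE) would go here. */

                /* Multiply the two tiles and accumulate. */
                for (int y = 0; y < TILE; y++)
                    for (int x = 0; x < TILE; x++)
                        for (int k = 0; k < TILE; k++)
                            acc[y][x] += tileA[y][k] * tileB[k][x];
            }

            /* Write the finished output tile back to C. */
            for (int y = 0; y < TILE; y++)
                for (int x = 0; x < TILE; x++)
                    C[(size_t)(by + y) * width + (bx + x)] = acc[y][x];
        }
    }
}
```

The point of the tiling is that each element of A and B is fetched from slow memory once per tile instead of once per multiply, which is where local memory tends to pay off on a GPU.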

You can also refer to the matrix multiplication sample that comes with the SDK.

Although local memory is not used, the arrays are copied to device memory, which adds to the efficiency.

Hope it helps.