Could anyone help me? I have a matrix-vector multiplication routine written in OpenCL/C, which I call from Fortran to compute b = Ax. The matrix A does not change, but x is updated on successive calls.

How can I reuse A on successive calls without reinitialising it and copying it to the GPU every time? That copy takes up most of the execution time.

salomonamd, I did matrix multiplication using local memory with this code.

A and B are the input matrices and C is the output.

The dimensions of all the matrices are the same, i.e. widthA = widthB = widthC.
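For anyone reading along without the code, here is a rough plain-C sketch of the tiled access pattern that a local-memory kernel typically follows for square matrices. It is not the code referred to above. The name matmul_tiled, the TILE size and the row-major layout are all assumptions made for illustration, and the two small tile arrays only stand in for __local arrays in a real OpenCL kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TILE 4  /* assumed tile edge; in a kernel this would match the work-group size */

/* C = A * B for square, row-major matrices (widthA = widthB = widthC = width).
   Hypothetical helper: width must be a multiple of TILE in this sketch. */
void matmul_tiled(const float *A, const float *B, float *C, size_t width)
{
    float tileA[TILE][TILE];  /* stands in for a __local tile of A */
    float tileB[TILE][TILE];  /* stands in for a __local tile of B */

    assert(width % TILE == 0);
    memset(C, 0, width * width * sizeof(float));

    for (size_t bi = 0; bi < width; bi += TILE)
        for (size_t bj = 0; bj < width; bj += TILE)
            for (size_t bk = 0; bk < width; bk += TILE) {
                /* Load one tile of A and one tile of B into the small buffers. */
                for (size_t r = 0; r < TILE; r++)
                    for (size_t c = 0; c < TILE; c++) {
                        tileA[r][c] = A[(bi + r) * width + (bk + c)];
                        tileB[r][c] = B[(bk + r) * width + (bj + c)];
                    }
                /* Accumulate the partial products of this tile pair into C. */
                for (size_t i = 0; i < TILE; i++)
                    for (size_t j = 0; j < TILE; j++) {
                        float sum = 0.0f;
                        for (size_t k = 0; k < TILE; k++)
                            sum += tileA[i][k] * tileB[k][j];
                        C[(bi + i) * width + (bj + j)] += sum;
                    }
            }
}
```

Each element of A and B is read from the large arrays once per tile and then reused TILE times from the small buffers, which is the saving that local memory is meant to give on the device.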

You can refer to the matrix multiplication sample that came with the SDK.

Although local memory is not used in that sample, the arrays are copied to device memory, which helps efficiency.
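On the original question of keeping A on the GPU between calls, one common pattern is to split the C routine into an init call, a per-iteration call and a cleanup call, and to keep the device handle in a static variable so it survives between calls from Fortran. The sketch below is a host-only mock-up of that structure, not a working OpenCL program: the function names (matvec_init, matvec_apply, matvec_free, matvec_upload_count) are made up for illustration, and the malloc/memcpy pair only stands in for clCreateBuffer and clEnqueueWriteBuffer. It assumes A arrives in Fortran's column-major order.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Static storage keeps this state alive between calls from Fortran. */
static double *dev_A = NULL;      /* stand-in for a cl_mem buffer holding A */
static int dev_rows = 0, dev_cols = 0;
static int upload_count = 0;      /* how many times A has been copied */

/* Call once. Fortran passes arguments by reference, hence the pointers. */
int matvec_init(const double *A, const int *rows, const int *cols)
{
    size_t bytes = (size_t)(*rows) * (size_t)(*cols) * sizeof(double);
    free(dev_A);
    dev_rows = dev_cols = 0;
    dev_A = malloc(bytes);        /* real code: clCreateBuffer */
    if (dev_A == NULL)
        return -1;
    memcpy(dev_A, A, bytes);      /* real code: clEnqueueWriteBuffer */
    dev_rows = *rows;
    dev_cols = *cols;
    upload_count++;
    return 0;
}

/* Call on every iteration: only x goes in and only b comes out. */
void matvec_apply(const double *x, double *b)
{
    assert(dev_A != NULL);        /* matvec_init must have been called first */
    /* Real code: write x, clSetKernelArg, clEnqueueNDRangeKernel,
       then clEnqueueReadBuffer for b. A is not touched. */
    for (int i = 0; i < dev_rows; i++) {
        double sum = 0.0;
        for (int j = 0; j < dev_cols; j++)
            sum += dev_A[i + (size_t)j * dev_rows] * x[j];  /* column-major */
        b[i] = sum;
    }
}

/* Call once at the end. Real code: clReleaseMemObject and friends. */
void matvec_free(void)
{
    free(dev_A);
    dev_A = NULL;
    dev_rows = dev_cols = 0;
}

/* Only here so the sketch can show that A is copied a single time. */
int matvec_upload_count(void)
{
    return upload_count;
}
```

From Fortran you would call matvec_init once before the loop, matvec_apply inside it, and matvec_free afterwards. How the C names appear to the Fortran linker depends on the compiler (often a trailing underscore), so a bind(C) interface through ISO_C_BINDING is probably the safer route. In a real version the context, command queue, program and kernel would be kept in static variables in the same way as the buffer for A.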

