2 Replies Latest reply on Oct 27, 2008 11:26 PM by foxx1337

    possible bug with int math on the GPU inside cpu loop

    foxx1337
      works with BRT_RUNTIME=cpu

      Hello again,

      I've implemented a naive 3D Game of Life simulation with Brook+. For some 32x32x32 initial configurations, 1 or 2 elements are wrong after 6 generations.

      After 5 generations the resulting configuration is still correct. Also, if I restart the simulation using the output of generation 5 as input and run it for 1 generation, I get a correct result, unlike when I run 6 generations in a row from the start.

      When running with BRT_RUNTIME=cpu, the simulation always produces correct results. With other grid sizes (like 10x10x10) the Brook+ simulation on the GPU also eventually produces wrong results.
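      To find out exactly which elements differ between the two backends, a host-side comparison along these lines could help. This is a minimal sketch in plain C++, not Brook+ code; `Cell` and `diffGrids` are hypothetical names, and the flat layout with x varying fastest is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type: coordinates of one cell in the 3D grid.
struct Cell { int x, y, z; };

// Returns the coordinates of every cell where the two grids differ.
// Assumption: both grids are flat arrays of nx*ny*nz ints, x varying fastest.
std::vector<Cell> diffGrids(const std::vector<int>& a, const std::vector<int>& b,
                            int nx, int ny, int nz)
{
    assert(a.size() == b.size());
    assert(a.size() == static_cast<std::size_t>(nx) * ny * nz);

    std::vector<Cell> mismatches;
    for (int z = 0; z < nz; ++z) {
        for (int y = 0; y < ny; ++y) {
            for (int x = 0; x < nx; ++x) {
                std::size_t i = (static_cast<std::size_t>(z) * ny + y) * nx + x;
                if (a[i] != b[i]) {
                    mismatches.push_back(Cell{x, y, z});
                }
            }
        }
    }
    return mismatches;
}
```

      Running it on the generation 6 output of the cpu and cal runs would show whether the wrong cells sit at a boundary, at a fixed position, or move between runs.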

       

      Possible fishy spots of my implementation:

      - I'm using 3D streams of int
      - I'm using a 3D input matrix of int for the main kernel
      - on the CPU I have a for loop over the generations, which calls the kernel once per generation
      - all math inside the kernel is with int and int3
      - kernel is O(n^3)
      - there are a few if blocks inside the kernel
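
      As an illustration of the kind of int-only work with if blocks described in the list above, here is a sketch of a neighbour count for one cell in plain C++. It is not the Brook+ kernel from this post: the 26-cell neighbourhood, treating cells outside the grid as dead, the flat x-fastest layout, and the name `countNeighbours` are all assumptions, and the actual birth/survival rule is left out because the post does not state it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts the live neighbours of cell (x, y, z).
// Assumptions (not from the original post): 26-cell neighbourhood,
// nonzero means alive, cells outside the grid count as dead,
// grid stored flat with x varying fastest.
int countNeighbours(const std::vector<int>& grid, int nx, int ny, int nz,
                    int x, int y, int z)
{
    assert(grid.size() == static_cast<std::size_t>(nx) * ny * nz);

    int count = 0;
    for (int dz = -1; dz <= 1; ++dz) {
        for (int dy = -1; dy <= 1; ++dy) {
            for (int dx = -1; dx <= 1; ++dx) {
                if (dx == 0 && dy == 0 && dz == 0) {
                    continue;  // skip the cell itself
                }
                int px = x + dx, py = y + dy, pz = z + dz;
                if (px < 0 || px >= nx || py < 0 || py >= ny || pz < 0 || pz >= nz) {
                    continue;  // outside the grid: treated as dead
                }
                std::size_t i = (static_cast<std::size_t>(pz) * ny + py) * nx + px;
                if (grid[i] != 0) {
                    ++count;
                }
            }
        }
    }
    return count;
}
```

      A CPU reference like this uses only integer arithmetic, so any difference against the GPU output cannot come from floating-point rounding in the reference itself.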

       

      Shouldn't the results from BRT_RUNTIME=cpu be identical to the results from BRT_RUNTIME=cal?

      My implementation doesn't need any synchronized access to resources. In a for loop, it calls the kernel with input A and output B, then calls the kernel again with input B and output A. Could there be a race here, with the GPU still executing the first kernel call when the second one is issued?
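
      The A/B alternation described above can be sketched on the host in plain C++ as follows. `runPingPong` and the `step` callback are hypothetical stand-ins for the real loop and the kernel call; on the CPU every call finishes before the next one starts, so this version cannot race and can serve as a reference.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Grid = std::vector<int>;

// Runs `generations` steps, alternating between two buffers.
// `step(in, out)` is a hypothetical stand-in for the kernel call:
// it reads the whole input grid and writes the whole output grid.
template <typename Step>
Grid runPingPong(Grid a, int generations, Step step)
{
    Grid b(a.size());
    Grid* in = &a;
    Grid* out = &b;
    for (int g = 0; g < generations; ++g) {
        step(*in, *out);      // input A, output B on even g; reversed on odd g
        std::swap(in, out);   // the buffer just written becomes the next input
    }
    return *in;               // after the last swap, `in` holds the newest generation
}
```

      With a deterministic `step`, running 6 generations in a row must give the same grid as running 5 generations and then 1 more from that output, which is the property that fails on the GPU in this report.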

       

       

      Windows Server 2008 x64
      Radeon HD 4850
      Catalyst 8.10
      Brook+ and CAL 1.2.1 beta x64
      Visual Studio 2008 SP1, Release x64 build