2 Replies Latest reply on Jun 6, 2011 2:17 PM by rick.weber

    GPU for Subset sum problem

    mgrouch
      Subset sum problem

      I'm absolutely new to GPU programming so I apologize if my question is obvious.

      Can GPU and AMD java library for GPU be used to solve Subset sum problem

      http://en.wikipedia.org/wiki/Subset_sum_problem ?

      Is this the type of task that might promise better results than few CPUs?

      Thanks,

      --MG

       

        • GPU for Subset sum problem
          maximmoroz

           

          Originally posted by: mgrouch I'm absolutely new to GPU programming so I apologize if my question is obvious.

          Can GPU and AMD java library for GPU be used to solve Subset sum problem

          http://en.wikipedia.org/wiki/Subset_sum_problem ?

          Is this the type of task that might promise better results than few CPUs?

          Thanks,

          --MG

          Yes.

          Exponential time algorithm: Each work-item will correspond to the single combination (or a bunch of combinations in case the set is large). But there is memory limitation: you should be able to store 2*2^(N/2) elements in the device memory.

          Pseudo-polynomial time dynamic programming solution: N+P (N is the sum of the negative values and P is the sum of the positive values) should be big enough (at least several hundreds) to efficiently utilize GPU. Each dynamic programming pass corresponds to single kernel run.

            • GPU for Subset sum problem
              rick.weber

              I'm also interested in this problem for biology applications. My first pass algorithm is to create a canonical representation of all the solutions (which is possible because NP-complete problems are decidable). Then each thread takes a range of numbers, decodes them into a solution based on the canonical representation, evaluates them, and shoves them in a solutions list. This method is good because you never explicitly store candidate solutions unless they indeed solve the subset sum problem, saving you an exponential amount of memory. You only store actual solutions, which should usually be a virtually infinitesimal fraction of the candidates.

              The problem with this arises when you want to try to run problems with more than 4 billion candidate solutions, as you'll need larger than 32-bit integers, but I'm not sure how long such a kernel would take to run anyways (possibly days).