niravshah00
Journeyman III

Multithreaded Brook+ algorithm from a nested for loop

Hi gaurav,

So you think I should switch from Brook+ to OpenCL or CAL++?
You know my requirements, so do you think I can accomplish what I want in Brook+, or should I switch?
I would like to finish this as soon as possible, so your help would mean a lot.

gaurav_garg
Adept I

Sorry for the delay in answering. I was busy with something else and was not checking my mail.

I am not sure I understand your algorithm very well. It would help if you could post your host algorithm.

You need to understand that you will have to adapt your algorithm to the GPU's architecture and limitations.

I would suggest first trying a basic Brook+ implementation and then moving on to optimizations.

IIUC, you are doing something like this:

for a 1000:10000
  for b 1000:10000
    for c 1000:10000
      for x 3:10
        for y 3:10
          for z 3:10

First, you can try writing a kernel that encapsulates the last three loops (over 'x', 'y', and 'z'). You can create a 2D stream so that the loops over 'b' and 'c' become implicit (if there are size limitations, you can do the processing in tiles). The loop over 'a' can stay on the host side.
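The decomposition described above can be sketched in plain Java, without any Brook+ API. This is only an illustration under stated assumptions: the class name `TiledSearchSketch`, the methods `kernel` and `search`, the tile parameter, and the inclusive reading of the "3:10" range are all hypothetical. `kernel` stands in for the per-element GPU kernel (the x, y, z loops), the tile loops stand in for the 2D stream over 'b' and 'c', and the loop over 'a' stays on the host.

```java
import java.math.BigInteger;

public class TiledSearchSketch {
    // Assumed inclusive exponent range, following the "3:10" notation above.
    static final int POW_MIN = 3;
    static final int POW_MAX = 10;

    // Stand-in for the per-element kernel: runs the x, y, z loops for one
    // (a, b, c) triple and counts exponent triples with a^x + b^y == c^z.
    static int kernel(int a, int b, int c) {
        BigInteger bigA = BigInteger.valueOf(a);
        BigInteger bigB = BigInteger.valueOf(b);
        BigInteger bigC = BigInteger.valueOf(c);
        int hits = 0;
        for (int x = POW_MIN; x <= POW_MAX; x++) {
            for (int y = POW_MIN; y <= POW_MAX; y++) {
                BigInteger sum = bigA.pow(x).add(bigB.pow(y));
                for (int z = POW_MIN; z <= POW_MAX; z++) {
                    if (sum.equals(bigC.pow(z))) {
                        hits++;
                    }
                }
            }
        }
        return hits;
    }

    // Host side: explicit loop over a. The (b, c) plane is covered tile by
    // tile, and each tile element corresponds to one kernel invocation.
    static long search(int min, int max, int tile) {
        long total = 0;
        for (int a = min; a <= max; a++) {
            for (int b0 = min; b0 <= max; b0 += tile) {
                for (int c0 = min; c0 <= max; c0 += tile) {
                    int bEnd = Math.min(b0 + tile - 1, max);
                    int cEnd = Math.min(c0 + tile - 1, max);
                    for (int b = b0; b <= bEnd; b++) {
                        for (int c = c0; c <= cEnd; c++) {
                            total += kernel(a, b, c);
                        }
                    }
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // 3^3 + 6^3 = 3^5 is the only match for (3, 6, 3) in this range.
        System.out.println(kernel(3, 6, 3)); // prints 1
        // The tile size changes only the traversal order, not the result.
        System.out.println(search(2, 6, 2) == search(2, 6, 100)); // prints true
    }
}
```

On a GPU the two innermost host loops would disappear, since each (b, c) element of the tile would be handled by its own kernel instance.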

niravshah00
Journeyman III

Hi,

Thanks for your reply.
I can send you the code that I have written in Java.
So far your understanding has been correct. My equation is A^x + B^y = C^z.

I am solving for z, so there are basically 5 variables. Since the range has to be flexible, I want to utilize the GPU as much as I can.
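Written out, solving the equation for z takes a logarithm to base C, which matches the log-based computation in the Java code posted later in this thread:

```latex
z = \log_C\left(A^x + B^y\right) = \frac{\ln\left(A^x + B^y\right)}{\ln C}
```

A combination of A, B, C, x, and y is of interest when the resulting z is very close to an integer.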

In my lab I have a machine with four FireStream 9170 cards (and dual quad-core processors).

Secondly, I need to figure out a way to send back a result, i.e. all 6 variables, only if z is within 10^-8 of an integer. I don't want to scan the entire stream on the host, since only a few of the threads would return results.

Let me know if you would like to see my Java (serial) code.

Thanks very much.

niravshah00
Journeyman III

Any suggestions on how I can return my results from the kernel code to the host code?

niravshah00
Journeyman III

Here is the Java code for my algorithm:

import java.io.File;
import java.io.PrintStream;



public class FindPossibleCounterExamples
{

    /**
     * @param args
     */
    public static void main(String[] args)
    {
        FindPossibleCounterExamples finder = new FindPossibleCounterExamples();
        try
        {
            pOStream = new PrintStream(file);
        }
        catch(Exception e)
        {
            System.out.println(e.getMessage());
        }
        finder.findSuitableC();
    }
    private float BASE_MIN = 1000;
    private float BASE_MAX = 1006;
    private int POW_MAX = 10;
    private int POW_MIN = 3;
    private static final File file = new File("BealsPossibleCounterExamples.txt");
    private static PrintStream pOStream=null;

    private void findSuitableC()
    {

        for(float iA=BASE_MIN; iA<BASE_MAX; iA++)
        {
            for(float iB=BASE_MIN; iB<BASE_MAX; iB++)
            {
                if(iB>iA && gcd(iA,iB)>1.0)
                    continue;
                for(float iC =BASE_MIN; iC < BASE_MAX; iC++)
                {
                    // Beal says that if A^X + B^Y = C^Z then A, B, C have a common prime factor.
                    // If every pairwise gcd is 1, they don't have a common prime factor.
                    if(gcd(iB,iC)==1 && gcd(iC,iA)==1 && gcd(iA,iB)==1)
                    {
                        // for all C's that dont have a common factor with A and B,
                        // run through values of X,Y and find a value for Z.
                        findZ(iA,iB,iC);
                    }

                }
            }
        }
        pOStream.flush();
        pOStream.close();
        //oStream.close();

    }

    private void findZ(float A, float B, float C)
    {
        for(int X = POW_MIN; X<POW_MAX; X++)
        {
            for(int Y = POW_MIN; Y<POW_MAX; Y++)
            {
                double sum =  Math.pow((double)A, (double)X)+Math.pow((double)B, (double)Y);
                double Z = (Math.log((double)sum)/Math.log((double)C));
                double epsillon = 10E-4f; // note: 10E-4 is 1e-3, not 1e-4
                if(isWithinRange(Z,epsillon))
                {   
                    String toPrint = ""+A+"^"+X+" + "+B+"^"+Y+" = "+C+"^"+Z+"\n";
                    pOStream.append(toPrint);
                    System.out.print(toPrint);
                }
            }
        }
    }
    private double gcd(double x, double y)
    {
        if (y==0) return x;
        return gcd(y,x%y);
    }

    private boolean isWithinRange(double z, double epsillon)
    {
        // True when z is an integer or lies at most epsillon below one
        // (ceil(z) - z <= epsillon). Values slightly above an integer
        // are not detected by this check.
        double ceil = Math.ceil(z);
        return (ceil - z) <= epsillon;
    }

}
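As a possible variation on the check above (not part of the posted code), the sketch below tests closeness to the nearest integer on both sides and then confirms a candidate exactly with `BigInteger`, since `double` arithmetic cannot represent values such as 1000^10 + 1000^10 exactly. The class name `NearIntegerCheck` and the methods `isNearInteger` and `verifyExact` are hypothetical names introduced for illustration. The example values 3^3 + 6^3 = 3^5 share a common factor, so they only exercise the arithmetic and are not a counterexample.

```java
import java.math.BigInteger;

public class NearIntegerCheck {
    // Two-sided test: true when z is within epsilon of the nearest integer.
    static boolean isNearInteger(double z, double epsilon) {
        return Math.abs(z - Math.rint(z)) <= epsilon;
    }

    // Exact confirmation of a^x + b^y == c^z with arbitrary-precision integers.
    static boolean verifyExact(long a, int x, long b, int y, long c, int z) {
        BigInteger lhs = BigInteger.valueOf(a).pow(x)
                .add(BigInteger.valueOf(b).pow(y));
        return lhs.equals(BigInteger.valueOf(c).pow(z));
    }

    public static void main(String[] args) {
        // 3^3 + 6^3 = 243 = 3^5, so the estimated exponent is close to 5.
        double z = Math.log(Math.pow(3, 3) + Math.pow(6, 3)) / Math.log(3);
        if (isNearInteger(z, 1e-8)) {
            int candidate = (int) Math.rint(z);
            // Floating point only proposes the candidate; BigInteger confirms it.
            System.out.println(candidate + " "
                    + verifyExact(3, 3, 6, 3, 3, candidate)); // prints "5 true"
        }
    }
}
```

With this split, the floating-point pass only needs to be a cheap filter, and the tolerance can be loose because false positives are rejected by the exact check.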
