5 Replies Latest reply on Sep 14, 2010 3:47 AM by himanshu.gautam

    ATI stream profiler vs. custom profiler problem

    crazybee
      ATI stream profiler vs. custom profiler problem

      Hello.

      I'm having some problems with the ATI stream profiler and my own custom profiler.

      The code I am working on is the Matrix Multiplication sample from the ATI stream SDK. I have inserted my own profiler into this code (one that uses clGetEventProfilingInfo() to retreive the start and end time of kernel execution, the ATI documentation says that the ATI stream profiler works in the same manner), and compared it's results with those of the ATI stream profiler. The results of this comparison are somewhat unexpected. As it turns out, when executing the code through the ATI stream profiler the results are almost the same (difference is at most 0,1 ms). However while executing the code normally in Release mode (not though the ATI profiler) the output of my profiler has much hihger values (e.g. 79 ms total executuion time of all kernels through the ATI stream profiler, and 375 ms total execution time in Release mode). 

      What can be the cause of this difference? Why is it so big?

      I've checked that both executions are carried out on the same device (GPU), the CL_DEVICE_PROFILING_TIMER_RESOLUTION parameter is also unchanged.

      The profiler code is given below.

      // HEADER #ifndef MY_PROFILER #define MY_PROFILER #include <cstdio> #include <string> #include <CL/cl.h> namespace profiler { class Profiler { public: Profiler(bool console = true, bool file = true); ~Profiler(); bool initialize(const char* filename); void deinitialize(); bool saveEventProfileInfo(cl_event event); bool writeOutput(std::string str); private: cl_double totalTime; FILE* file; bool consoleOutput; bool fileOutput; }; } #endif // .CPP FILE #include <iostream> #include <ctime> #include "Profiler.h" namespace profiler { using namespace std; Profiler::Profiler(bool console, bool file) :consoleOutput(console), fileOutput(file), file(NULL), totalTime(0) { } Profiler::~Profiler() { } bool Profiler::initialize(const char* filename) { if (fileOutput) { if (file) { fclose(file); file = NULL; } this->file = fopen(filename, "a"); if (!file) { cout << "Profiler::initialize Failed to open output file" << endl; return false; } time_t rawtime; struct tm * timeinfo; time ( &rawtime ); timeinfo = localtime ( &rawtime ); fprintf ( file, "Initialize at: %s", asctime (timeinfo) ); delete timeinfo; } totalTime = 0; return true; } void Profiler::deinitialize() { if (fileOutput) { if (file) { fprintf(file, "total execution time [miliSecs]:\t%f\n\n", totalTime); fclose(file); file = NULL; } } if (consoleOutput) { cout << "total execution time: " << totalTime << endl; } } bool Profiler::saveEventProfileInfo(cl_event event) { cl_int status; cl_ulong queuedNanoSecs; cl_ulong submitNanoSecs; cl_ulong startNanoSecs; cl_ulong endNanoSecs; status = clGetEventProfilingInfo( event, CL_PROFILING_COMMAND_QUEUED, sizeof(cl_ulong), &queuedNanoSecs, NULL); if(status != CL_SUCCESS) { cout << "Profiler::initialize Failed to get event queued time" << endl; return false; } status = clGetEventProfilingInfo( event, CL_PROFILING_COMMAND_SUBMIT, sizeof(cl_ulong), &submitNanoSecs, NULL); if(status != CL_SUCCESS) { cout << "Profiler::initialize Failed to get event submit time" << endl; return false; } status = clGetEventProfilingInfo( event, CL_PROFILING_COMMAND_START, sizeof(cl_ulong), &startNanoSecs, NULL); if(status != CL_SUCCESS) { cout << "Profiler::initialize Failed to get event start time" << endl; return false; } status = clGetEventProfilingInfo( event, CL_PROFILING_COMMAND_END, sizeof(cl_ulong), &endNanoSecs, NULL); if(status != CL_SUCCESS) { cout << "Profiler::initialize Failed to get event end time" << endl; return false; } cl_double submitTimeMilis = ((cl_double)(submitNanoSecs - queuedNanoSecs))/1000000.0; cl_double startTimeMilis = ((cl_double)(startNanoSecs - submitNanoSecs))/1000000.0; cl_double execTimeMilis = ((cl_double)(endNanoSecs - startNanoSecs))/1000000.0; if (consoleOutput) { //cout << "event start [nanoSecs]: " << startNanoSecs << endl; //cout << "event end [nanoSecs]: " << endNanoSecs << endl; cout << "queued -> submit [miliSecs]: " << submitTimeMilis << endl; cout << "submit -> start [miliSecs]: " << startTimeMilis << endl; cout << "event execTime (start -> end) [miliSecs]: " << execTimeMilis << endl; } if (fileOutput) { //fprintf(file, "event start [nanoSecs]:\t%u\n", startNanoSecs); //fprintf(file, "event end [nanoSecs]:\t%u\n", endNanoSecs); fprintf(file, "queued -> submit [miliSecs]:\t%f\n", submitTimeMilis); fprintf(file, "submit -> start [miliSecs]:\t%f\n", startTimeMilis); fprintf(file, "event execTime [miliSecs]:\t%f\n", execTimeMilis); } totalTime += execTimeMilis; return true; } bool Profiler::writeOutput(string str) { if (fileOutput) { if (file) { fprintf(file, str.c_str()); } else { cout << "Profiler::writeOutput Failed to write to file" << endl; return false; } } if (consoleOutput) { cout << str; } return true; } }