While studying Andrew Ng's machine learning course, I implemented a prediction application.
It supports multithreading and takes a lot of parameters. Since I have a Sapphire Toxic R9 280X, I have also implemented an OpenCL version here:
I copied most of the code from the AMD examples and created the kernel source from the multithreaded version of the application. I wouldn't say it is perfectly coded, as it is one of my first C applications, but it runs faster than the CPU version for small datasets (such as an input with 400 columns and 5000 rows). I would be glad if you could share your feedback so I can improve it. I have committed some validation, exception handling, and feature scaling support for the CPU version, but first I wanted to see your comments on whether it is worth adding the same support to the OpenCL version.
Just wanted to let you know you aren't being ignored. I'm leaving this one for the community to answer. Someone might jump in and provide guidance.