AnsweredAssumed Answered

Hamming Distances for 100'000*100'1000 16 bit vectors

Question asked by mercator on Apr 3, 2015
Latest reply on Apr 6, 2015 by mercator

Good morning,

I am a total newbiew to this forum and admittedly to OpenCL. Nevertheless, I have a little challenge to solve: I have roughly 100'000 bit-vectors (each 16 bytes - actually it is 128 byte 0/1 ASCII at the moment but converting that is not an issue) and I would like to calculate the Hamming Distance between all of them. Actually, it would already be (more than) sufficient to write a program that takes a single 128 bit vector and caculates the hamming distance between this vector and and individual 100'000 128 bit vectors, return 100'000 integers containing the distances.

Note the Hamming Distance between two vectors is the number of differing bits, e.g.. if you have

v1:                  00000000001000001111

v2:                  00000000000000001111

the Hamming Distance would be 1 as one bit differs. The differing bits are actually easy to calculate: v1 XOR v2, i.e. in the above example

v1 XOR v2:     00000000001000000000

Roughly speaking, that would make it roughly 5'000'0000'0000 comparisons. I think that cannot be done with reasonable effort in plain C. When I look at the speed of the Mandelbrot programs when using OpenCL/GL, I was wondering if it were difficult to implement in Open CL and could anyone give me a rough indication who long would take someone being "fluent" in C/C++ to get into OpenCL?

Thanks a lot for any support and have some great Easter days (off),