Cannot find 64bit length function at link time on the CPU

Discussion created by toastedcrumpets on May 27, 2010
Latest reply on Jun 1, 2010 by omkaranathan

Hello again,

I've got a kernel that compiles fine, but when I actually try to run it I get an error:

symbol lookup error: : /tmp/

undefined symbol: __length_4f64

I'm running this with the Stream SDK 2.1, on the CPU, with both of the pragmas defined:

#pragma OPENCL EXTENSION cl_khr_fp64 : enable

#pragma OPENCL EXTENSION cl_amd_fp64 : enable
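Before relying on either pragma, it may be worth confirming that the device actually advertises the extension. Here is a minimal sketch of such a check. The helper name has_extension is hypothetical (it is not part of the OpenCL bindings), and it assumes the extension list is the space-separated string that CL_DEVICE_EXTENSIONS returns.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: returns true if `name` appears as a whole token in the
// space-separated extension list `extensions` (the format used by
// CL_DEVICE_EXTENSIONS). Whole-token matching avoids false positives from
// names that merely contain `name` as a substring.
bool has_extension(const std::string& extensions, const std::string& name)
{
  assert(!name.empty());
  std::istringstream tokens(extensions);
  std::string token;
  while (tokens >> token)
    if (token == name) return true;
  return false;
}
```

With the C++ bindings, the list would come from something like devices[0].getInfo<CL_DEVICE_EXTENSIONS>(), so the check would read has_extension(devices[0].getInfo<CL_DEVICE_EXTENSIONS>(), "cl_amd_fp64"). This only shows whether the extension is reported; it says nothing about whether every double-precision built-in resolves at link time.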

Is the function not supported yet? From knowledge base article KB88 I understood that it is supported on the CPU:


The following set of double-precision floating point functionality is supported in OpenCL™ C kernels for x86 CPUs:

The following geometric functions: min, max, clamp, degrees, radians, step, smoothstep, sign, dot, length.

Edit: added a minimal code example.


#include <iostream>
#include <string>
#include <utility>
#include <vector>

//The OpenCL C++ bindings, with exceptions
#define __CL_ENABLE_EXCEPTIONS
#include <CL/cl.hpp>

std::string KernelSrc =
  "#pragma OPENCL EXTENSION cl_khr_fp64 : enable\n"
  "#pragma OPENCL EXTENSION cl_amd_fp64 : enable\n"
  "__kernel void BadKernel(__global double4* Input, __global double* Output)\n"
  "{ Output[get_global_id(0)] = length(Input[get_global_id(0)]); }";

int main(int argc, char * argv[])
{
  try {
    //Pick the first CPU device on the first platform
    std::vector<cl::Platform> platforms;
    cl::Platform::get(&platforms);

    std::vector<cl::Device> devices;
    platforms[0].getDevices(CL_DEVICE_TYPE_CPU, &devices);
    devices.resize(1);

    std::cout << "Using device " << devices[0].getInfo<CL_DEVICE_NAME>() << std::endl;

    cl::Context context(devices);

    cl::Program::Sources source(1, std::pair<const char *, size_t>(KernelSrc.c_str(), KernelSrc.size()));
    cl::Program program(context, source);

    //Build the program, printing the build log on failure
    try {
      program.build(devices);
    } catch (cl::Error& err) {
      std::cerr << "Building failed, " << err.what() << "(" << err.err() << ")\n"
                << program.getBuildInfo<CL_PROGRAM_BUILD_LOG>(devices[0]) << "\n";
      return -1;
    }

    cl::CommandQueue CmdQ(context, devices[0]);

    const size_t WrkGrpSize = 64;
    const size_t N = WrkGrpSize * 10;

    //Fill every input vector with (0, 1, 2, 3)
    cl::Buffer InputBuffer(context, CL_MEM_ALLOC_HOST_PTR, sizeof(cl_double4) * N);
    cl_double4* Input = (cl_double4*)CmdQ.enqueueMapBuffer(InputBuffer, true, CL_MAP_WRITE, 0, N * sizeof(cl_double4));

    for (size_t index(0); index < N; ++index)
      for (size_t dim(0); dim < 4; ++dim)
        Input[index].s[dim] = dim;

    CmdQ.enqueueUnmapMemObject(InputBuffer, (void*)Input);

    cl::Buffer OutputBuffer(context, CL_MEM_ALLOC_HOST_PTR, sizeof(cl_double) * N);

    cl::Kernel BadKernel = cl::Kernel(program, "BadKernel");
    cl::KernelFunctor BadKernelFunc(BadKernel.bind(CmdQ, cl::NDRange(N), cl::NDRange(WrkGrpSize)));

    /*************** Launch the kernel ****************/
    //This causes a linker error
    BadKernelFunc(InputBuffer, OutputBuffer).wait();

    std::cout << "Finished without error!\n";
  } catch (cl::Error& err) {
    std::cerr << "An OpenCL error occurred, " << err.what()
              << "\nError num of " << err.err() << "\n";
    return -1;
  }
}
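For reference, here is a host-side sketch of what the kernel should produce once it links, so the output buffer can be checked. Every input vector in the example is (0, 1, 2, 3), so each output element should be sqrt(14). The function names reference_length4 and max_abs_error are hypothetical helpers, not part of any SDK.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Host-side reference for length() on a 4-component double vector:
// sqrt(x*x + y*y + z*z + w*w), which is the same quantity as sqrt(dot(v, v)).
double reference_length4(const double v[4])
{
  double sum = 0.0;
  for (std::size_t dim = 0; dim < 4; ++dim)
    sum += v[dim] * v[dim];
  return std::sqrt(sum);
}

// Largest absolute deviation of the kernel's output from the expected value.
double max_abs_error(const std::vector<double>& output, double expected)
{
  assert(!output.empty());
  double worst = 0.0;
  for (std::size_t i = 0; i < output.size(); ++i)
    worst = std::max(worst, std::fabs(output[i] - expected));
  return worst;
}
```

Since length(v) equals sqrt(dot(v, v)), writing the kernel body with sqrt and dot instead of length might sidestep the missing symbol. That is an untested guess: it assumes the double-precision sqrt and dot built-ins do resolve on this SDK.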