AMD OpenCL - compiler segmentation fault

Question asked by sk7041 on Feb 8, 2013
Latest reply on Mar 26, 2013 by sk7041

Hello All,

 

I have recently been testing my OpenCL code on an AMD HD7970 GPU, and some of my kernels crash the compiler with a segmentation fault inside clBuildProgram(). The same kernels compile and run fine on NVIDIA devices, and on the CPU with the AMD SDK.
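
Because the crash happens inside clBuildProgram() itself, it takes the whole host application down with it. One possible stopgap for finding out which kernels trigger it is to attempt each build in a child process and inspect how that child terminated. The following is only a minimal sketch, assuming a POSIX system. The names run_isolated, build_fn, build_ok and build_crashes are hypothetical, and the two stand-in callbacks do not call OpenCL at all: build_crashes simply raises SIGSEGV to imitate the compiler crash.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical callback type: one build attempt, returning 0 on success. */
typedef int (*build_fn)(void);

/*
 * Run fn in a forked child so that a crash inside it cannot kill the caller.
 * Returns the child's exit status (0..255) if it exited normally,
 * the negated signal number (for example -SIGSEGV) if a signal killed it,
 * or -1000 if fork() or waitpid() failed.
 */
int run_isolated(build_fn fn)
{
    fflush(NULL);                 /* do not duplicate buffered stdio output */

    pid_t pid = fork();
    if (pid < 0)
        return -1000;
    if (pid == 0)
        _exit(fn() & 0xff);       /* child: run the build, report its result */

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1000;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return -WTERMSIG(status);
    return -1000;
}

/* Stand-in for a build that succeeds (assumption: no OpenCL involved). */
int build_ok(void)
{
    return 0;
}

/* Stand-in for a build that crashes the way the compiler does here. */
int build_crashes(void)
{
    raise(SIGSEGV);
    return 1;                     /* not reached with the default handler */
}
```

In a real program the callback would have to create its own context and program and call clBuildProgram() inside the child, and the fork would probably need to happen before the parent touches the OpenCL runtime, since forking a process that has already loaded the GPU driver may not be safe. This only identifies the offending kernels; it does not make them compile.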

 

Information about my system:

 

Description:  Debian GNU/Linux 6.0.1 (squeeze)

Arch: x86_64

CPU: AMD Athlon(tm) 64 X2 Dual Core Processor 4200+

GPU: AMD HD7970

AMD OpenCL SDK: 2.8

Driver: AMD Catalyst proprietary driver 12.10

 

Here are the backtraces given by GDB for the two kernels that produce the segmentation fault. Each kernel is about 400 lines long, contains many nested 'for' loops, and requires a fairly large amount of private memory.

 

Backtrace for kernel 1:

 

Program received signal SIGSEGV, Segmentation fault.

0x00007ffff49a1b88 in SCRegSpill::CreateReload(SCInst*, int, SCInst*, SCBlock*, bitset*, bitset*, int) () from /usr/lib/libamdocl64.so

(gdb) bt

#0  0x00007ffff49a1b88 in SCRegSpill::CreateReload(SCInst*, int, SCInst*, SCBlock*, bitset*, bitset*, int) () from /usr/lib/libamdocl64.so

#1  0x00007ffff49b2533 in SCRegSpill::Spill() () from /usr/lib/libamdocl64.so

#2  0x00007ffff49b5160 in SCRegAlloc::Allocate(bool) () from /usr/lib/libamdocl64.so

#3  0x00007ffff49b54af in SCRegAlloc::AllocateRegisters() () from /usr/lib/libamdocl64.so

#4  0x00007ffff45b0b5f in CompilerBase::GenerateCodeUsingNewIR(void*, bool) () from /usr/lib/libamdocl64.so

#5  0x00007ffff45b6764 in Compiler::Compile(ILProgram*) () from /usr/lib/libamdocl64.so

#6  0x00007ffff45b6ee0 in Compiler::CompileShader(unsigned char*, unsigned char*, unsigned int const*, CompilerExternal*) ()

   from /usr/lib/libamdocl64.so

#7  0x00007ffff45b3227 in CompilerExternal::CompileShader(_SC_SRCSHADER const*, _SC_HWSHADER*) () from /usr/lib/libamdocl64.so

#8  0x00007ffff49cffc2 in scWrapCompileBinarySI(void*, unsigned int, void**, unsigned int*, unsigned int, unsigned int, scWrapOptionEnum*)

    () from /usr/lib/libamdocl64.so

#9  0x00007ffff458df6b in amuCompCompile () from /usr/lib/libamdocl64.so

#10 0x00007ffff458ecee in ddiCompile () from /usr/lib/libamdocl64.so

#11 0x00007ffff44cb91e in gpu::NullKernel::create(stlp_std::basic_string<char, stlp_std::char_traits<char>, stlp_std::allocator<char> > const&, stlp_std::basic_string<char, stlp_std::char_traits<char>, stlp_std::allocator<char> > const&, void const*, unsigned long) ()

   from /usr/lib/libamdocl64.so

#12 0x00007ffff44d05d3 in gpu::Kernel::create(stlp_std::basic_string<char, stlp_std::char_traits<char>, stlp_std::allocator<char> > const&, stlp_std::basic_string<char, stlp_std::char_traits<char>, stlp_std::allocator<char> > const&, void const*, unsigned long) ()

   from /usr/lib/libamdocl64.so

#13 0x00007ffff44df058 in gpu::Program::createKernel(stlp_std::basic_string<char, stlp_std::char_traits<char>, stlp_std::allocator<char> > const&, gpu::Kernel::InitData const*, stlp_std::basic_string<char, stlp_std::char_traits<char>, stlp_std::allocator<char> > const&, stlp_std::basic_string<char, stlp_std::char_traits<char>, stlp_std::allocator<char> > const&, bool*, void const*, unsigned long) ()

   from /usr/lib/libamdocl64.so

#14 0x00007ffff44de2ca in gpu::NullProgram::linkImpl(amd::option::Options*) () from /usr/lib/libamdocl64.so

#15 0x00007ffff4479055 in device::Program::build(stlp_std::basic_string<char, stlp_std::char_traits<char>, stlp_std::allocator<char> > const&, char const*, amd::option::Options*) () from /usr/lib/libamdocl64.so

#16 0x00007ffff4489030 in amd::Program::build(stlp_std::vector<amd::Device*, stlp_std::allocator<amd::Device*> > const&, char const*, void (*)(_cl_program*, void*), void*, bool) () from /usr/lib/libamdocl64.so

#17 0x00007ffff4466ff3 in clBuildProgram () from /usr/lib/libamdocl64.so

 

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

 

Backtrace for kernel 2:

 

Program received signal SIGSEGV, Segmentation fault.

0x00007ffff4936199 in SC_SCCGCM::GetEarly(SCInst*) () from /usr/lib/libamdocl64.so

(gdb) bt

#0  0x00007ffff4936199 in SC_SCCGCM::GetEarly(SCInst*) () from /usr/lib/libamdocl64.so

#1  0x00007ffff49363a4 in SC_SCCGCM::ComputeEarlyPosition(SCInst*, FuncRegion*) () from /usr/lib/libamdocl64.so

#2  0x00007ffff49c5848 in SC_SCCGVN::GVNSCCInst(SCInst*, SC_SCCVN*) () from /usr/lib/libamdocl64.so

#3  0x00007ffff49c7704 in SCCVNBase<SCInst, SC_CurrentValue>::VNSCCInst(SCInst*) () from /usr/lib/libamdocl64.so

#4  0x00007ffff49c6ff5 in SC_SCCBLK::VNSCCItem(int) () from /usr/lib/libamdocl64.so

#5  0x00007ffff49c7a97 in void SCCVNBase<SCInst, SC_CurrentValue>::ProcessSCC<SC_SCCBLK>(SC_SCCBLK*, int) () from /usr/lib/libamdocl64.so

#6  0x00007ffff4938a5f in SCC_BASE<SCBlock>::SCC(SCBlock*) () from /usr/lib/libamdocl64.so

#7  0x00007ffff49c69ad in SC_SCCBLK::Traversal() () from /usr/lib/libamdocl64.so

#8  0x00007ffff45b07d3 in CompilerBase::GenerateCodeUsingNewIR(void*, bool) () from /usr/lib/libamdocl64.so

#9  0x00007ffff45b6764 in Compiler::Compile(ILProgram*) () from /usr/lib/libamdocl64.so

#10 0x00007ffff45b6ee0 in Compiler::CompileShader(unsigned char*, unsigned char*, unsigned int const*, CompilerExternal*) ()

   from /usr/lib/libamdocl64.so

#11 0x00007ffff45b3227 in CompilerExternal::CompileShader(_SC_SRCSHADER const*, _SC_HWSHADER*) () from /usr/lib/libamdocl64.so

#12 0x00007ffff49cffc2 in scWrapCompileBinarySI(void*, unsigned int, void**, unsigned int*, unsigned int, unsigned int, scWrapOptionEnum*)

    () from /usr/lib/libamdocl64.so

#13 0x00007ffff458df6b in amuCompCompile () from /usr/lib/libamdocl64.so

#14 0x00007ffff458ecee in ddiCompile () from /usr/lib/libamdocl64.so

#15 0x00007ffff44cb91e in gpu::NullKernel::create(stlp_std::basic_string<char, stlp_std::char_traits<char>, stlp_std::allocator<char> > const&, stlp_std::basic_string<char, stlp_std::char_traits<char>, stlp_std::allocator<char> > const&, void const*, unsigned long) ()

   from /usr/lib/libamdocl64.so

#16 0x00007ffff44d05d3 in gpu::Kernel::create(stlp_std::basic_string<char, stlp_std::char_traits<char>, stlp_std::allocator<char> > const&, stlp_std::basic_string<char, stlp_std::char_traits<char>, stlp_std::allocator<char> > const&, void const*, unsigned long) ()

   from /usr/lib/libamdocl64.so

#17 0x00007ffff44df058 in gpu::Program::createKernel(stlp_std::basic_string<char, stlp_std::char_traits<char>, stlp_std::allocator<char> > const&, gpu::Kernel::InitData const*, stlp_std::basic_string<char, stlp_std::char_traits<char>, stlp_std::allocator<char> > const&, stlp_std::basic_string<char, stlp_std::char_traits<char>, stlp_std::allocator<char> > const&, bool*, void const*, unsigned long) ()

   from /usr/lib/libamdocl64.so

#18 0x00007ffff44de2ca in gpu::NullProgram::linkImpl(amd::option::Options*) () from /usr/lib/libamdocl64.so

#19 0x00007ffff4479055 in device::Program::build(stlp_std::basic_string<char, stlp_std::char_traits<char>, stlp_std::allocator<char> > const&, char const*, amd::option::Options*) () from /usr/lib/libamdocl64.so

#20 0x00007ffff4489030 in amd::Program::build(stlp_std::vector<amd::Device*, stlp_std::allocator<amd::Device*> > const&, char const*, void (*)(_cl_program*, void*), void*, bool) () from /usr/lib/libamdocl64.so

#21 0x00007ffff4466ff3 in clBuildProgram () from /usr/lib/libamdocl64.so

 

The kernel code is proprietary, so I cannot post it on this forum, but I am willing to send it to the AMD compiler dev team if need be. Please get in touch if you would like me to do so.

 

Regards,

 

Simon
