
Tasp
Journeyman III

Stripped kernel leads to std::out_of_range exception

Hello,

I've got an ATI RV770 and following these instructions:

http://developer.amd.com/support/KnowledgeBase/Lists/KnowledgeBase/DispForm.aspx?ID=115

leads to

terminate called after throwing an instance of 'std::out_of_range'
  what():  basic_string::substr
Aborted

Unstripped kernel works, machine code is: 0x3ee (GPU), 0x7d2 (CPU)

Stripped kernel for CPU works too.

*edit*

If I remove the '-R .amdil' option it works, so I guess this section is actually important?!

Yet the article says: "The other sections (.source, .llvmir, .amdil) are not used during normal application execution and it is recommended that you remove them before distributing your application."
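For context on the error message: `basic_string::substr` throws `std::out_of_range` when the start position is greater than the string's length. One plausible but unconfirmed way a runtime could hit this is by searching for text that used to live in a stripped section, getting `std::string::npos` back, and passing it straight to `substr`. The sketch below only illustrates that failure mode; `extract_after`, `throws_out_of_range` and the marker string are hypothetical and are not taken from the AMD runtime.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns the text starting at `marker`.
// It does not check the result of find(), so a missing marker yields npos.
std::string extract_after(const std::string& text, const std::string& marker) {
    std::string::size_type pos = text.find(marker);
    return text.substr(pos);  // throws std::out_of_range if pos > text.size()
}

// Returns true if extract_after() throws std::out_of_range for this input.
bool throws_out_of_range(const std::string& text, const std::string& marker) {
    try {
        extract_after(text, marker);
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

Calling `throws_out_of_range("", "BEGIN_IL")` models a section whose contents are gone: the marker is not found, `find` returns `npos`, and `substr` throws with the same `what()` text family seen above.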


genaganna
Journeyman III


When exactly do you get that error? Is it solved? If not, please give the exact steps to reproduce the problem.


More details:

ATI Stream SDK 2.2 with current Catalyst driver
HD4850 (RV770)
C++ bindings

I generate binary kernels as explained in the article. Only difference is I don't use CL_CONTEXT_OFFLINE_DEVICES_AMD because I only want the kernel for my device.

Then I strip the binary kernel:
objcopy -I elf32-i386 -O elf32-i386 -R .source -R .llvmir -R .amdil --alt-machine-code= kernel.bin stripped_kernel.bin

The kernel I used for testing is an empty kernel (I simplified the kernel when looking for the error)

__kernel void foo() { }


GPU:
When I call enqueueNDRangeKernel I get the exception 'std::out_of_range'.

If I remove the '-R .amdil' option when stripping the binary, enqueueNDRangeKernel works without problems.

CPU:
It works with or without the '-R .amdil' option.
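When comparing the working and failing binaries it can help to confirm which sections objcopy actually removed. The following is a minimal sketch, assuming the kernel binary is a little-endian ELF32 image (as the `elf32-i386` target in the command above suggests). The helper names `read_file`, `elf32_section_names` and `has_section` are made up for illustration and are not part of any SDK.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

using Bytes = std::vector<unsigned char>;

// Read a whole file into memory (for example stripped_kernel.bin).
Bytes read_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + path);
    return Bytes(std::istreambuf_iterator<char>(in), std::istreambuf_iterator<char>());
}

// Read an n-byte little-endian unsigned value (n is 2 or 4) at offset off.
static std::uint32_t rd(const Bytes& b, std::size_t off, std::size_t n) {
    if (off + n > b.size()) throw std::runtime_error("truncated ELF image");
    std::uint32_t v = 0;
    for (std::size_t i = 0; i < n; ++i) v |= std::uint32_t(b[off + i]) << (8 * i);
    return v;
}

// Return the names of all sections in a little-endian ELF32 image.
std::vector<std::string> elf32_section_names(const Bytes& b) {
    if (b.size() < 52 || b[0] != 0x7f || b[1] != 'E' || b[2] != 'L' ||
        b[3] != 'F' || b[4] != 1 || b[5] != 1)
        throw std::runtime_error("not a little-endian ELF32 image");
    const std::size_t shoff = rd(b, 32, 4);      // e_shoff
    const std::size_t shentsize = rd(b, 46, 2);  // e_shentsize
    const std::size_t shnum = rd(b, 48, 2);      // e_shnum
    const std::size_t shstrndx = rd(b, 50, 2);   // e_shstrndx
    if (shstrndx >= shnum) throw std::runtime_error("no section name table");

    // Locate the section-name string table.
    const std::size_t strhdr = shoff + shstrndx * shentsize;
    const std::size_t str_off = rd(b, strhdr + 16, 4);   // sh_offset
    const std::size_t str_size = rd(b, strhdr + 20, 4);  // sh_size
    if (str_off + str_size > b.size()) throw std::runtime_error("bad string table");

    std::vector<std::string> names;
    for (std::size_t i = 0; i < shnum; ++i) {
        const std::size_t name_off = rd(b, shoff + i * shentsize, 4);  // sh_name
        if (name_off >= str_size) throw std::runtime_error("bad section name");
        std::string name;
        for (std::size_t p = str_off + name_off; p < str_off + str_size && b[p] != 0; ++p)
            name.push_back(static_cast<char>(b[p]));
        names.push_back(name);
    }
    return names;
}

// True if the image contains a section with exactly this name.
bool has_section(const Bytes& image, const std::string& name) {
    const std::vector<std::string> names = elf32_section_names(image);
    return std::find(names.begin(), names.end(), name) != names.end();
}
```

For example, `has_section(read_file("stripped_kernel.bin"), ".amdil")` should be false after stripping with `-R .amdil` and true when that option is left out. The same information is available from standard binutils tools; this sketch just keeps the check inside the host program.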


I have the same problem. Mobility Radeon 4850, 0x3ee. The only difference is that the exception occurs when clEnqueueReadBuffer is called, not enqueueNDRangeKernel.


when i started to use binary kernels i had exactly the same problem. i also found out that when not using -R .amdil it still works.

but then i made some changes to my kernel code and the problem disappeared. i think it was after adding the __attribute__((reqd_work_group_size (64, 1, 1))) line to all my __kernel functions.

this is an adequate workaround for me.



Could you please paste your kernel code and the local work group size you are passing to clEnqueueNDRangeKernel here?


the kernel code is way too big and it's closed source. however, this is how i run clEnqueueNDRangeKernel():


const size_t global_work_size[3] = { 64 * max_compute_units, 1, 1 };
const size_t local_work_size[3]  = { 64, 1, 1 };

...

err = clEnqueueNDRangeKernel (command_queue, kernel, 1, NULL, global_work_size, local_work_size, 0, NULL, NULL);

...

__kernel void __attribute__((reqd_work_group_size (64, 1, 1))) func1 (...)
{
...
}
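Since the workaround ties the kernel to a fixed work-group size, the host has to pass a matching local size. In OpenCL 1.x, clEnqueueNDRangeKernel reports CL_INVALID_WORK_GROUP_SIZE if the global size is not a multiple of the local size, or if the local size differs from the kernel's reqd_work_group_size. The sketch below checks those two conditions on the host before enqueueing; `launch_sizes_ok` is a hypothetical helper, not an OpenCL call, and it only compares the first `dims` dimensions.

```cpp
#include <cassert>
#include <cstddef>

// Returns true when the launch geometry is consistent:
//  - every local size is non-zero,
//  - every global size is a multiple of the matching local size,
//  - the local size equals reqd (the kernel's reqd_work_group_size),
//    unless reqd is nullptr, meaning the kernel declares no such attribute.
bool launch_sizes_ok(unsigned dims,
                     const std::size_t* global,
                     const std::size_t* local,
                     const std::size_t* reqd) {
    for (unsigned d = 0; d < dims; ++d) {
        if (local[d] == 0) return false;
        if (global[d] % local[d] != 0) return false;
        if (reqd != nullptr && local[d] != reqd[d]) return false;
    }
    return true;
}
```

With the sizes from the post above (global `64 * max_compute_units`, local 64, required 64) the check passes for any non-zero number of compute units, because the global size is built as a multiple of 64.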
