
vladimir_1
Adept II

HSA-HLC stable and enqueue_kernel

I have two questions:

     HSAIL-HLC-Stable generates

kernarg_u64 %__vqueue_pointer,
kernarg_u64 %__aqlwrap_pointer,

which I guess are used to pass queue information to the kernel. Is there any documentation or a sample for them?

jedwards
Staff

The target for the updated compiler is the end of May, but it could occur sooner.


Look at this sample to see how to enqueue a kernel in the HSA runtime: CLOC/examples/hsa/vector_copy at master · HSAFoundation/CLOC · GitHub

Hi,

Thanks for the ETA.

As for the sample, unfortunately it has two problems. First, it still uses the provisional API.

Second, it does not demonstrate an enqueue_kernel call inside the kernel, and it does not explain how vqueue_pointer and aqlwrap_pointer in the kernel parameters should be filled in; it simply zeroes them out:

#ifdef DUMMY_ARGS
  // This flag should be set if HSA_HLC_Stable is used,
  // because the high-level compiler generates 6 extra args.
  kernel_arg_start_offset += sizeof(uint64_t) * 6;
  printf("Using dummy args \n");
#endif
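
To make the snippet above concrete, here is a minimal sketch of what "zeroing out" the extra arguments amounts to on the host. The count of 6 comes from the sample's own comment; the function name `fill_kernarg` and its parameters are hypothetical and are not part of CLOC or the HSA runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Number of extra 64-bit arguments, as stated in the sample's comment. */
enum { HIDDEN_ARG_COUNT = 6 };

/* Hypothetical helper: zero the hidden argument slots at the start of the
   kernarg buffer and copy the user arguments right after them.
   Returns the byte offset at which the user arguments start. */
size_t fill_kernarg(void *kernarg, size_t kernarg_size,
                    const void *user_args, size_t user_args_size)
{
    const size_t start = sizeof(uint64_t) * HIDDEN_ARG_COUNT;

    /* The buffer must hold the hidden slots plus the user arguments. */
    assert(kernarg_size >= start + user_args_size);

    memset(kernarg, 0, start);
    memcpy((unsigned char *)kernarg + start, user_args, user_args_size);
    return start;
}
```

This only reproduces the dummy-argument behaviour; it says nothing about what real values the hidden slots expect.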

I understand now. You wanted a sample that shows how to actually perform an agent dispatch. We currently do not have an updated sample that shows how to do that, although we have tested it. The test cannot be published until it goes through our legal process.

vladimir_1
Adept II

Hmm, but is it at least possible to post what is supposed to go into kernarg_u64 %__vqueue_pointer and kernarg_u64 %__aqlwrap_pointer?


Sure. In our test case we pass both the queue pointer and a completion signal to the kernel. We pass the completion signal because kernels don't currently have the ability to create signals. The host-side structure used to 'pack' the kernel arguments looks like this:

.

struct dispatch_parms {
    hsa_queue_t* queue;
    hsa_signal_t signal;
};
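
As an illustration of the packing described above, the following sketch copies such a structure into a kernarg buffer and checks the layout the kernel would rely on. It uses stand-in types so that it compiles without hsa.h: `mock_queue_t`, `mock_signal_t`, and `pack_dispatch_parms` are hypothetical names, and the 8-byte offsets assume a 64-bit host (the large profile).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for hsa_queue_t and hsa_signal_t.
   Assumption: a queue pointer and a signal handle are both 64 bits wide. */
typedef struct mock_queue mock_queue_t;
typedef struct { uint64_t handle; } mock_signal_t;

struct dispatch_parms {
    mock_queue_t *queue;   /* read by the kernel as %queue, offset 0 */
    mock_signal_t signal;  /* read by the kernel as %signal, offset 8 */
};

/* The kernel loads two consecutive 64-bit kernargs, so the host layout
   has to match. These checks fail to compile on a 32-bit host. */
static_assert(offsetof(struct dispatch_parms, queue) == 0, "queue at offset 0");
static_assert(offsetof(struct dispatch_parms, signal) == 8, "signal at offset 8");
static_assert(sizeof(struct dispatch_parms) == 16, "two 64-bit arguments");

/* Hypothetical helper: copy the packed arguments into a kernarg buffer.
   Returns the number of bytes written. */
size_t pack_dispatch_parms(void *kernarg, mock_queue_t *queue,
                           mock_signal_t signal)
{
    struct dispatch_parms parms;
    parms.queue = queue;
    parms.signal = signal;
    memcpy(kernarg, &parms, sizeof parms);
    return sizeof parms;
}
```

With the real runtime, the queue pointer and the signal would come from the HSA API rather than from these mocks.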

.

A kernel using the large profile would have the following signature:

prog kernel &__agent_dispatch_kernel(kernarg_u64 %queue, kernarg_sig64 %signal) {
@__agent_dispatch_kernel_entry:
    // Load the queue pointer
    ld_kernarg_align(8)_width(all)_u64  $d0, [%queue];

    // Load the signal handle from the kernarg segment
    ld_kernarg_align(8)_width(all)_sig64  $d1, [%signal];

    // Increment the queue's write index by the amount specified in
    // the $d2 register. Store the original write index value back into
    // the $d2 register.
    addqueuewriteindex_global_scar_u64 $d2, $d0, $d2;

    // ...
    // Do other things to the queue: see section 11.3 of the HSAIL programming guide
    // ...

    // Wait until the signal's ($d1) value is equal to the value in $d2.
    // Store the returned value back into $d2.
    signal_wait_eq_rlx_s64_sig64 $d2, $d1, $d2;

    // ...
    // Do other things to the signal: see section 6.8 of the HSAIL programming guide
    // ...

    ret;
}
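
For readers more used to host code, the write-index step in the kernel above behaves like an atomic fetch-and-add: it returns the old index and advances the queue by the requested amount. Below is a rough C11 analogue under stated assumptions. `mock_aql_queue_t`, `reserve_packets`, and `packet_slot` are hypothetical names, and mapping the `scar` ordering to `memory_order_seq_cst` is an approximation, not a statement about the HSAIL memory model.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical mock holding only the two queue fields this sketch needs. */
typedef struct {
    _Atomic uint64_t write_index; /* monotonically increasing packet id */
    uint32_t size;                /* number of packet slots, a power of two */
} mock_aql_queue_t;

/* Reserve `count` consecutive packet ids and return the first one,
   mirroring "store the original write index value back into $d2". */
uint64_t reserve_packets(mock_aql_queue_t *q, uint64_t count)
{
    return atomic_fetch_add_explicit(&q->write_index, count,
                                     memory_order_seq_cst);
}

/* Map a packet id to its slot in the ring buffer (id modulo size). */
uint32_t packet_slot(const mock_aql_queue_t *q, uint64_t packet_id)
{
    /* The mask below is only valid for a non-zero power-of-two size. */
    assert(q->size != 0 && (q->size & (q->size - 1)) == 0);
    return (uint32_t)(packet_id & (q->size - 1));
}
```

A caller would reserve an id, compute its slot, fill in the packet there, and only then make it visible to the packet processor; those later steps are outside this sketch.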
