Issue 211206.2: Stack piece operators

Author: Markus Metzger
Champion: Markus Metzger
Date submitted: 2021-12-06
Date revised: 2023-12-06
Date closed:
Type: Enhancement
Status: Open
DWARF Version: 6

Section 2.6.12, pg 42

Scalar variables in implicitly vectorized code may be widened by the vectorization factor. Consider this vectorized OpenMP loop:

void vec_add (int dst[], int src[], int len) {
  #pragma omp simd
  for (int i = 0; i < len; ++i) {
      int tmp = dst[i] + src[i];
      dst[i] = tmp;
  }
}

Assuming a vectorization factor of 8, 8 array elements are being processed together. The local variable tmp is widened to a vector of 8 integers.

To map this vectorized code back to the scalar source code, we'd use DW_OP_push_lane in order to determine which element of the 8-wide tmp vector to describe.

For a memory location, we can simply add that value, scaled to the element size, to the address of the tmp object described by the memory location description to get a memory location description for that element.

For a register location, we would need to reference a piece of the register. The existing DW_OP_bit_piece operator requires the offset and size to be passed as arguments. That doesn't work in our case since we need to compute the offset.

In the general case, the object may be split between memory and registers or stored in a non-contiguous fashion.

To be able to desribe such locations, we propose new operators

DW_OP_piece_stack
DW_OP_bit_piece_stack
DW_OP_bit_piece_stack_offset

that extend the existing family of piece operators by variants that compute some of their operands.

Proposed Changes

Section 2.6.1.2, pg. 42.

Add

  1. DW_OP_piece_stack

The DW_OP_piece_stack operation works similar to DW_OP_piece except that it computes the size using a DWARF expression.

The DW_OP_piece_stack operation takes two operands. The first operand is a ULEB128 number that gives the size of the second operand in bytes. The second operand is a DWARF expression, which computes the piece size. The operation evaluates the DWARF expression on an empty DWARF stack, then pops the topmost entry off the stack and interprets it as an unsigned integer, which describes the size in bytes of the piece of the object referenced by the preceding simple location description. If the piece is located in a register, but does not occupy the entire register, the placement of the piece within that register is defined by the ABI.

  1. DW_OP_bit_piece_stack The DW_OP_bit_piece_stack operation works similar to DW_OP_bit_piece except that it computes the size and offset using a DWARF expression.

The DW_OP_bit_piece_stack operation takes two operands. The first operand is a ULEB128 number that gives the size of the second operand in bytes. The second operand is a DWARF expression, which computes the piece offset and size. The operation evaluates the DWARF expression on an empty DWARF stack, pops the topmost entry off the stack and interprets it as unsigned integer giving the size in bits of the piece. It then pops the next entry off the stack and interprets it as unsigned integer giving the offset in bits from the location defined by the preceding simple location description.

  1. DW_OP_bit_piece_stack_offset The DW_OP_bit_piece_stack_offset operation works similar to DW_OP_bit_piece except that it computes the offset using a DWARF expression.

The DW_OP_bit_piece_stack_offset operation takes three operands. The first operand is a ULEB128 number, which describes the size in bits of the piece. The second operand is a ULEB128 number that gives the size of the third operand in bytes. The third operand is a DWARF expression, which computes the piece offset. The operation evaluates the DWARF expression on an empty DWARF stack, then pops the topmost entry off the stack and interprets it as unsigned integer giving the offset in bits from the location defined by the preceding simple location description.

Section 7.7, p.223.

Add

DW_OP_piece_stack | TBD | 2 | ULEB128 size, block of that size DW_OP_bit_piece_stack | TBD | 2 | ULEB128 size, block of that size DW_OP_bit_piece_stack_offset | TBD | 3 | ULEB128 piece size, ULEB128 size, block of that size

to table 7.9.

Section D.17 (introduced in 211206.1)

Change figure D.73 to

void vec_add (int dst[], int src[], int len) {
  #pragma omp simd
  for (int i = 0; i < len; ++i) {
      int tmp = dst[i] + src[i];
      dst[i] = tmp;
  }
}

Add

DW_TAG_lexical_block
    ...
    DW_TAG_variable
        DW_AT_name "tmp"
        DW_AT_type int
        DW_AT_location .loclist.2
        ...

to figure D.75 at the end of the DW_TAG_subprogram DIE.

Add

.loclist.2: range [.l1.1, .l1.3) DW_OP_regx v0 DW_OP_bit_piece_stack_offset 32, 3, DW_OP_push_lane DW_OP_lit5 DW_OP_shl range [.l2.1, .l2.3) DW_OP_regx r5 end-of-list

to figure D.75 at the end of the loclist section.


2023-01-23: Revised: add Appendix text, revise op names.
2023-04-18: Revised. Fixed vectorization factor.
2023-09-11: Revised example.