Issue 230712.4: Vector Segments Mapped to Registers

Author:	Ron Brender
Champion:	Ron Brender
Date submitted:	2023-07-12
Date revised:
Date closed:
Type:	Concept
Status:	Open
DWARF Version:	6

Part IV: Vector Segments Mapped to Registers

Introduction

Groups of vector registers that hold multiple values are an important parallel processing paradigm for increasing performance. In this paradigm, segments of an in-memory array are serially loaded into high-performance CPU vector registers, operated upon (possibly in combination with segments from other arrays), and then returned to memory.

DWARF has no way to describe this paradigm.

Concept Overview

Introduce a new DW_TAG_object_map DIE. One or more DW_TAG_object_map DIEs may occur as children of an object having an array type. Each object map DIE corresponds to a region of code where successive chunks of the array object are processed in specialized (vector) registers.

An object map DIE necessarily has the following attributes:

DW_AT_mem_location, which gives the zero origin base of the array. This is typically a fixed address or a DWARF expression based on a loop iteration control variable or some logical equivalent.
DW_AT_ctrl_ordinal, which specifies the ordinal number of the chunk of the array that is currently being processed. This is typically a DWARF expression based on a loop iteration control variable or some logical equivalent.
DW_AT_vec_location, whose value gives the register location of the chunk currently being processed. If more than one register is being used, this specifies the first.
DW_AT_element_size, whose value gives the size in bytes of each element of the array.
DW_AT_element_count, whose value gives the number of elements in a chunk.
Either a DW_AT_low_pc and DW_AT_high_pc pair of attributes or a DW_AT_ranges attribute.

The product of element size and element count divided by the byte size of the vector register location gives the number of vector registers that make up a chunk.
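
As a minimal sketch of this arithmetic, the following Python function computes the register count. The function and parameter names are hypothetical, the 16-byte register size is an assumption chosen only for illustration, and the sketch assumes the chunk size is an exact multiple of the register size (the text above states a plain division).

```python
def registers_per_chunk(element_size: int, element_count: int,
                        register_bytes: int) -> int:
    """Number of vector registers that make up one chunk.

    element_size   -- value of DW_AT_element_size (bytes per element)
    element_count  -- value of DW_AT_element_count (elements per chunk)
    register_bytes -- byte size of one vector register (assumed to be
                      known from the ABI; not a DWARF attribute)
    """
    chunk_bytes = element_size * element_count
    # Assumption: a chunk fills a whole number of registers.
    if chunk_bytes % register_bytes != 0:
        raise ValueError("chunk size is not a multiple of the register size")
    return chunk_bytes // register_bytes


# Element size 4 and element count 8 give a 32-byte chunk;
# with assumed 16-byte registers that is two registers.
print(registers_per_chunk(4, 8, 16))  # 2
```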

No other attributes are defined for DW_TAG_object_map.

Briefly, this information is used to index into an array as follows:

Let C be the value of the DW_AT_ctrl_ordinal attribute.
If C equals 0 or is not defined (outside the PC range defined by the DW_AT_low_pc/high_pc/ranges attributes), the element is in memory with an address given by the base of the array plus the offset.
Compute the offset from the base of the array to the array element in the normal manner, without regard for possible use of vector registers.
Divide this offset by the size of the chunk, giving a quotient Q and remainder R.
If Q is not equal to C, then the desired element is not in the current chunk; rather, it is in memory with an address given by the base of the array plus the offset.
Otherwise, the remainder R indicates the part of the chunk corresponding to the desired element. This is converted to a register segment name (using ABI-specific knowledge of register sizes).
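
The steps above can be sketched in Python as follows. This is an illustration only: the function and parameter names are hypothetical, the offset and chunk size are assumed to be in bytes (chunk size being element size times element count), and the sketch follows the text literally, including treating C equal to 0 as "no chunk currently in registers". The final ABI-specific conversion of R to a register segment name is left out.

```python
from typing import Optional, Tuple


def locate_element(base: int, offset: int, chunk_bytes: int,
                   current_chunk: Optional[int]) -> Tuple[str, int]:
    """Return ("memory", address) or ("chunk", R).

    base          -- address of the zero-origin base of the array
    offset        -- byte offset of the element, computed in the normal manner
    chunk_bytes   -- size of one chunk in bytes
    current_chunk -- C, or None when outside the object map's PC range
    """
    # C is 0 or not defined: the element is in memory.
    if current_chunk is None or current_chunk == 0:
        return ("memory", base + offset)

    # Divide the offset by the chunk size: quotient Q, remainder R.
    q, r = divmod(offset, chunk_bytes)

    # A different chunk is in registers: the element is still in memory.
    if q != current_chunk:
        return ("memory", base + offset)

    # R identifies the part of the current chunk holding the element.
    return ("chunk", r)


print(locate_element(0x1000, 40, 32, 1))     # ('chunk', 8)
print(locate_element(0x1000, 40, 32, None))  # ('memory', 4136)
```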

To facilitate use of the object map information, the operator DW_OP_object_map is defined as follows:

DW_OP_object_map takes two operands. The first is the (zero origin) ordinal index of the element to be accessed, which is popped from the top of the stack. The second is a reference to the DW_TAG_object_map DIE that contains the remaining information. The result is the address of the element if the element is not contained in the currently active chunk; otherwise, it is a register segment name for the location of the element in the currently active chunk.
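
One way a consumer might evaluate this operator is sketched below in Python. Everything here is an assumption made for illustration: the ObjectMap and RegisterSegment classes are hypothetical stand-ins for the evaluated attributes of the referenced DIE and for a "register segment name", the register size is assumed to come from the ABI, and registers making up a chunk are assumed to be consecutively numbered starting at the first.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class ObjectMap:
    """Hypothetical evaluated attributes of a DW_TAG_object_map DIE."""
    mem_base: int                 # DW_AT_mem_location
    ctrl_ordinal: Optional[int]   # DW_AT_ctrl_ordinal, None if PC out of range
    first_register: int           # DW_AT_vec_location (register number)
    element_size: int             # DW_AT_element_size
    element_count: int            # DW_AT_element_count
    register_bytes: int           # assumption: register size known from the ABI


@dataclass
class RegisterSegment:
    """Hypothetical representation of a register segment name."""
    register: int
    byte_offset: int
    byte_size: int


def op_object_map(index: int, m: ObjectMap) -> Union[int, RegisterSegment]:
    """Return a memory address or a register segment for element `index`."""
    offset = index * m.element_size
    chunk_bytes = m.element_size * m.element_count
    c = m.ctrl_ordinal
    if c is None or c == 0:
        return m.mem_base + offset
    q, r = divmod(offset, chunk_bytes)
    if q != c:
        return m.mem_base + offset
    # Split the remainder into a register within the chunk and a byte
    # offset within that register.
    reg, within = divmod(r, m.register_bytes)
    return RegisterSegment(m.first_register + reg, within, m.element_size)


m = ObjectMap(mem_base=0x2000, ctrl_ordinal=1, first_register=0,
              element_size=4, element_count=8, register_bytes=16)
print(op_object_map(13, m))      # RegisterSegment(register=1, byte_offset=4, byte_size=4)
print(hex(op_object_map(3, m)))  # 0x200c
```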

Example

Consider the example presented in Issues 211206.1 and 211206.2. In particular, the assembler code referenced in the following DWARF appears as proposed Figure D.74 in Issue 211206.1. (Unfortunately this example is split across two issues, which does make some details hard to follow. Hopefully diligence will be rewarded.)

According to this proposal, the parameter dst would be declared as shown in the following.

1$: DW_TAG_formal_parameter
        DW_AT_name("dst")
        DW_AT_type(reference to .type.arr)    ! See 211206.1
        DW_AT_location(.loclist.dst)
2$:     DW_TAG_object_map
            DW_AT_mem_location(DW_OP_bregx r0, 0)
            DW_AT_ctrl_ordinal(DW_OP_bregx r3, 0)
            DW_AT_vec_location(DW_OP_reg_name v0)
            DW_AT_element_size(DW_OP_lit4)
            DW_AT_element_count(DW_OP_lit8)
            DW_AT_low_pc(ref to .I1.1)
            DW_AT_high_pc(ref to .I2)

...

.loclist.dst:
range [.I0, .I1.3)
    DW_OP_bregx r0, 0
range [.I1.3, .I2)
    DW_OP_lit0                             ! base offset
    DW_OP_object_map(reference to 2$)
range [.I2, .I4)
    DW_OP_bregx r0, 0

...

Modified Proposed Figure D.75 Regarding “dst”