Issue 230712.4: Vector Segments Mapped to Registers
Author: | Ron Brender |
---|---|
Champion: | Ron Brender |
Date submitted: | 2023-07-12 |
Date revised: | |
Date closed: | |
Type: | Concept |
Status: | Open |
DWARF Version: | 6 |
Part IV: Vector Segments Mapped to Registers
Introduction
Groups of vector registers that hold multiple values are an important parallel processing paradigm for increasing performance. In this paradigm, segments of an in-memory array are serially loaded into high-performance CPU vector registers, operated upon (possibly in combination with segments from other arrays), then returned to memory.
DWARF has no way to describe this paradigm.
Concept Overview
Introduce a new DW_TAG_object_map DIE. One or more DW_TAG_object_map DIEs may occur as children of an object having an array type. Each object map DIE corresponds to a region of code where successive chunks of the array object are processed in specialized (vector) registers.
An object map DIE necessarily has the following attributes:
- DW_AT_mem_location, which gives the zero-origin base of the array. This is typically a fixed address or a DWARF expression based on a loop iteration control variable or some logical equivalent.
- DW_AT_ctrl_ordinal, which specifies the ordinal number of the chunk of the array that is currently being processed. This is typically a DWARF expression based on a loop iteration control variable or some logical equivalent.
- DW_AT_vec_location, whose value gives the register location of the chunk currently being processed. If more than one register is being used, this specifies the first.
- DW_AT_element_size, whose value gives the size in bytes of each element of the array.
- DW_AT_element_count, whose value gives the number of elements in a chunk.
- Either a DW_AT_low_pc and DW_AT_high_pc pair of attributes or a DW_AT_ranges attribute.
The product of element size and element count divided by the byte size of the vector register location gives the number of vector registers that make up a chunk.
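As a minimal sketch of that arithmetic, the following Python function computes the register count for a chunk. The function name and the 16-byte register width are assumptions made for illustration only; the element size (4) and element count (8) are taken from the Example in this issue. Rounding up when a chunk does not fill a whole number of registers is also an assumption, since the proposal does not address that case.

```python
def registers_per_chunk(element_size: int, element_count: int, register_bytes: int) -> int:
    """Number of vector registers that make up one chunk.

    element_size   -- value of DW_AT_element_size (bytes per element)
    element_count  -- value of DW_AT_element_count (elements per chunk)
    register_bytes -- byte size of one vector register (ABI knowledge, assumed here)
    """
    if element_size <= 0 or element_count <= 0 or register_bytes <= 0:
        raise ValueError("sizes and counts must be positive")
    chunk_bytes = element_size * element_count
    # Ceiling division: rounding up is an assumption, not part of the proposal.
    return -(-chunk_bytes // register_bytes)


# Element size 4 and count 8 come from the Example; 16-byte registers are assumed.
count = registers_per_chunk(4, 8, 16)
```

With these inputs the chunk is 32 bytes, so under the assumed 16-byte register width it spans two registers starting at the one named by DW_AT_vec_location.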
No other attributes are defined for DW_TAG_object_map.
Briefly, this information is used to index into an array as follows:
- Let C be the value of the DW_AT_ctrl_ordinal attribute.
- If C equals 0 or is not defined (outside the PC range defined by the DW_AT_low_pc/high_pc/ranges attributes), the element is in memory with an address given by the base of the array plus the offset.
- Compute the offset from the base of the array to the array element in the normal manner, without regard for possible use of vector registers.
- Divide this offset by the size of the chunk, giving a quotient Q and remainder R.
- If Q is not equal to C, then the desired element is not in the current chunk; rather, it is in memory with an address given by the base of the array plus the offset.
- Otherwise, the remainder R indicates the part of the chunk corresponding to the desired element. This is converted to a register segment name (using ABI-specific knowledge of register sizes).
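The steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive consumer implementation: the ObjectMap class, the locate function, the tuple encoding of the result, and the reg_bytes field (standing in for ABI-specific knowledge of register sizes) are all hypothetical names. The array is assumed to be one-dimensional, so the offset is simply the index times the element size. The sketch follows the steps literally, including treating C equal to 0 as "no active chunk".

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ObjectMap:
    """Already-evaluated attribute values of one object map DIE (hypothetical model)."""
    mem_base: int       # DW_AT_mem_location, evaluated to an address
    element_size: int   # DW_AT_element_size, in bytes
    element_count: int  # DW_AT_element_count, elements per chunk
    first_reg: int      # DW_AT_vec_location, as a register number
    reg_bytes: int      # bytes per vector register (ABI knowledge, assumed)


def locate(m: ObjectMap, index: int, c: Optional[int]) -> Tuple:
    """Return ("mem", address) or ("reg", register number, byte offset in register).

    c is the current chunk ordinal, or None when the PC is outside the
    range covered by the object map DIE.
    """
    # Offset computed in the normal manner (one-dimensional array assumed).
    offset = index * m.element_size
    if c is None or c == 0:
        return ("mem", m.mem_base + offset)
    chunk_bytes = m.element_size * m.element_count
    q, r = divmod(offset, chunk_bytes)
    if q != c:
        # The element is not in the currently active chunk.
        return ("mem", m.mem_base + offset)
    # R selects the part of the chunk; map it onto consecutive registers.
    return ("reg", m.first_reg + r // m.reg_bytes, r % m.reg_bytes)


# Element size and count from the Example; base address and register width assumed.
m = ObjectMap(mem_base=0x1000, element_size=4, element_count=8, first_reg=0, reg_bytes=16)
in_first_reg = locate(m, 9, 1)    # offset 36: chunk 1, byte 4 of the chunk
in_second_reg = locate(m, 13, 1)  # offset 52: chunk 1, byte 20 of the chunk
other_chunk = locate(m, 9, 2)     # chunk 1 is not the active chunk
```

Returning a register number plus a byte offset is only one possible stand-in for a "register segment name"; a real consumer would use whatever naming the target ABI defines.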
To facilitate use of the object map information, the operator DW_OP_object_map is defined as follows:
DW_OP_object_map takes two operands: the first is the (zero-origin) ordinal index of the element to be accessed, which is popped from the top of the stack. The second is a reference to the DW_TAG_object_map DIE that contains the remaining information. The result is the address of the element if the element is not contained in the currently active chunk; otherwise, it is a register segment name for the location of the element in the currently active chunk.
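The operand handling of the operator might be sketched as below. This is a hedged illustration only: the function op_object_map, the dictionary model of DIEs, the resolve callback, and the tuple encoding of the result are hypothetical and not part of the proposal. The sketch shows only where the two operands come from (the index from the expression stack, the DIE reference inline in the operation); how the element is actually located is delegated to the resolve callback, and the demonstration plugs in a memory-only stand-in.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical result encoding: ("mem", address) or ("reg", ...).
Location = Tuple


def op_object_map(
    stack: List[int],
    die_ref: str,
    dies: Dict[str, dict],
    resolve: Callable[[dict, int], Location],
) -> Location:
    """Evaluate one DW_OP_object_map operation (illustrative model only)."""
    if not stack:
        raise ValueError("DW_OP_object_map needs an index on the stack")
    index = stack.pop()   # first operand: zero-origin element index
    die = dies[die_ref]   # second operand: reference to the object map DIE
    if die["tag"] != "DW_TAG_object_map":
        raise ValueError("operand must reference a DW_TAG_object_map DIE")
    return resolve(die, index)


# Stand-in resolver that always answers with a memory address (assumed values).
def memory_only(die: dict, index: int) -> Location:
    return ("mem", die["mem_base"] + index * die["element_size"])


dies = {"2$": {"tag": "DW_TAG_object_map", "mem_base": 0x1000, "element_size": 4}}
stack = [0]  # as left by DW_OP_lit0 in the Example's location list
result = op_object_map(stack, "2$", dies, memory_only)
```

Whether the result is pushed back on the stack or returned as the final location description is not stated in the proposal; the sketch returns it.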
Example
Consider the example presented in Issues 211206.1 and 211206.2. In particular, the assembler code referenced in the following DWARF appears as proposed Figure D.74 in Issue 211206.1. (Unfortunately this example is split across two issues, which does make some details hard to follow. Hopefully diligence will be rewarded.)
According to this proposal, the parameter dst would be declared as shown in the following.
1$: DW_TAG_formal_parameter
        DW_AT_name("dst")
        DW_AT_type(reference to .type.arr)    ! See 211206.1
        DW_AT_location(.loclist.dst)
2$:     DW_TAG_object_map
            DW_AT_mem_location(DW_OP_bregx r0, 0)
            DW_AT_ctrl_ordinal(DW_OP_bregx r3, 0)
            DW_AT_vec_location(DW_OP_reg_name v0)
            DW_AT_element_size(DW_OP_lit4)
            DW_AT_element_count(DW_OP_lit8)
            DW_AT_low_pc(ref to .I1.1)
            DW_AT_high_pc(ref to .I2)
    ...
.loclist.dst:
    range [.I0, .I1.3)
        DW_OP_bregx r0, 0
    range [.I1.3, .I2)
        DW_OP_lit0                            ! base offset
        DW_OP_object_map(reference to 2$)
    range [.I2, .I4)
        DW_OP_bregx r0, 0
    ...
Modified Proposed Figure D.75 Regarding “dst”