Issue 070519.1: Line table support for VLIW
Background:
At an architectural level, most processors have instructions which contain
a single instruction. If the hardware is able to issue multiple operations
at the same time, this is hidden from the user, so that there appears to be
a one-to-one correspondence between instructions and operations. The
hardware selects which operations from which instructions can be executed
at the same time, and the user has little or no control over which operations
will be executed at the same time.
VLIW processors reveal the hardware's ability to issue multiple operations
at the same time by packaging several operations into a single instruction.
The user (i.e., compiler) is responsible for selecting operations which
the hardware is to issue at the same time. There are a few examples of
VLIW systems: Intel Itanium, Xtensa LX2, TI C6000, STMicro ST200, NXP TriMedia.
In both conventional and VLIW processors, instructions are the smallest
unit which can be the target of a branch address. In conventional processors,
instructions may be fixed or variable length. On some processors, there are
multiple sizes of instructions, although each size is fixed (e.g. ARM/Thumb).
In VLIW processors, this is less common, although Xtensa LX2 does allow
processor configurations with different sized instructions.
The DWARF line table (Section 6.2) provides a mapping between instruction
addresses and the source line (and column) associated with the instrucion.
Using the line table, a DWARF consumer can determine the start address of
each instruction, the instruction length, and certain other characteristics
such as whether the instruction represents the start of a source statement,
end of the prologue, etc.
With one minor exception, there are no architectural dependencies in the
line table. This execption is the ISA value, used for processors which
support multiple instruction sets. This does not affect the mapping between
instruction address and source line and is only provided to inform a
disassembler which of the possible ISAs is in use at the location.
[The minimum_instruction_length is not an architectural dependency; it is
simply a multiplier used to generate a slightly more compact representation of
the line data. All conventional processors can be represented with this
value set to one, although a the price of slightly larger line data.]
This is not satisfactory for VLIW processors, since the multiple operations
in an instruction may be related to different source lines. A means is
needed to represent not just instructions, but the individual operations
within an instruction.
Existing compilers for Itanium extend the interpretation of the instruction
address in an architecture-specific fashion to represent the individual
operations in each instruction and introduces an implicit dependency between
the line table and the architecture. The address value no longer represents
the machine address, but is a composition of the address and a value
representing an index to the operation within the instruction that it refers to.
Proposal 070426.2 proposes to codify this as part of DWARF.
This method does not appear to be general and in particular, it does not
appear that it would work for the Xtensa LX2 processor line which does not
have a pre-defined format for the instruction as Itanium has. Workarounds
to attempt to address this introduce additional architectural dependencies
such as requiring instructions to be decoded to determine the operation. This
architectural dependency does not currently exist in interpreting the line
table.
This proposal extends the line table to explictly represent the address
of the operation within the instruction, without adding any architectural
dependencies to interpreting the line table. The meaning of the instruction
address value is unchanged.
Proposal:
Add the following to the State Machine Registers (section 6.2.2)
operation The operation number (for VLIW machines) corresponding
to the source line and column.
Add "operation 0" to the list of initial register values.
Add the following to the Line Number Program Header (section 6.2.4),
following item 8 and renumber items 9-11 as 11-13:
9. number_of_operations
This value is the number of operations contained in an instruction.
*For conventional processors, this may be zero or one. For VLIW
processors which have multiple operations in each instruction, this
value is the number of operations in each instruction.*
10. operation_sizes (array of ubyte)
There are number_of_opeations entries in this array. Each entry
gives the size of the corresponding operation in bits. Operations
are ordered within the instruction by increasing machine address.
Modify the Special Opcodes (section 6.2.5.1) as follows:
Insert the following after item 2 and renumber the following items:
3. Add a signed integer value to the operation register.
Modify the description of special opcode to read as follows:
A special opcode value is chosen based on the amount that needs to
be added to the line, address, and operation registers. The maximum
line increment for a special opcode is the value of the line_base field
in the header, plus the value of the line_range field, minus 1 (line
base + line range - 1). If the desired line increment is greater than
the maximum line increment, a standard opcode must be used instead
of a special opcode. The “address advance” is calculated by dividing
the desired address increment by the minimum_instruction_length field
times the number_of_operations (or one, if it is zero) from the header.
The special opcode is then calculated using the following formula:
opcode = (desired line increment - line_base) +
(line_range * max(number_of_operations, 1) * address advance) +
(line_range * desired operation increment) + opcode_base
If the resulting opcode is greater than 255, a standard opcode must be
used instead.
To decode a special opcode, subtract the opcode_base from the opcode
itself to give the adjusted opcode. The amount to increment the address
register is the result of the adjusted opcode divided by the line_range
multiplied by the minimum_instruction_length field and divided by the
number_of_operation (or one, if it is zero) field from the header. That is,
address increment = ((adjusted opcode / line_range) *
minimim_instruction_length) / max(number_of_operations, 1)
The amount to increment the operation register is the remainder of
this division:
operation increment = ((adjusted opcode / line_range) *
minimim_instruction_length) % max(number_of_operations, 1)
The amount to increment the line register is the line_base plus the
result of the adjusted opcode modulo the line_range. That is,
line increment = line_base + (adjusted opcode % line_range)
For conventional processors, the example on pages 99-100 should note that
the operation increment is zero for all opcodes. Another example can
be created for Itanium specifying the appropriate values for number_of_
operations and operation_sizes. The table describing this would have
an additional column headed "Operation advance". I'll prepare an
example at a later time.
Add the following to Standard Opcodes (section 6.2.5.2):
13. DW_LNS_set_operation
The DW_LNS_set_operation opcode takes a single unsigned LEB128
operand and stores that value in the operation register of the
state machine.
see also proposal 070426.2