Issue 070519.1: Line table support for VLIW

Author: Michael Eager
Champion: Michael Eager
Date submitted: 2007-05-19
Date revised:
Date closed:
Type: Extension
Status: Rejected
DWARF Version: 4

Background:

At an architectural level, most processors have instructions which contain
a single operation.  If the hardware is able to issue multiple operations
at the same time, this is hidden from the user, so that there appears to be
a one-to-one correspondence between instructions and operations.  The 
hardware selects which operations from which instructions can be executed
at the same time, and the user has little or no control over which operations 
will be executed at the same time.

VLIW processors reveal the hardware's ability to issue multiple operations
at the same time by packaging several operations into a single instruction.
The user (i.e., compiler) is responsible for selecting operations which 
the hardware is to issue at the same time.  There are a few examples of 
VLIW systems: Intel Itanium, Xtensa LX2, TI C6000, STMicro ST200, NXP TriMedia.

In both conventional and VLIW processors, instructions are the smallest
unit which can be the target of a branch address.  In conventional processors,
instructions may be fixed or variable length.  On some processors, there are
multiple sizes of instructions, although each size is fixed (e.g. ARM/Thumb). 
In VLIW processors, this is less common, although Xtensa LX2 does allow 
processor configurations with different sized instructions. 

The DWARF line table (Section 6.2) provides a mapping between instruction
addresses and the source line (and column) associated with the instruction.
Using the line table, a DWARF consumer can determine the start address of
each instruction, the instruction length, and certain other characteristics
such as whether the instruction represents the start of a source statement,
end of the prologue, etc.  

With one minor exception, there are no architectural dependencies in the 
line table.  This exception is the ISA value, used for processors which 
support multiple instruction sets.  This does not affect the mapping between
instruction address and source line and is only provided to inform a
disassembler which of the possible ISAs is in use at the location.  

[The minimum_instruction_length is not an architectural dependency; it is
simply a multiplier used to generate a slightly more compact representation of
the line data.  All conventional processors can be represented with this
value set to one, although at the price of slightly larger line data.]

This is not satisfactory for VLIW processors, since the multiple operations
in an instruction may be related to different source lines.  A means is
needed to represent not just instructions, but the individual operations
within an instruction.

Existing compilers for Itanium extend the interpretation of the instruction
address in an architecture-specific fashion to represent the individual
operations in each instruction.  This introduces an implicit dependency between
the line table and the architecture.  The address value no longer represents
the machine address, but is a composition of the address and a value 
representing an index to the operation within the instruction that it refers to.
Proposal 070426.2 proposes to codify this as part of DWARF.  

This method does not appear to be general.  In particular, it does not
appear that it would work for the Xtensa LX2 processor line, which does not
have a pre-defined instruction format as Itanium has.  Workarounds
to attempt to address this introduce additional architectural dependencies 
such as requiring instructions to be decoded to determine the operation.  This
architectural dependency does not currently exist in interpreting the line 
table.

This proposal extends the line table to explicitly represent the address
of the operation within the instruction, without adding any architectural 
dependencies to interpreting the line table.  The meaning of the instruction
address value is unchanged.  

Proposal:

Add the following to the State Machine Registers (section 6.2.2):

  operation    The operation number (for VLIW machines) corresponding
               to the source line and column. 

  Add "operation 0" to the list of initial register values.  

Add the following to the Line Number Program Header (section 6.2.4), 
following item 8 and renumber items 9-11 as 11-13:

  9.  number_of_operations
      This value is the number of operations contained in an instruction.
      *For conventional processors, this may be zero or one.  For VLIW
      processors which have multiple operations in each instruction, this
      value is the number of operations in each instruction.*

  10. operation_sizes (array of ubyte)
      There are number_of_operations entries in this array.  Each entry 
      gives the size of the corresponding operation in bits.  Operations
      are ordered within the instruction by increasing machine address.  

Modify the Special Opcodes (section 6.2.5.1) as follows:

  Insert the following after item 2 and renumber the following items:

  3.  Add a signed integer value to the operation register.

Modify the description of special opcode to read as follows:

   A special opcode value is chosen based on the amount that needs to 
   be added to the line, address, and operation registers.  The maximum 
   line increment for a special opcode is the value of the line_base field 
   in the header, plus the value of the line_range field, minus 1 (line 
   base + line range - 1). If the desired line increment is greater than 
   the maximum line increment, a standard opcode must be used instead 
   of a special opcode. The “address advance” is calculated by dividing 
   the desired address increment by the minimum_instruction_length field 
   times the number_of_operations (or one, if it is zero) from the header. 
   The special opcode is then calculated using the following formula:

      opcode = (desired line increment - line_base) +
         (line_range * max(number_of_operations, 1) * address advance) +
         (line_range * desired operation increment) + opcode_base

   If the resulting opcode is greater than 255, a standard opcode must be 
   used instead.
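
   The encoding formula above can be sketched as follows.  This is only an
   illustration of the proposed arithmetic, not part of the proposal text.
   The function name is hypothetical, the header values in the usage note
   are made up for the example, and the lower-bound check on the line
   increment is an added assumption (the text states only the upper bound).

```python
def encode_special_opcode(line_inc, address_advance, operation_inc,
                          line_base, line_range, opcode_base,
                          number_of_operations):
    """Return the special opcode, or None if a standard opcode is required.

    Hypothetical helper: mirrors the proposed formula term by term.
    """
    # Maximum line increment: line_base + line_range - 1.
    max_line_inc = line_base + line_range - 1
    # Assumption: increments below line_base also cannot be encoded.
    if line_inc < line_base or line_inc > max_line_inc:
        return None

    # number_of_operations of zero is treated as one (conventional processor).
    ops = max(number_of_operations, 1)

    opcode = ((line_inc - line_base)
              + line_range * ops * address_advance
              + line_range * operation_inc
              + opcode_base)

    # Opcodes above 255 do not fit in a single byte.
    return opcode if opcode <= 255 else None
```

   For instance, with line_base = -3, line_range = 12, opcode_base = 10 and
   number_of_operations = 3 (illustrative values), a line increment of 2,
   an address advance of 1 and an operation increment of 1 give
   5 + 36 + 12 + 10 = 63.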

   To decode a special opcode, subtract the opcode_base from the opcode 
   itself to give the adjusted opcode. The amount to increment the address 
   register is the result of the adjusted opcode divided by the line_range 
   multiplied by the minimum_instruction_length field and divided by the 
   number_of_operations field (or one, if it is zero) from the header. That is,
   
      address increment = ((adjusted opcode / line_range) * 
         minimum_instruction_length) / max(number_of_operations, 1)

   The amount to increment the operation register is the remainder of
   this division:

      operation increment = ((adjusted opcode / line_range) * 
         minimum_instruction_length) % max(number_of_operations, 1)

   The amount to increment the line register is the line_base plus the 
   result of the adjusted opcode modulo the line_range. That is,

      line increment = line_base + (adjusted opcode % line_range)
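
   The decoding steps can be sketched as follows.  This is a minimal
   illustration that follows the three formulas as written.  The function
   name is hypothetical, and the header values in the usage note are
   illustrative, with minimum_instruction_length assumed to be 1.

```python
def decode_special_opcode(opcode, line_base, line_range, opcode_base,
                          minimum_instruction_length, number_of_operations):
    """Return (address increment, operation increment, line increment).

    Hypothetical helper: applies the proposed decoding formulas directly.
    """
    adjusted = opcode - opcode_base

    # number_of_operations of zero is treated as one (conventional processor).
    ops = max(number_of_operations, 1)

    # Common sub-expression of the address and operation formulas.
    scaled = (adjusted // line_range) * minimum_instruction_length

    address_inc = scaled // ops      # quotient advances the address register
    operation_inc = scaled % ops     # remainder advances the operation register
    line_inc = line_base + (adjusted % line_range)

    return address_inc, operation_inc, line_inc
```

   For instance, with line_base = -3, line_range = 12, opcode_base = 10,
   minimum_instruction_length = 1 and number_of_operations = 3, opcode 63
   has an adjusted opcode of 53.  53 / 12 is 4, so the address increment
   is 1 and the operation increment is 1.  53 % 12 is 5, so the line
   increment is -3 + 5 = 2.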

For conventional processors, the example on pages 99-100 should note that
the operation increment is zero for all opcodes.  Another example can 
be created for Itanium specifying the appropriate values for
number_of_operations and operation_sizes.  The table describing this would have 
an additional column headed "Operation advance".  I'll prepare an 
example at a later time.  

Add the following to Standard Opcodes (section 6.2.5.2):

   13.  DW_LNS_set_operation

      The DW_LNS_set_operation opcode takes a single unsigned LEB128
      operand and stores that value in the operation register of the 
      state machine. 
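
      A consumer might handle this opcode roughly as sketched below.  The
      function names and the dictionary used for the state machine are
      hypothetical.  The sketch assumes the opcode byte itself has already
      been consumed and that offset points at the start of the operand.

```python
def read_uleb128(data, offset=0):
    """Decode an unsigned LEB128 value; return (value, next offset)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        # The low seven bits carry payload, least significant group first.
        result |= (byte & 0x7F) << shift
        # A clear high bit marks the final byte of the value.
        if (byte & 0x80) == 0:
            return result, offset
        shift += 7


def apply_set_operation(state, data, offset=0):
    """Store the ULEB128 operand in the operation register (hypothetical)."""
    value, offset = read_uleb128(data, offset)
    state["operation"] = value
    return offset
```

      Only the operation register is modified; the address and line
      registers are left as they were.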

 
See also proposal 070426.2.