Issue 140906.1: Two-Level Line Tables

Author: Cary Coutant
Champion: Cary Coutant
Date submitted: 2014-09-06
Date revised:
Date closed:
Type: Enhancement
Status: Open
DWARF Version: 6

Section 6.2, pg 128

This proposal introduces a new two-level line number table to DWARF, splitting the current line number table into two parts: a "logicals" table, and an "actuals" table. The actuals table would be optional, and when omitted, the logicals table would correspond to the DWARF v4 line number table.

In a two-level line number table, the logicals table would provide a mapping from each logical statement in a program to a recommended breakpoint location, and the actuals table would provide a mapping from PC location to a logical statement represented in the logicals table. The separation allows the line number table to represent directly the nesting of inline functions so that consumers would not need to parse the DIE tree in order to show inlined calls.

For further background information and possible future extensions, see the DWARF Wiki page:

http://wiki.dwarfstd.org/index.php?title=TwoLevelLineTables

In Section 6.2 ("Line Number Information"), add the following non-normative text at the end of the section:

For optimized code, where instruction scheduling may reorder
instructions across statement boundaries, where loop
optimizations may clone code and move instructions across
loop iterations, and where inlining may introduce
instructions from an inlined call into the instruction
stream, an alternate representation with two matrices may be
used. One matrix (the "logical line table") would have a row
for each "logical" source statement. If the emitted object
code for a source statement has been cloned into two
different locations, there would be a separate row for each.
The code for a source statement in an inlined function would
have a separate row for each instance, and would refer to its
calling context -- i.e., the logical source statement within
which it was called. The matrix would have columns for:

* the source file name
* the source line number
* the source column number
* the function name
* recommended breakpoint location for this statement
* the calling context (a reference to another row)
* and so on

A second matrix (the "actual line table") would have a row
for each instruction in the emitted object code, and would
have columns for:

* the logical statement (a reference to a row in the first
matrix)
* whether this instruction is the beginning of a basic
block
* and so on

Both of these matrices can be stored using the same byte-code
language mentioned above.
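
(Illustration only, not part of the proposed text.) The two matrices described above can be sketched as plain records, together with the lookup a consumer might perform to recover the chain of inlined calls for a PC. The class names, field names, and the inline_stack helper are assumptions made for this sketch; the function name is stored as a string for readability, and end-of-sequence handling is ignored.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LogicalRow:
    file: str
    line: int
    column: int
    function_name: str
    address: int   # recommended breakpoint location for this statement
    context: int   # 1-based row number of the calling statement, 0 if not inlined


@dataclass
class ActualRow:
    address: int
    logical: int   # 1-based row number in the logical line table
    basic_block: bool = False


def inline_stack(logicals: List[LogicalRow],
                 actuals: List[ActualRow],
                 pc: int) -> List[LogicalRow]:
    """Return the logical rows covering pc, innermost statement first.

    Assumes the actual rows are sorted by ascending address.
    """
    match = None
    for row in actuals:
        if row.address <= pc:
            match = row
        else:
            break
    if match is None:
        return []
    stack = []
    index = match.logical
    while index != 0:                # context 0 ends the chain
        row = logicals[index - 1]    # rows are numbered from 1
        stack.append(row)
        index = row.context
    return stack


logicals = [
    LogicalRow("main.c", 10, 3, "main", 0x1000, 0),
    LogicalRow("util.h", 4, 5, "square", 0x1004, 1),
]
actuals = [
    ActualRow(0x1000, 1, True),
    ActualRow(0x1004, 2),
    ActualRow(0x1008, 1),
]
names = [row.function_name for row in inline_stack(logicals, actuals, 0x1004)]
```

In this made-up example, the instruction at 0x1004 belongs to a statement of square that was inlined into line 10 of main, so the lookup walks from row 2 to row 1 through the context column without consulting the DIE tree.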

In Section 6.2.2 ("State Machine Registers"), add the following rows to Table 6.2:

Register name    Meaning
------------------------------------------------------------
context          An unsigned integer indicating a row number
                 in the logical line table that represents
                 the calling context for the current
                 logical statement. Rows are numbered
                 beginning at 1. The value 0 indicates that
                 the current logical statement is not part
                 of an inlined call.

function_name    An unsigned integer representing the name
                 of the function containing the current
                 logical statement. This value is either a
                 string offset relative to the start of the
                 .debug_str section, or the index of an
                 entry in the .debug_str_offsets section,
                 depending on a flag in the line number
                 program header (see Section 6.2.4).
------------------------------------------------------------

Following Table 6.2, add the following notes:

When only a single line number table is used, the context
register is not used.

When both a logical line table and an actual line table are
used, the basic_block and isa registers are not used in the
logical line table. The file, column, is_stmt, prologue_end,
epilogue_begin, discriminator, context, and function_name
registers are not used in the actual line table. In the
actual line table, the line register represents a row in the
logical line table rather than an actual line number. Rows in the
logical line table are numbered starting at 1.

Following Table 6.2, add the following rows to the (unnumbered) table showing the state of the registers at the beginning of each sequence:

context          0
function_name    0

In Section 6.2.4 ("The Line Number Program Header"), modify item 3, insert items 4 and 5, and renumber all subsequent items:

3.  header_length
    The number of bytes following the header_length field to
    the first byte of the line number program for the logical
    line table (or for the line number table if there is no
    actuals table). In the 32-bit DWARF format, this field is
    a 4-byte unsigned length; in the 64-bit DWARF format,
    this field is an 8-byte unsigned length (see Section 7.4).

4.  actuals_table_offset
    The offset in bytes from the end of the header to the
    first byte of the line number program for the actual line
    table. If there is only one line number table, this field
    is 0. In the 32-bit DWARF format, this field is a 4-byte
    unsigned length; in the 64-bit DWARF format, this field
    is an 8-byte unsigned length (see Section 7.4).

5.  function_name_form (ubyte)
    The format of the function_name register. If this value
    is DW_FORM_strp, the function_name register contains an
    offset relative to the start of the .debug_str section.
    If this value is DW_FORM_strx, the function_name
    register contains an index of an entry in the
    .debug_str_offsets table.
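
(Illustration only.) A minimal sketch of how a consumer might locate the two line number programs from these fields. It assumes the 32-bit DWARF format, little-endian data, and a DWARF v4 style header in which unit_length and version immediately precede header_length. It also assumes that "the end of the header" coincides with the first byte of the logicals program. The locate_programs helper is a hypothetical name, and rejecting other forms is a choice made for this sketch.

```python
import struct

DW_FORM_strp = 0x0E
DW_FORM_strx = 0x1A


def locate_programs(data: bytes, unit_offset: int = 0):
    """Return (logicals_start, actuals_start, function_name_form).

    actuals_start is None when there is only one line number table.
    """
    # unit_length (4), version (2), header_length (4),
    # actuals_table_offset (4), function_name_form (1)
    (_unit_length, _version, header_length,
     actuals_table_offset, form) = struct.unpack_from("<IHIIB", data, unit_offset)

    # header_length counts bytes following the header_length field,
    # which ends 10 bytes into the unit under the assumed layout.
    logicals_start = unit_offset + 10 + header_length

    if actuals_table_offset == 0:
        actuals_start = None
    else:
        actuals_start = logicals_start + actuals_table_offset

    if form not in (DW_FORM_strp, DW_FORM_strx):
        raise ValueError(f"unexpected function_name_form 0x{form:02x}")
    return logicals_start, actuals_start, form


two_level = struct.pack("<IHIIB", 100, 4, 20, 30, DW_FORM_strp)
single = struct.pack("<IHIIB", 100, 4, 20, 0, DW_FORM_strx)
```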

(In Section 6.2.5.1, "Special Opcodes", no change is necessary, unless we want to explicitly point out that the new context and function_name registers are left unchanged by the special opcodes.)

In Section 6.2.5.2 ("Standard Opcodes"), add the following new opcode:

 13. DW_LNS_inlined_call
     The DW_LNS_inlined_call opcode takes two unsigned LEB128
     operands. It sets the context register to the value of
     the first operand, and the function_name register to the
     value of the second operand.

     *This opcode may also be used to set the context register
     to zero, to indicate that a logical statement is not part
     of an inlined call.*
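
(Illustration only.) The effect of this opcode on the state machine can be sketched as follows. The opcode value 13 is assumed from the item number above; the registers dictionary and the apply_inlined_call helper are hypothetical names.

```python
DW_LNS_inlined_call = 13  # assumed from the item number in this proposal


def read_uleb128(data: bytes, pos: int):
    """Decode an unsigned LEB128 value; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:
            return result, pos
        shift += 7


def apply_inlined_call(registers: dict, data: bytes, pos: int) -> int:
    """Apply DW_LNS_inlined_call; pos points just past the opcode byte."""
    context, pos = read_uleb128(data, pos)
    function_name, pos = read_uleb128(data, pos)
    registers["context"] = context
    registers["function_name"] = function_name
    return pos


# Opcode, context = 2, function_name = 624485 (three-byte LEB128).
program = bytes([DW_LNS_inlined_call, 0x02, 0xE5, 0x8E, 0x26])
registers = {"context": 0, "function_name": 0}
end = apply_inlined_call(registers, program, 1)
```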

In Section 6.2.5.3 ("Extended Opcodes"), add the following new opcode:

6.  DW_LNE_set_function_name
    The DW_LNE_set_function_name opcode takes a single
    unsigned LEB128 operand. It sets the function_name
    register to the new value.
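
(Illustration only.) A producer-side sketch of emitting this opcode, using the existing extended opcode framing: a zero byte, an unsigned LEB128 length covering the opcode byte and its operands, then the opcode and operand. This proposal does not assign an opcode value; the value 6 below is a placeholder taken from the item number, and the helper names are hypothetical.

```python
HYPOTHETICAL_DW_LNE_set_function_name = 0x06  # placeholder, not assigned


def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_set_function_name(function_name: int) -> bytes:
    """Build the byte sequence for DW_LNE_set_function_name."""
    operand = encode_uleb128(function_name)
    length = encode_uleb128(1 + len(operand))  # opcode byte plus operand
    opcode = bytes([HYPOTHETICAL_DW_LNE_set_function_name])
    return b"\x00" + length + opcode + operand


encoded = encode_set_function_name(0x90)
```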

10/21/2014 - Deferred to DWARF Version 6