Issue 160123.1: Unify Location Lists and Range Lists

Author: Ron Brender
Champion: Ron Brender
Date submitted: 2016-01-23
Date revised:
Date closed:
Type: Enhancement
Status: Accepted
DWARF Version: 5
Section 2.6.2, 2.17.3, pg 


This is Revision 1 to Issue 160123.1. Changes are noted at
the end, following the TEXTUAL PROPOSAL. These changes are
enabled by a companion proposal 1607xx.1 that supersedes
Issue 150607.1 and which should be considered first.


This proposal is inspired by Cary's 12/16/2015 editorial
rewrite of Sections 2.6.2 and 2.17.3. The goals are to
unify the location list representations used in split and
non-split units, to maintain consistency between location
lists and range lists, and to bring the significant
advantages of the new representation in split units to
non-split units as well. (See my email of 1/5/2015, "Re:
Rewrite of Sections 2.6.2 (location lists) and 2.17.3
(range lists)" for further discussion.)

Basically, the approach is to:
 a) use the new DW_LLE_* representation of location lists
    defined for split units as a replacement for the
    current representation in non-split units,
 b) define a new DW_RLE_* representation, modeled after
    the DW_LLE_* representation, as a replacement for the
    current range lists representation, and
 c) augment both lists of entry operations with additional
    entry kinds that are useful in the non-split environment.
    In particular, these avoid forcing producers to support
    indirect address techniques (a la DW_FORM_addrx) just for
    these lists (in particular, in the absence of split unit
    support).
 d) These new representations are contained in new sections
    named .debug_loclists and .debug_rnglists, which supersede
    .debug_loc and .debug_ranges, respectively.


TEXTUAL PROPOSAL
----------------

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

S2.6.2, P38-42, L648-786: Replace as follows:

    2.6.2 Location Lists
    Location lists are used in place of location descriptions whenever
    the object whose location is being described can change location
    during its lifetime. Location lists are contained in a separate
    object file section called .debug_loclists (or .debug_loclists.dwo
    for split DWARF object files).

    A location list is indicated by a location or other attribute
    whose value is of class loclistx (see Section 7.?).
   
    *This location list representation, the loclistx class, and the
    related DW_AT_loclists_base attribute are new in DWARF Version 5.
    Together they eliminate most or all of the object language relocations
    previously needed for location lists.*

    A location list consists of a series of location list entries.
    Each location list entry is one of the following kinds:

      o  Bounded location description. This kind of entry provides a
         location description that specifies the location of
         an object that is valid over a lifetime bounded
         by a starting and ending address. The starting address is the
         lowest address of the address range over which the location
         is valid. The ending address is the address of the first
         location past the highest address of the address range.
         When the current PC is within the given range, the location
         description may be used to locate the specified object.
        
         There are several kinds of bounded location description
         entries which differ in the way that they specify the
         starting and ending addresses.
        
         The address ranges defined by the bounded location descriptions
         of a location list may overlap. When they do, they describe a
         situation in which an object exists simultaneously in more than
         one place. If all of the address ranges in a given location
         list do not collectively cover the entire range over which the
         object in question is defined, and there is no following default
         location description, it is assumed that the object is not
         available for the portion of the range that is not covered.

      o  Default location description. This kind of entry provides a
         location description that specifies the location of
         an object that is valid when no bounded location description
         applies.

      o  Base address. This kind of entry provides an address to be
         used as the base address for beginning and ending address
         offsets given in certain kinds of bounded location description.
         The applicable base address of a bounded location description
         entry is the address specified by the closest preceding base
         address entry in the same location list. If there is no
         preceding base address entry, then the applicable base address
         defaults to the base address of the compilation unit (see
         Section 3.1.1 on page 60).

         In the case of a compilation unit where all of the machine
         code is contained in a single contiguous section, no base
         address entry is needed.

      o  End-of-list. This kind of entry marks the end of the
         location list.

    A location list consists of a sequence of zero or more bounded
    location description or base address entries, optionally followed
    by a default location entry, and terminated by an end-of-list
    entry.
   
    Each location list entry begins with a single byte identifying
    the kind of that entry, followed by zero or more operands depending
    on the kind.   
   
    In the descriptions that follow, these terms are used for operands:
   
      . A <def>counted location description<\def> operand consists of a
        two-byte unsigned integer giving the length of the location
        description (see Section 2.6.1) that immediately follows.
   
      . An <def>address index<\def> operand is the index of an address
        in the .debug_addr section. This index is relative to the
        value of the DW_AT_addr_base attribute of the associated
        compilation unit. The address given by this kind
        of operand is *not* relative to the compilation unit base address.
   
      . A <def>target address<\def> operand is an address on the target
        machine. (Its size is the same as used for attribute values of
        class address, specifically, DW_FORM_address.)
       
    The following entry kinds are defined:

    *The initial kinds of entries in this list can be used in either
    split or non-split units.*
   
     1. DW_LLE_end_of_list
        An end-of-list entry contains no further data.
       
        *A series of this kind of entry may be used for padding or
        alignment purposes.*

     2. DW_LLE_base_addressx
        This is a form of base address entry that has one unsigned
        LEB128 operand. The operand value is an address index that
        indicates the applicable base address used by DW_LLE_offset_pair
        entries.

     3. DW_LLE_startx_endx
        This is a form of bounded location description entry that
        has two unsigned LEB128 operands. The operand values are
        address indices. These indicate the
        starting and ending addresses, respectively, that define
        the address range for which this location is valid.
        These operands are followed by a counted location description.

    4.  DW_LLE_startx_length
        This is a form of bounded location description that has two
        unsigned ULEB operands. The first value is an address index
        that indicates the beginning of the address range over
        which the location is valid.
        The second value is the length of the range.
        These operands are followed by a counted location description.

    5.  DW_LLE_offset_pair
        This is a form of bounded location description entry that
        has two unsigned LEB128 operands. The values of these
        operands are the starting and ending offsets, respectively,
        relative to the applicable base address, that define the
        address range for which this location is valid.
        These operands are followed by a counted location description.
       
     6. DW_LLE_default_location
        This entry has no range operands that express a range of
        addresses. The only operand is a counted location description.
        
    *The following kinds of location list entries are defined for
    use only in non-split DWARF units.*
   
     7. DW_LLE_base_address
        A base address entry has one target address operand.
        This address is used as the base address when interpreting
        offsets in subsequent location list entries of kind
        DW_LLE_offset_pair.
       
     8. DW_LLE_start_end
        This is a form of bounded location description entry that
        has two target address operands. These indicate the
        starting and ending addresses, respectively, that define
        the address range for which the location is valid.
        These operands are followed by a counted location description.
       
     9. DW_LLE_start_length
        This is a form of bounded location description entry that
        has one target address operand value and an unsigned LEB128
        integer operand value. The address is the beginning address
        of the range over which the location description is valid, and
        the length is the number of bytes in that range.
        These operands are followed by a counted location description.
   
       
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++      

S2.17.3, P49-51, L968-1022: Replace this section with the following:

    2.17.3 Non-Contiguous Address Ranges

    Range lists are used when the set of addresses for a debugging
    information entry cannot be described as a single contiguous range.
    Range lists are contained in a separate object file section
    called .debug_rnglists.
 
    A range list is identified by a ranges or other
    attribute whose value is of class
    rnglistx (see Section 7.?).
 
    *This range list representation, the rnglistx class, and the
    related DW_AT_rnglists_base attribute are new in DWARF Version 5.
    Together they eliminate most or all of the object language relocations
    previously needed for range lists.*

    Each range list entry is one of the following kinds:

     o  Bounded range. This kind of entry defines an address range
        that is included in the range list. The starting address is
        the lowest address of the address range. The ending address
        is the address of the first location past the highest address
        of the address range.
       
        There are several kinds of bounded range entries which specify
        the starting and ending addresses in different ways.

     o  Base address. This kind of entry provides an address to be
        used as the base address for the beginning and ending
        address offsets given in certain bounded range entries. The
        applicable base address of a range list entry is
        determined by the closest preceding base address
        entry in the same range list. If there is no preceding
        base address entry, then the applicable base address
        defaults to the base address of the compilation unit (see
        Section 3.1.1 on page 60).

        In the case of a compilation unit where all of the machine
        code is contained in a single contiguous section, no base
        address entry is needed.

     o  End-of-list. This kind of entry marks the end of the range
        list.

    Each range list consists of a sequence of zero or more bounded
    range or base address entries, terminated by an end-of-list entry.

    A range list containing only an end-of-list entry describes an
    empty scope (which contains no instructions).
   
    Bounded range entries in a range list may not overlap. There is
    no requirement that the entries be ordered in any particular way.

    A bounded range entry whose beginning and ending address offsets
    are equal (including zero) indicates an empty range and may be
    ignored.
   
    Each range list entry begins with a single byte identifying the kind
    of that entry, followed by zero or more operands depending on the
    kind.

    In the descriptions that follow, the term <def>address index<\def>
    means the index of an address in the .debug_addr section. This
    index is relative to the value of the DW_AT_addr_base attribute
    of the associated compilation unit. The address given by this kind
    of operand is *not* relative to the compilation unit base address.
    
    The following entry kinds are defined:
   
    *The initial kinds of entries in this list can be used in either
    split or non-split units.*
   
     1. DW_RLE_end_of_list
        An end-of-list entry contains no further data.

        *A series of this kind of entry may be used for padding or
        alignment purposes.*
      
     2. DW_RLE_base_addressx
        A base address entry has one unsigned LEB128 operand.
        The operand value is an address index that indicates
        the applicable base address used by following DW_RLE_offset_pair
        entries.
    
     3. DW_RLE_startx_endx
        This is a form of bounded range entry that
        has two unsigned LEB128 operands. The operand values are
        address indices that indicate the
        starting and ending addresses, respectively, that define
        the address range.

     4. DW_RLE_startx_length
        This is a form of bounded location description that
        has two unsigned ULEB operands. The first value is an address index
        that indicates the beginning of the address range.
        The second value is the length of the range.
       
     5. DW_RLE_offset_pair
        This is a form of bounded range entry that
        has two unsigned LEB128 operands. The values of these
        operands are the starting and ending offsets, respectively,
        relative to the applicable base address, that define the
        address range.
       
    *The following kinds of range entry may be used only in non-split
    units.*

     6. DW_RLE_base_address
        A base address entry has one target address operand.
        This operand is the same size as used in DW_FORM_address.
        This address is used as the base address when interpreting
        offsets in subsequent location list entries of kind
        DW_RLE_offset_pair.
       
     7. DW_RLE_start_end
        This is a form of bounded range entry that
        has two target address operands. Each
        operand is the same size as used in DW_FORM_address.
        These indicate the starting and ending addresses,
        respectively, that define the address range for which
        the following location is valid.
       
     8. DW_RLE_start_length
        This is a form of bounded range entry that
        has one target address operand value and an unsigned LEB128
        integer length operand value. The address is the beginning address
        of the range over which the location description is valid, and
        the length is the number of bytes in that range.

   
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

***  The above two changes express the primary substance of this
***  proposal. The remainder that follows simply cleans up the
***  editorial consequences that result.

Whole document: Unless otherwise specified, make the following
replacements throughout:

    Replace                             Replacement
    --------------                      -----------
    .debug_loc                          .debug_loclists
    .debug_loc.dwo                      .debug_loclists.dwo
    .debug_ranges                       .debug_rnglists
    DW_LLE_end_of_list_entry            DW_LLE_end_of_list
    DW_LLE_base_address_selection_entry DW_LLE_base_addressx
    DW_LLE_start_end_entry              DW_LLE_startx_endx
    DW_LLE_start_length_entry           DW_LLE_startx_length
    DW_LLE_offset_pair_entry            DW_LLE_offset_pair
    "base address selection entry"      "base address entry".
   
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

S7.3.1, P180, L102: Delete ".debug_loc, .debug_ranges".

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

S3.5, P90, L1112: Replace "base selection entry, default selection
entry" with "base address entry, default location entry".

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

S7.3.5.3, P187, T7.1: Replace "DW_SECT_LOC" with "DW_SECT_LOCLISTS".

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

S7.4, P191,  L430 (7.): Replace ", .debug_loc and .debug_ranges
sections" with "section".

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

S7.7.3, P215, L871: Replace "base address selection" with
"base address entry, a default location".

S7.7.3.1, P215-216, L853-862: Delete entire section.

S7.7.3.2, P216, L863-864: Delete the header and the first sentence.
Then the second sentence, "Each entry begins...", becomes the
start of the second paragraph of 7.7.3.

S7.7.3, P216, T7.10: Add these entries in the table

          DW_LLE_default_location    |  0x05
          DW_LLE_base_address        |  0x06
          DW_LLE_start_end           |  0x07
          DW_LLE_start_length        |  0x08

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

S7.25, P229, L982-986: Replace the second and third paragraphs
with:

    Each entry begins with an unsigned 1-byte code that indicates
    the kind of entry that follows. The encodings for these constants
    are given in Table 7.x.
   
        Table 7.x: Range list entry encoding values
          --------------------------------------
          Range list entry encoding name | Value   
          --------------------------------------
          DW_RLE_end_of_list             |  0x00
          DW_RLE_base_addressx           |  0x01
          DW_RLE_startx_endx             |  0x02
          DW_RLE_startx_length           |  0x03
          DW_RLE_offset_pair             |  0x04
          DW_RLE_base_address            |  0x05
          DW_RLE_start_end               |  0x06
          DW_RLE_start_length            |  0x07
   
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

SD.13, P336, FD.61: Replace the .debug_loc description with the
following:

    ! .debug_loclists section
    98$:DW_LLE_start_end, <label0 in main>, <label1 in main>
          DW_OP_lit1 DW_OP_stack_value DW_OP_piece(4)
          DW_OP_lit2 DW_OP_stack_value DW_OP_piece(4)
        DW_LLE_start_end, <label1 in main>, <label2 in main>
          DW_OP_lit2 DW_OP_stack_value DW_OP_piece(4)
          DW_OP_lit2 DW_OP_stack_value DW_OP_piece(4)
        DW_LLE_start_end, <label2 in main>, <label3 in main>
          DW_OP_lit2 DW_OP_stack_value DW_OP_piece(4)
          DW_OP_lit3 DW_OP_stack_value DW_OP_piece(4)
        DE_LLE_end_of_list
    99$:DW_LLE_start_end, <label1 in main>, <label2 in main>
          DW_OP_implicit_pointer(reference to 40$, 0)
        DW_LLE_start_end, <label2 in main>, <label3 in main>
          DW_OP_implicit_pointer(reference to 40$, 4)
        DW_LLE_end_of_list

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

SF.1, P378, L40 (.debug_loc.dwo): Delete ", with a slightly
modified...relocations".

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Appendix G, TG.1: Replace the line for .debug_loc with

      .debug_loc        *  *  *  -
      .debug_loclists   -  -  -  V5

and replace the line for .debug_ranges with

      .debug_ranges     *  *  *  -
      .debug_rnglists   -  -  -  V5
     
In the .dwo sequence, replace .debug_loc.dwo with .debug_loclists.dwo
and add

      .debug_rnglists.dwo  - - - V5

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Index: Confirm that none of the following linger:
    .debug_loc
    .debug_loc.dwo
    .debug_ranges
    selection entry
   

CHANGES FROM THE ORIGINAL PROPOSAL
----------------------------------

The main change in this Revision 1 is to remove the DW_LLE_startx_length4,
DW_LLE_offset_pair4, DW_RLE_startx_length4 and DW_RLE_offset_pair4 entries.

The main feature of interest with these four entries is they all have a
(second) operand that is an unsigned offset with a fixed size of 4 bytes.
The motivation for the fixed size was to facilitate certain kinds of late
optimization (either during compilation or linking) that may require
modification to either the entry itself or to the DWARF attribute that
refers to the entry. This late editing problem is now eliminated by changes
that are detailed in companion Revision 1  of Issue 150607.1.

The four deleted entries are replaced by their corresponding variable
size entry (which use ULEB operands); these are now usable in both split
and non-split units. The replacements are not new to this proposal;
they were previously present and limited for use only in non-split units,
but this limitation no longer applies.

Note also that the June meeting discussion indicated a need to add
headers for the .debug_loclists and .debug_rnglists sections in which
these entries are contained--however, this was mistaken because this had
already been done for V5. Changes to these headers are moreover proposed
in the companion Issue 150707.1 Revision.

--

07/12/2016 -- Revised.  Previous version: http://dwarfstd.org/issues/160123.1-1.html
08/02/2016 -- Accepted (requires 160714.1).