Issue 150607.1: Add DW_FORM_base_offset to resolve split object file issues

Author: Ron Brender
Champion: Ron Brender
Date submitted: 2015-06-07
Date revised:
Date closed:
Type: Error
Status: Withdrawn
DWARF Version: 5
Section Many, pg Many
Note: This proposal is motivated by Jian's "editorial"
issue 150303.4 regarding DW_FORM_sec_offset relocations
in normal vs split objects. The proposed resolution is
substantive, however: Introduction of a new form.

The current handling of rangelistptr is confusing (perhaps
even confused). There are three sections that "live" in a
skeleton unit and that can be referred to in the corresponding
split DWARF object file:

     .debug_addr
     .debug_str_offsets
     .debug_ranges

Each has an attribute that is specified on the skeleton unit
to provide an appropriate base for references (in lieu of
using the base of an implied section), namely:

     DW_AT_addr_base
     DW_AT_str_offsets_base
     DW_AT_ranges_base

For the address and string offsets table, new forms were introduced
to use these base addresses. But not for ranges!

For the DW_AT_ranges attribute, there is only one defined form,
namely DW_FORM_sec_offset. That form normally is used when there
is an known implied section together with an offset that applied
to the base of that section. "It is relocatable in a relocatable
object file, and relocated in an executable or shared object
file" (p199). But now we have split DWARF object files without
relocations but with a DW_AT_ranges_base attribute--oops!

One solution is to special case this to say that in a split
DWARF object file, the form acts rather like DW_FORM_addrx but
without the indirection. There are two problems:
  1) Consumer processing code needs to know what kind of
     compilation unit an arbitrary DW_AT_ranges attribute is
     contained in. This may have significant implementation
     implications.
  2) The offset value has the size of an address. But in this
     application, an unsigned ULEB is preferable. This wastes
     space for no reason.

A better solution is to introduce a new form patterned after
DW_FORM_addrx and DW_FORM_strx in part, but without the
indexing and indirection aspect. I propose DW_FORM_base_offset.
Like DW_FORM_sec_offset, the choice of base is implied by
the attribute in which it is used. (At present only one
base in one attribute is proposed/defined, but more could
be added in the future.) The value is simply that
base plus the offset, represented as a ULEB integer.



Proposal
--------

1)  S1.4, p4: Add DW_FORM_base_offset to list of new forms.

2)  S3.1.1, p59: Change DW_FORM_sec_offset to DW_FORM_base_offset.

3)  S7.3.1, p173: Delete the exception at the end of the third bullet.

4)  S7.3.2, p175: Add new paragraph to first bullet of first list:

     If a .debug_ranges section is present and it is referenced
     using the rangelistptr form DW_FORM_base_offset, then the
     compilation unit includes a DW_AT_ranges_base attribute
     whose value is the first address of that section.

5)  S7.5.4, p199: Replace (augment) description of rangelistptr:

      A rangelistptr refers to a range list. There are two types:

       - [[Use existing text: "This is..."]]

       - This is an offset relative to the base of a range list
         table specified by the DW_AT_ranges_base attribute of
         the associated compilation unit. The offset value is
         represented as an unsigned LEB128 value.

6)  Table 7.6, p203: Add new row at end of table:

     DW_FORM_base_offset    0x22    rangelistptr

7)  Appendix B, Note (i) to Fig. B.1: add "or DW_FORM_base_offset"
     following DW_FORM_sec_offset.

8)  Appendix B, Note (i) to Fig. B.2: add "or DW_FORM_base_offset"
     following DW_FORM_sec_offset.


--

8/2/2016 -- Withdrawn, see 160714.1.