Issue 150702.1: Add more unit type codes (DW_UT_*)

Author: Ron Brender
Champion: Ron Brender
Date submitted: 2015-07-02
Date revised:
Date closed:
Type: Improvement
Status: Withdrawn
DWARF Version: 5
Section 7.5.1, pg 186
This is a proposal to define unit type codes (DW_UT_*)
for each of the several kinds of unit defined in DWARF.
Recall that both DW_TAG_compile_unit and DW_TAG_type_unit
DIEs can occur in several variations. At present it is
necessary to decode at least part of the attributes of
the DIE to determine what variant is present. With this
proposal the existing unit_type field already present
in unit headers will provide a distinct code for each
variant.

Additional background is attached at the end of this proposal.
Note that this proposal matches P3 in that attachment.

Proposal
--------

In Section 7.5.1, define additional unit type codes as
follows:

    DW_UT_skeleton        0x04
    DW_UT_split_compile   0x05
    DW_UT_split_type      0x06
    DW_UT_split_partial   0x07

These correspond to unit variants in the obvious way.

Note: it is currently unclear whether or not we want to
allow/define "split partial" compilation units (that is,
allow inclusion of DW_TAG_partial_unit as a split unit).
If this is not allowed then the last entry can be omitted.

As an editorial matter, it would be nice to add a third
row to the table that names the unit kind. Thoughts?


==============================================================
Brender email of 06-24-2015 "Unit Types Spectrum of Proposals"
--------------------------------------------------------------
This is a follow-up to my question/suggestion in E150516.3
about adopting additional unit type codes to identify all 
of the distinct kinds of units we have in play. It is NOT 
a proposal. Rather I here try to lay out a range of possibilities 
and hope to generate some discussion which can hopefully quide
what exact proposal should come next.

Review
======

Currently we have two unit header layouts. Both have the
following five fields:

unit_length     4 or 12 bytes
version         uhalf
unit_type       ubyte
debug_abbrev    4 or 8
address_size    ubyte

For DW_TAG_compile_unit and DW_TAG_partial_unit that is
the complete header.

For DW_TAG_type_unit, two additional fields are added:

type_signature  8 bytes
type_offset     4 or 8 bytes

Possible Proposals
==================

A spectrum of proposals includes the following:

P0: Unify to get one header layout for all
P1: Drop the type code for partial units as 
    redundant/unuseful
P2: No change
P3: Add new codes for skeleton unit, split compilation unit, 
    split partial unit & split type unit
P4: P3 + specialize the header for each

Proposal P0: Unify to get one header layout
-------------------------------------------

Adopt the type header layout as *the* header layout.
For compilation units and partial units, the last
two fields would just be padding (zero bytes).

Possible proposal P0A would then eliminate the unit_type 
field as well; the function of this field can still be performed 
by looking at the tag of the first DIE. The abbreviation
code is always at a known place but getting the tag requires
looking at the abbreviation table. Feels like a lot of
complexity to save a single byte so I'll drop this idea now.

Proposal P1: Drop the type code for partial units
-------------------------------------------------

Do we really need DW_UT_partial? Probably not. (We probably
don't even need DW_TAG_partial_unit any more given the
newer space saving techniques, but let's not go there...)

Maybe tinker with the unit type names: how does DW_UHL_1
& DW_UHL_2 grab ya? (UHL => unit header layout)

Proposal P2: No change
----------------------

Easy. But probably the least defensible. We have one
unit type code that we probably don't need and lack
others that might actually be useful. Not good...

Proposal P3: Add new codes for skeleton unit, split compilation 
    unit, split partial unit & split type unit
---------------------------------------------------------------

Simple, cheap. The hardest part may be picking names. One
obvious suggestion:

[ DW_UT_compile, DW_UT_partial & DW_UT_type as now +]
  DW_UT_skeleton
  DW_UT_split_compile
  DW_UT_split_partial
  DW_UT_split_type


Proposal P4: Do P4 then exploit
-------------------------------
Make the headers "fit" the needs of each unit type
to wit:

DW_UT_compile, DW_UT_partial, DW_UT_type: same as now.

DW_UT_split_type: same as DW_UT_type.

DW_UT_skeleton, DW_UT_split_compile: Define
a new header variant that is between the current
two--ie, uses the type_signature field which
then allows elimination of the DW_AT_dwo_ID
attribute. (Name the new field unit_signature, or
maybe unit_ID.)


Pick Your Poison?
=================

Among this set I think my own order of preference is
P4, P3, P1, P0, P2. ANY of the other alternatives
are better than the status quo! And best of all is
to add and take advantage of new type codes...

Other thoughts?

Ron

--

5/17/2016 -- Withdrawn (resolved by 160108.1)