Issue 110722.1: More compact macro information - .debug_macro section
BACKGROUND
The current .debug_macinfo section doesn't allow any merging of redundant information, so e.g.
on our compiler when built with an option to request .debug_macinfo generation, to roughly 100MB
of code/data plus all DWARF sections except .debug_macinfo there is ~350MB .debug_macinfo section.
OVERVIEW
The following proposal replaces that ~350MB .debug_macinfo section with ~1MB .debug_macro section
and ~1.5MB of new content in .debug_str section.
It has been implemented in GCC (starting at version 4.7) as a GNU extension (.debug_macro section
there is referenced through vendor DW_AT_GNU_macros instead of DW_AT_macros and has version 4,
where presumably DWARF 5 will start at version 5).
This proposal adds a new operand encoding of the define/undef opcodes,
where the second operand is a reference into .debug_str section. This is
useful for longer strings than the size of the reference, because it allows
sharing of the strings from different compilation units, or even within one
compilation unit.
Further, though already relatively smaller, savings are achieved by adding
an opcode which references a sequence of macinfo entries, where the consumer
is supposed to substitute the macinfo entry with that sequence of entries.
This again allows sharing the opcodes in between different compilation units
or even within the same compilation unit.
The proposal adds a new section, .debug_macro, that replaces .debug_macinfo
instead of adding new opcodes to the .debug_macinfo section. The new section
has short headers that ease parsing of the section and allow making incompatible
changes. There is a mechanism for defining extension opcode arguments as well,
so consumers may skip them when parsing the section.
PROPOSAL
Here are preliminary DWARF edits for the proposal:
2.2
Change "Macro information (#define, #undef)" to "Legacy macro information (#define, #undef)".
Add
"DW_AT_macros Macro information (#define, #undef)"
row to the figure 2.
3.1.1
5. Replace "DW_AT_macro_info" with "DW_AT_macros".
Replace ".debug_macinfo" with ".debug_macro".
Add at the end of the paragraph
"The DW_AT_macro_info attribute instead might refer to the .debug_macinfo
section as defined in DWARF version 4."
6.3
Replace ".debug_macinfo" with ".debug_macro".
Replace:
"The macro information for each compilation unit is represented as a series
of 'macinfo' entries. Each macinfo entry consists of a 'type code'
and up to two additional operands."
with:
"The macro information for each compilation unit starts with a section
header followed by a series of 'macinfo' entries. Each macinfo entry
consists of a 'type code' and zero or more operands."
Add at the end:
"The section header starts with a 2-byte version number, followed by
1-byte flags value. If the least significant bit (bit 0) in the flags is
cleared, the header is for 32-bit DWARF format macro section and offsets are 4 byte long,
if it is set, the header is for 64-bit DWARF format macro section and offsets are 8 byte long.
If the second least significant bit (bit 1) in the flags is set,
the flags byte is followed by an offset in the .debug_line section of the
beginning of the line number information, encoded as 4 byte offset for
32-bit DWARF format macro section and 8 byte offset for 64-bit DWARF format
macro section. If the third least significant bit (bit 2) in the flags is set,
the .debug_line offset (if present) or flags byte (otherwise) is followed by a table
describing arguments of the macinfo entry types.
The macinfo entry types defined in this standard may, but might not, be
described in the table, other macinfo entry types used in the section
should be described there. Vendor extension macinfo entry types should be
allocated in the range from DW_MACRO_lo_user to DW_MACRO_hi_user, other
unassigned codes are reserved for future DWARF standards.
The table starts with a 1-byte count of the defined opcodes, followed by
an entry for each of those opcodes. Each entry starts with a 1-byte
opcode number, followed by unsigned LEB128 encoded number of arguments
and for each argument there is a single byte describing the form in which
the argument is encoded. The allowed values are DW_FORM_data1,
DW_FORM_data2, DW_FORM_data4, DW_FORM_data8, DW_FORM_sdata, DW_FORM_udata,
DW_FORM_block, DW_FORM_block1, DW_FORM_block2, DW_FORM_block4, DW_FORM_flag,
DW_FORM_string, DW_FORM_strp, DW_FORM_sec_offset and DW_FORM_strx.
The table allows a consumer to skip over unknown macinfo entry types.
*The DWARF 4 and earlier standards used a different section for the
macinfo entries, the .debug_macinfo section, which didn't contain any
section headers and didn't have the possibility for indirect string
encodings or transparent includes. This section may be used instead
of or alongside with the DWARF 5 .debug_macro section, as long as each
compilation unit refers to at most one of these, using DW_AT_macros
attribute to refer to .debug_macro section, or using DW_AT_macro_info
attribute to refer to the old .debug_macinfo section. Consumers should
be prepared to handle both of these formats. For description of the
.debug_macinfo section please refer to the DWARF 4 standard.*"
6.3.1
Replace:
"DW_MACINFO_define A macro definition.
DW_MACINFO_undef A macro undefinition.
DW_MACINFO_start_file The start of a new source file inclusion.
DW_MACINFO_end_file The end of the current source file inclusion.
DW_MACINFO_vendor_ext Vendor specific macro information directives."
with
"DW_MACRO_define A macro definition.
DW_MACRO_undef A macro undefinition.
DW_MACRO_start_file The start of a new source file inclusion.
DW_MACRO_end_file The end of the current source file inclusion.
DW_MACRO_define_indirect A macro definition.
DW_MACRO_undef_indirect A macro undefinition.
DW_MACRO_transparent_include Include a sequence of entries from given offset.
DW_MACRO_define_indirectx A macro definition.
DW_MACRO_undef_indirectx A macro undefinition."
6.3.1.1
Replace all "DW_MACINFO_define" occurrences with "DW_MACRO_define" and
all "DW_MACINFO_undef" with "DW_MACRO_undef".
Append:
"All DW_MACRO_define_indirect and DW_MACRO_undef_indirect entries have
two operands. The first operand encodes the line number of the source line
on which the relevant defining or undefining macro directives appeared.
The second operand consists of an offset into a string table contained in
the .debug_str section of the object file. The size of the operand is
given in the section header offset size. Apart from the
encoding of the operands these entries are equivalent to DW_MACRO_define
and DW_MACRO_undef.
All DW_MACRO_define_indirectx and DW_MACRO_undef_indirectx entries have
two operands. The first operand encodes the line number of the source line
on which the relevant defining or undefining macro directives appeared.
The second operand is represented using unsigned LEB128 encoded value,
which is interpreted as zero-based index into an array of offsets in the
.debug_str_offsets section. Apart from the
encoding of the operands these entries are equivalent to DW_MACRO_define
and DW_MACRO_undef."
6.3.1.2
Replace all "DW_MACINFO_start_file" occurrences with "DW_MACRO_start_file".
Append:
"If any DW_MACINFO_start_file entries are present, the header should
contain a reference to .debug_line section."
6.3.1.3
Replace all "DW_MACINFO_end_file" occurrences with "DW_MACRO_end_file".
6.3.1.4
Replace the whole section with:
"6.3.1.4 Transparent inclusion of a sequence of entries
A DW_MACRO_transparent_include entry has one operand, offset into
another part of the .debug_macro section. The size of the operand
is given in the section header offset size. The
DW_MACRO_transparent_include macinfo entry instructs the consumer to replace
it with a sequence of macinfo entries found
after the section header at the given .debug_macro offset, up to, but excluding,
the terminating entry with type code 0. The
DW_MACRO_transparent_include entry type is aimed at sharing duplicate
sequences of macinfo entries between macinfo from different compilation units."
6.3.2
Replace "DW_MACINFO_start_file" with "DW_MACRO_start_file" and
"DW_MACINFO_end_file" with "DW_MACRO_end_file".
6.3.3
Replace "DW_MACINFO_define and DW_MACINFO_undef" with
"DW_MACRO_define, DW_MACRO_define_indirect, DW_MACRO_define_indirectx,
DW_MACRO_undef, DW_MACRO_undef_indirect and DW_MACRO_undef_indirectx"
and replace
"DW_MACINFO_define or DW_MACINFO_undef" with
"DW_MACRO_define, DW_MACRO_define_indirect, DW_MACRO_define_indirectx,
DW_MACRO_undef, DW_MACRO_undef_indirect or DW_MACRO_undef_indirectx".
Replace "DW_MACINFO_start_file" with "DW_MACRO_start_file".
6.3.4
Replace ".debug_macinfo" with ".debug_macro".
7.5.4
Replace ".debug_macinfo section to the first byte" with
".debug_macro section to the header".
Add:
"DW_AT_macros 0xXX macptr"
to figure 20.
7.22
Remove:
"as are the constants in a DW_MACINFO_vendor_ext entry".
Replace figure 39 with:
"Macinfo Type Name Value
DW_MACRO_define 0x01
DW_MACRO_undef 0x02
DW_MACRO_start_file 0x03
DW_MACRO_end_file 0x04
DW_MACRO_define_indirect 0x05
DW_MACRO_undef_indirect 0x06
DW_MACRO_transparent_include 0x07
DW_MACRO_define_indirectx 0x0b
DW_MACRO_undef_indirectx 0x0c
DW_MACRO_lo_user 0xe0
DW_MACRO_hi_user 0xff
Figure 39. Macinfo Type Encodings"
Appendix A
Add "DW_AT_macros" to allowable attributes for "DW_TAG_compile_unit"
and "DW_TAG_partial_unit".
Appendix B
Replace "DW_AT_macinfo" with "DW_AT_macros".
Replace ".debug_macinfo" with ".debug_macro".
Add
( .debug_macro ) -> [ DW_MACRO_transparent_include (n) ] -> ( .debug_macro ) and
( .debug_macro ) -> [ DW_MACRO_define_indirect, DW_MACRO_undef_indirect (o) ] -> ( .debug_str )
( .debug_macro ) -> [ DW_MACRO_start_file (p) ] -> ( .debug_line )
( .debug_macro ) -> [ DW_MACRO_define_indirectx, DW_MACRO_undef_indirectx (q) ] -> ( .debug_str_offsets )
to the picture.
In (i) replace ".debug_macinfo" with ".debug_macro" and "DW_AT_macro_info"
with "DW_AT_macros".
Add:
(n) .debug_macro A macinfo operand of the form DW_FORM_sec_offset is an
offset into another part of the .debug_macro section,
to the section header before the sequence to be
transparently included.
(o) .debug_macro A macinfo operand of the form DW_FORM_strp is an offset
into the .debug_str section of the corresponding string.
(p) .debug_macro DW_MACRO_start_file second operand refers to a file entry
in the .debug_line section, with the .debug_macro header
containing an offset to the start of the referenced
.debug_line section.
(q) .debug_macro A macinfo operand of the form DW_FORM_strx is an index
into the string offset table in the .debug_str_offsets section."
Appendix F
Change
".debug_macinfo - - -"
to
".debug_macinfo - - - x"
Add
".debug_macro x x x 5"
row to the table.
--
3/19/2014 -- Revised.
3/19/2014 -- Accepted.