Issue 240320.1: Add Local and Indirect Strings to Name Index
| Author: | Cary Coutant |
|---|---|
| Champion: | |
| Date submitted: | 2024-03-20 |
| Date revised: | 2024-09-23 |
| Date closed: | 2024-09-30 |
| Type: | Enhancement |
| Status: | Accepted |
| DWARF version: | 6 |
Background
The name table in the .debug_names section currently references strings
via offsets to the .debug_str section. Each string offset, therefore,
requires a relocation unless some under-the-table agreement is made
between the compiler and linker. This can get expensive, as the number
of names can be quite large.
For non-split DWARF, the use of .debug_str makes sense, as many or most
of the names in the name index are also referenced from entries in the
.debug_info section. If the .debug_info section is using DW_FORM_strx
and a .debug_str_offsets section, however, it would also be beneficial
for the name index to use string references in the style of
DW_FORM_strx, which would eliminate the need for any relocations in the
name table.
For split DWARF, there is little overlap between the strings referenced
by the skeleton .debug_info section and those referenced by the
.debug_names section. When using a linker that can combine
.debug_names sections (and in so doing, eliminate duplicate strings in
the local string pool), there is no advantage gained by storing the
strings in the separate section.
Overview
This proposal adds two alternative string representations to the name
index: one in the style of DW_FORM_strx for non-split DWARF compilation
units that use .debug_str_offsets, and one with a local string table
stored directly in the .debug_names section that requires no
relocations.
Proposed Changes
In Section 6.1.1.2 Structure of the Name Index, change "eight individual parts" to "nine individual parts," and add the following to the enumerated list of parts between items 6 and 7:
7. An optional local string pool.
(Renumber the last two items.)
In Figure 6.1 Name Index Layout (part 1), under "Name Index", add a box for "Local String Pool" between "Name Table" and "Abbrev Table". The box expands to a box on the right labeled "Strings".
Also in Figure 6.1 Name Index Layout (part 1), change "String Offsets" to "String Pointers or Indexes" in the expansion of "Name Table".
In Section 6.1.1.4.1 Section Header, replace field 3 ("padding") with the following:
3. str_format (ubyte)
An enumerated constant that specifies the representation of string references in the name index. The possible values are DW_FORM_strp, DW_FORM_strp8, and DW_FORM_strx4 (see Section 7.5.5 Classes and Forms).
4. padding (ubyte)
Reserved to DWARF (must be zero).
Renumber fields 4-8, and add the following fields after "name_count":
10. local_str_pool_size (section length)
Size of the local string pool. If this value is non-zero, string offsets (when str_format is DW_FORM_strp or DW_FORM_strp8) reference the local string pool. If this value is 0, string offsets reference the .debug_str section. If str_format is DW_FORM_strx4, this field should be 0.
11. str_offsets (section offset)
A 4-byte or 8-byte unsigned offset that points to the header of the compilation unit’s contribution to the .debug_str_offsets section. Indirect string references (when str_format is DW_FORM_strx4) are interpreted as zero-based indexes into the array of offsets following the header. If str_format is DW_FORM_strp or DW_FORM_strp8, this field should be 0.
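The interaction between str_format and local_str_pool_size described in these fields can be sketched as a small decision function. This is only an illustration in Python, not part of the proposal: the StrFormat enum and the string_source function are hypothetical names, and the enum members are stand-ins for the DW_FORM codes rather than their actual encodings.

```python
from enum import Enum


class StrFormat(Enum):
    # Hypothetical stand-ins for the DW_FORM codes; not the real encodings.
    STRP = "DW_FORM_strp"
    STRP8 = "DW_FORM_strp8"
    STRX4 = "DW_FORM_strx4"


def string_source(str_format: StrFormat, local_str_pool_size: int) -> str:
    """Return which table a name-table string reference points into."""
    if str_format is StrFormat.STRX4:
        # Indexes into the offsets array located via the str_offsets field.
        return ".debug_str_offsets"
    # DW_FORM_strp / DW_FORM_strp8: offsets into one of two string tables.
    if local_str_pool_size != 0:
        return "local string pool"
    return ".debug_str"
```

Under this reading, a consumer would need only the header to decide where to look up each name, before touching the name table itself.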
In Section 6.1.1.4.6 Name Table, replace the first paragraph with the following:
The name table immediately follows the hash lookup table. It consists of two arrays: an array of string pointers or indexes, followed immediately by an array of entry offsets. The items in the first array are determined by the str_format field in the section header, and may be 4-byte or 8-byte offsets into either the .debug_str section or the local string pool, or 4-byte indexes into the array of offsets in the .debug_str_offsets section. The items in the second array are section offsets: 4-byte unsigned integers for the DWARF-32 format or 8-byte unsigned integers for the DWARF-64 format. The entry offsets in the second array refer to index entries, and are relative to the start of the entry pool area.
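Because the two arrays in the name table can now have different item widths, a reader has to size them independently. The following Python sketch shows one way to split the table. It assumes little-endian data and that name_count, str_format, and the DWARF-32/64 offset size have already been read from the header; read_name_table and STRING_REF_SIZE are hypothetical names used only for illustration.

```python
import struct

# Width in bytes of each item in the first array, keyed by str_format name.
STRING_REF_SIZE = {"DW_FORM_strp": 4, "DW_FORM_strp8": 8, "DW_FORM_strx4": 4}


def read_name_table(data: bytes, name_count: int, str_format: str, offset_size: int):
    """Split a name table into (string_refs, entry_offsets).

    Assumes little-endian data; offset_size is 4 (DWARF-32) or 8 (DWARF-64).
    """
    ref_code = "I" if STRING_REF_SIZE[str_format] == 4 else "Q"
    off_code = "I" if offset_size == 4 else "Q"
    # The entry-offset array starts right after the string-reference array.
    refs_len = name_count * STRING_REF_SIZE[str_format]
    string_refs = struct.unpack_from(f"<{name_count}{ref_code}", data, 0)
    entry_offsets = struct.unpack_from(f"<{name_count}{off_code}", data, refs_len)
    return list(string_refs), list(entry_offsets)
```

For example, a table with 8-byte string offsets and 4-byte entry offsets (the kind of mix that the change to Section 7.4 permits) is read with str_format "DW_FORM_strp8" and offset_size 4.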
Following Section 6.1.1.4.6, add a new section:
Section 6.1.1.4.7 Local String Pool
The local string pool, if present, immediately follows the name table. It consists of a series of null-terminated strings. Its size is given by local_str_pool_size.
[non-normative] For non-split DWARF compilation units, strings used by the name table will have significant overlap with strings used by the .debug_info section, and a local string pool is not advisable. Relocations for the string references may be minimized by using the indirect string forms in both .debug_info and .debug_names. For split DWARF compilation units with a linker that is aware of and can combine .debug_names sections into a single per-module index, there is likely little overlap, and relocations for string references in the name table can be minimized by using the local string pool. If the linker simply concatenates the per-CU indexes, however, it remains beneficial to use indirect string forms and a separate string table.
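Looking up a name in the local string pool amounts to reading a null-terminated string at the given offset. The Python sketch below illustrates this. It assumes the pool bytes (of length local_str_pool_size) have already been extracted and that the strings are UTF-8 encoded; read_local_string is a hypothetical helper, not something defined by the proposal.

```python
def read_local_string(pool: bytes, offset: int) -> str:
    """Return the null-terminated string starting at `offset` in the pool."""
    if not 0 <= offset < len(pool):
        raise ValueError("offset outside local string pool")
    # bytes.index raises ValueError if no terminator follows the offset.
    end = pool.index(b"\x00", offset)
    return pool[offset:end].decode("utf-8")
```

Since the offsets are relative to the start of the pool, which lives in the same section as the name table, no relocation is needed to resolve them.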
In Section 6.1.1.4.8 Abbreviations Table (was 6.1.1.4.7), change the first sentence to:
The abbreviations table immediately follows the local string pool (or, if the local string pool is absent, the name table).
In Section 7.4, 32-Bit and 64-Bit DWARF Formats, change the second paragraph
to add .debug_names as an exception to the rule against mixing formats:
The 32-bit and 64-bit DWARF format conventions must not be intermixed within a single compilation unit, except for contributions to the .debug_str_offsets, .debug_str_offsets.dwo, and .debug_names sections.
In Appendix B, Figure B.1, add an arc from .debug_names to .debug_str
and to .debug_str_offsets. (There should have already been an arc to
.debug_str in the DWARF 5 spec.)
In Figure B.2, add similar arcs.
2024-07-05: Revised to allow for 64-bit string pool length; added text clarifying when a local string pool should be used.
2024-09-23: Revised to allow mixing 32-bit and 64-bit contributions to
.debug_names section.
2024-10-30: Accepted.