Issue 240320.1: Add Local and Indirect Strings to Name Index

Author: Cary Coutant
Champion:
Date submitted: 2024-03-20
Date revised:
Date closed:
Type: Enhancement
Status: Open
DWARF Version: 6

Background

The name table in the .debug_names section currently references strings via offsets to the .debug_str section. Each string offset, therefore, requires a relocation unless some under-the-table agreement is made between the compiler and linker. This can get expensive, as the number of names can be quite large.

For non-split DWARF, the use of .debug_str makes sense, as many or most of the names in the name index are also referenced from entries in the .debug_info section. If the .debug_info section is using DW_FORM_strx and a .debug_str_offsets section, however, it would also be beneficial for the name index to use string references in the style of DW_FORM_strx, which would eliminate the need for any relocations in the name table.

For split DWARF, there is little overlap between the strings referenced by the skeleton .debug_info section and those referenced by the .debug_names section, so there is no advantage in storing the strings in a separate section.

Overview

This proposal adds two alternative string representations to the name index: one in the style of DW_FORM_strx for non-split DWARF compilation units that use .debug_str_offsets, and one with a local string table stored directly in the .debug_names section that requires no relocations.

Proposed Changes

In Section 6.1.1.2 Structure of the Name Index, change "eight individual parts" to "nine individual parts", and add the following to the enumerated list of parts between items 6 and 7:

7. An optional local string pool.

(Renumber the last two items.)

In Figure 6.1 Name Index Layout (part 1), under "Name Index", add a box for "Local String Pool" between "Name Table" and "Abbrev Table". The box expands to a box on the right labeled "Strings".

Also in Figure 6.1 Name Index Layout (part 1), change "String Offsets" to "String Pointers or Indexes" in the expansion of "Name Table".

In Section 6.1.1.4.1 Section Header, replace field 3 ("padding") with the following:

3. str_format (ubyte)

An enumerated constant that specifies the representation of string references in the name index. The possible values are: DW_FORM_strp, DW_FORM_strp8, and DW_FORM_strx4 (see Section 7.5.5 Classes and Forms).

4. padding (ubyte)

Reserved to DWARF (must be zero).

Renumber fields 4-8 as 5-9, and add the following fields after "name_count" (renumbering the fields that follow accordingly):

10. local_str_pool_size (uword)

Size of the local string pool. If this value is non-zero, string offsets (when str_format is DW_FORM_strp or DW_FORM_strp8) reference the local string pool. If this value is 0, string offsets reference the .debug_str section. If str_format is DW_FORM_strx4, this field should be 0.

11. str_offsets (section offset)

A 4-byte or 8-byte unsigned offset that points to the header of the compilation unit’s contribution to the .debug_str_offsets section. Indirect string references (when str_format is DW_FORM_strx4) are interpreted as zero-based indexes into the array of offsets following the header. If str_format is DW_FORM_strp or DW_FORM_strp8, this field should be 0.
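
[non-normative] The header changes above can be illustrated with a minimal Python sketch. It assumes the DWARF-32 format and little-endian encoding, takes the unchanged fields (unit_length, version, and the five counts) from the DWARF 5 header, and stops after str_offsets without parsing abbrev_table_size or the augmentation string. The names NameIndexHeader, parse_header, and check_header are hypothetical, and the value used for DW_FORM_strp8 is a placeholder, since this proposal does not give one.

```python
import struct
from dataclasses import dataclass

DW_FORM_strp = 0x0E    # value from DWARF 5
DW_FORM_strx4 = 0x28   # value from DWARF 5
DW_FORM_strp8 = 0xFF   # HYPOTHETICAL placeholder; no value is given in this proposal

# unit_length, version, str_format, padding, five counts,
# local_str_pool_size, str_offsets (DWARF-32 and little-endian are assumed)
_FIXED = struct.Struct("<IHBB7I")


@dataclass
class NameIndexHeader:
    unit_length: int
    version: int
    str_format: int
    padding: int
    comp_unit_count: int
    local_type_unit_count: int
    foreign_type_unit_count: int
    bucket_count: int
    name_count: int
    local_str_pool_size: int
    str_offsets: int


def parse_header(data: bytes) -> NameIndexHeader:
    """Decode the fixed header fields up to and including str_offsets."""
    fields = _FIXED.unpack_from(data, 0)
    if fields[0] == 0xFFFFFFFF:
        raise ValueError("DWARF-64 is outside the scope of this sketch")
    return NameIndexHeader(*fields)


def check_header(h: NameIndexHeader) -> list:
    """Return a list of diagnostics for the rules stated in fields 3, 4, 10, and 11."""
    problems = []
    if h.padding != 0:
        problems.append("padding must be zero")
    if h.str_format not in (DW_FORM_strp, DW_FORM_strp8, DW_FORM_strx4):
        problems.append("unknown str_format")
    elif h.str_format == DW_FORM_strx4:
        if h.local_str_pool_size != 0:
            problems.append("local_str_pool_size should be 0 with DW_FORM_strx4")
    elif h.str_offsets != 0:
        problems.append("str_offsets should be 0 with DW_FORM_strp/strp8")
    return problems
```

Because the text says the unused field "should be 0" rather than "must be 0", the sketch reports these cases as diagnostics instead of rejecting the header.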

In Section 6.1.1.4.6 Name Table, replace the first paragraph with the following:

The name table immediately follows the hash lookup table. It consists of two arrays: an array of string pointers or indexes, followed immediately by an array of entry offsets. The items in the first array are determined by the str_format field in the section header, and may be 4-byte or 8-byte offsets into either the .debug_str section or the local string pool, or 4-byte indexes into the array of offsets in the .debug_str_offsets section. The items in the second array are section offsets: 4-byte unsigned integers for the DWARF-32 format or 8-byte unsigned integers for the DWARF-64 format. The entry offsets in the second array refer to index entries, and are relative to the start of the entry pool area.
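
[non-normative] As an illustration of the two-array layout described above, the following Python sketch reads a name table from a byte buffer. It assumes little-endian encoding, and it assumes that DW_FORM_strp keeps its DWARF 5 meaning of a section-offset-sized value (4 bytes in DWARF-32, 8 bytes in DWARF-64) while DW_FORM_strp8 is always 8 bytes; the proposal text does not spell this out. The function names are hypothetical, and the DW_FORM_strp8 value is a placeholder.

```python
import struct

DW_FORM_strp = 0x0E    # value from DWARF 5
DW_FORM_strx4 = 0x28   # value from DWARF 5
DW_FORM_strp8 = 0xFF   # HYPOTHETICAL placeholder; no value is given in this proposal


def string_ref_size(str_format: int, offset_size: int) -> int:
    """Width in bytes of one item in the first array (assumptions noted above)."""
    if str_format == DW_FORM_strx4:
        return 4
    if str_format == DW_FORM_strp8:
        return 8
    if str_format == DW_FORM_strp:
        return offset_size  # assumption: follows the DWARF-32/DWARF-64 offset size
    raise ValueError("unsupported str_format")


def read_name_table(data: bytes, pos: int, name_count: int,
                    str_format: int, offset_size: int = 4):
    """Return (string refs, entry offsets, position just past the name table)."""
    code = {4: "I", 8: "Q"}
    ref_size = string_ref_size(str_format, offset_size)
    # First array: string pointers or indexes, one per name.
    refs = struct.unpack_from(f"<{name_count}{code[ref_size]}", data, pos)
    pos += name_count * ref_size
    # Second array: entry offsets, relative to the start of the entry pool.
    entry_offsets = struct.unpack_from(f"<{name_count}{code[offset_size]}", data, pos)
    pos += name_count * offset_size
    return list(refs), list(entry_offsets), pos
```

The returned position is where the next part of the name index begins, which makes it straightforward to chain this with a reader for whatever follows the name table.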

Following Section 6.1.1.4.6, add a new section:

Section 6.1.1.4.7 Local String Pool

The local string pool, if present, immediately follows the name table. It consists of a series of null-terminated strings. Its size is given by local_str_pool_size.
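
[non-normative] The way a consumer might resolve one name under the three representations can be sketched in Python as follows. The sketch assumes little-endian encoding, the DWARF-32 format, and the DWARF 5 layout of the .debug_str_offsets header (unit_length, version, padding: 8 bytes in DWARF-32). The names resolve_name and read_cstring are hypothetical, and the DW_FORM_strp8 value is a placeholder.

```python
import struct

DW_FORM_strp = 0x0E    # value from DWARF 5
DW_FORM_strx4 = 0x28   # value from DWARF 5
DW_FORM_strp8 = 0xFF   # HYPOTHETICAL placeholder; no value is given in this proposal

# DWARF-32 .debug_str_offsets header: unit_length(4) + version(2) + padding(2)
STR_OFFSETS_HEADER_SIZE = 8


def read_cstring(buf: bytes, offset: int) -> str:
    """Read a null-terminated string starting at the given offset."""
    end = buf.index(b"\x00", offset)
    return buf[offset:end].decode("utf-8")


def resolve_name(ref: int, str_format: int, local_pool: bytes, debug_str: bytes,
                 debug_str_offsets: bytes = b"", str_offsets: int = 0) -> str:
    """Turn one name-table string reference into a string.

    local_pool holds the local_str_pool_size bytes of the local string pool
    (empty when that field is zero).
    """
    if str_format == DW_FORM_strx4:
        # Zero-based index into the offsets array that follows the header.
        slot = str_offsets + STR_OFFSETS_HEADER_SIZE + 4 * ref
        (offset,) = struct.unpack_from("<I", debug_str_offsets, slot)
        return read_cstring(debug_str, offset)
    if str_format in (DW_FORM_strp, DW_FORM_strp8):
        # A non-empty local pool takes the place of .debug_str.
        pool = local_pool if len(local_pool) > 0 else debug_str
        return read_cstring(pool, ref)
    raise ValueError("unsupported str_format")
```

For example, with a local pool containing the bytes of "main" and "foo", each null-terminated, the reference 5 under DW_FORM_strp resolves to "foo" without consulting .debug_str at all.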

[non-normative] For non-split DWARF compilation units, strings used by the name table will have significant overlap with strings used by the .debug_info section, and a local string pool is not advisable. Relocations for the string references may be minimized by using the indirect string forms in both .debug_info and .debug_names. For split DWARF compilation units, there is likely little overlap, and relocations for string references in the name table can be minimized by using the local string pool.

In Section 6.1.1.4.8 Abbreviations Table (was 6.1.1.4.7), change the first sentence to:

The abbreviations table immediately follows the local string pool (or, if the local string pool is absent, the name table).
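
[non-normative] The placement rule above reduces to simple offset arithmetic, sketched here in Python. The sketch assumes that no alignment padding is inserted after the local string pool, which the proposal text does not address, and that local_str_pool_size counts bytes. The function name and its parameters are hypothetical.

```python
def locate_after_name_table(name_table_offset: int, name_count: int,
                            ref_size: int, offset_size: int,
                            local_str_pool_size: int) -> dict:
    """Compute where the local string pool and abbreviations table begin.

    ref_size is the width of one string pointer or index (4 or 8);
    offset_size is the width of one entry offset (4 or 8).
    """
    # The name table is two arrays of name_count items each.
    name_table_size = name_count * (ref_size + offset_size)
    local_pool_offset = name_table_offset + name_table_size
    # With a zero-sized pool, the abbreviations table directly follows the name table.
    abbrev_offset = local_pool_offset + local_str_pool_size
    return {"local_str_pool": local_pool_offset, "abbrev_table": abbrev_offset}
```

When local_str_pool_size is zero, both offsets coincide, which matches the "if the local string pool is absent" case in the sentence above.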

In Appendix B, Figure B.1, add arcs from .debug_names to .debug_str and to .debug_str_offsets. (The arc to .debug_str should already have been present in the DWARF 5 spec.)

In Figure B.2, add similar arcs.