DWARF Standard



211108.2 | Cary Coutant | Allow Non-Uniform Record Formats in the File Name Table | Enhancement | Open | Cary Coutant


Section 6.2.4, pg 156
Proposal to Allow Non-Uniform Record Formats in the File Name Table

October 31, 2021
Revised November 8, 2021


Background
----------

In Issue 180201.1 ("DWARF and source text embedding"), a mechanism was
proposed for embedding source text in the line table, and for making the
MD5 component optional, so that a line table may have a mixture of file
name entries with and without MD5 checksums. The committee decided to
split the second part of that proposal into a separate issue, noting
that the use of a separate boolean field (DW_LNCT_is_MD5) was perhaps
not the best approach to the problem.

Alternatives suggested were:

(1) Reserve the value 0 for an absent checksum, so that no separate flag
is required. The odds of a file having a 0 checksum are no greater than
those of a file's checksum colliding with that of another. We could
simply specify that a file whose actual checksum is 0 is written with a
checksum of 1.

(2) Generalize the problem so that each file name entry in the line
number program header could have a custom format rather than impose a
uniform format on all file name entries.

This proposal describes the second alternative.


New Directory and File Name Entry Format
----------------------------------------

The line number program header contains an abbreviation table, with a
sequence of abbreviation declarations. The abbreviation table is shared
by the directory table and the file name table.

Each abbreviation declaration consists of:

(1) An abbreviation code.
(2) A sequence of pairs, each consisting of a content type code and a
    form.
(3) A (0,0) pair terminating the declaration.

Each value in the abbreviation declaration is an unsigned LEB128 value.
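
As an illustration, the declaration layout above can be decoded with a
short Python sketch. The function names read_uleb128 and
read_abbrev_declaration are hypothetical, and the sample bytes are
invented for this example. The numeric values of the DW_LNCT_* and
DW_FORM_* codes are those assigned in DWARF 5.

```python
def read_uleb128(data, pos):
    """Decode one unsigned LEB128 value; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def read_abbrev_declaration(data, pos):
    """Decode one declaration; return (code, pairs, next_pos)."""
    code, pos = read_uleb128(data, pos)
    pairs = []
    while True:
        content_type, pos = read_uleb128(data, pos)
        form, pos = read_uleb128(data, pos)
        if content_type == 0 and form == 0:
            # The (0,0) pair terminates the declaration.
            return code, pairs, pos
        pairs.append((content_type, form))


# Code values as assigned in DWARF 5.
DW_LNCT_path, DW_LNCT_directory_index = 0x1, 0x2
DW_FORM_string, DW_FORM_udata = 0x08, 0x0F

# Hypothetical declaration: abbreviation code 1 with two pairs,
# (DW_LNCT_path, DW_FORM_string) and
# (DW_LNCT_directory_index, DW_FORM_udata), then the (0,0) terminator.
encoded = bytes([0x01, 0x01, 0x08, 0x02, 0x0F, 0x00, 0x00])
code, pairs, end = read_abbrev_declaration(encoded, 0)
print(code, pairs, end)
```

Because a content type code of 0 is not assigned, the (0,0) pair cannot
be confused with a real format description.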

Each directory and file name entry begins with an abbreviation code
(similar to that used in the DIE representation), followed by a sequence
of components as indicated by the abbreviation code.

The content type codes, and the form codes that may be paired with each
content type code, are as given in Section 6.2.4.1 ("Standard Content
Descriptions").
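
A minimal Python sketch of how a consumer might decode entries under
this scheme follows. It assumes the abbreviation table has already been
read into a dictionary, and it handles only three forms. The names
read_form and read_entry, and the sample table contents, are
hypothetical; the DW_LNCT_* and DW_FORM_* values are those assigned in
DWARF 5.

```python
DW_LNCT_path, DW_LNCT_directory_index, DW_LNCT_MD5 = 0x1, 0x2, 0x5
DW_FORM_string, DW_FORM_udata, DW_FORM_data16 = 0x08, 0x0F, 0x1E


def read_uleb128(data, pos):
    """Decode one unsigned LEB128 value; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def read_form(data, pos, form):
    """Decode one component value; return (value, next_pos)."""
    if form == DW_FORM_udata:
        return read_uleb128(data, pos)
    if form == DW_FORM_string:
        end = data.index(0, pos)  # inline null-terminated string
        return data[pos:end].decode("utf-8"), end + 1
    if form == DW_FORM_data16:
        return data[pos:pos + 16], pos + 16
    raise ValueError(f"form 0x{form:x} is not handled in this sketch")


def read_entry(data, pos, abbrevs):
    """Decode one directory or file name entry; return (entry, next_pos)."""
    code, pos = read_uleb128(data, pos)
    entry = {}
    for content_type, form in abbrevs[code]:
        entry[content_type], pos = read_form(data, pos, form)
    return entry, pos


# Two hypothetical abbreviations: one without and one with an MD5.
abbrevs = {
    1: [(DW_LNCT_path, DW_FORM_string),
        (DW_LNCT_directory_index, DW_FORM_udata)],
    2: [(DW_LNCT_path, DW_FORM_string),
        (DW_LNCT_directory_index, DW_FORM_udata),
        (DW_LNCT_MD5, DW_FORM_data16)],
}
md5 = bytes(range(16))  # placeholder digest, not a real checksum
table = b"\x01a.c\x00\x00" + b"\x02b.h\x00\x01" + md5

first, pos = read_entry(table, 0, abbrevs)
second, pos = read_entry(table, pos, abbrevs)
print(first)
print(second)
```

The first entry carries no checksum at all, while the second carries a
16-byte one, which is the mixture that a uniform entry format cannot
express.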


Proposed Changes to the DWARF Specification
-------------------------------------------

The line number program header contains the following fields, which
replace items 13 through 20 in Section 6.2.4 of the DWARF 5
specification:

13. abbrev_count (ubyte)

    A 1-byte unsigned integer containing the number of abbreviation
    declarations in the abbreviation table.

14. abbrev_table (sequence of abbreviation declarations)

    A sequence of abbreviation declarations. Each abbreviation
    declaration consists of the following:

    * An abbreviation code, a ULEB128 value.
    * A sequence of entry format descriptions. Each description
      consists of a pair of ULEB128 values: (a) a content type code (see
      Sections 6.2.4.1 and 6.2.4.2), and (b) a form code (using the
      attribute form codes).
    * A pair of zero bytes (a content type code of 0 and a form code
      of 0) to terminate the declaration.

    The abbreviation declarations describe the layout of the entries
    in both the directories table and the file names table, below.

15. directories_count (ULEB128)

    A count of the number of directory entries in the directory table.

16. directories (sequence of directory entries)

    A sequence of directory entries. Each entry consists of:

    * An abbreviation code, a ULEB128 value.
    * A sequence of values as described by the abbreviation declaration
      corresponding to the abbreviation code.

    Each directory entry describes a path that was searched for included
    source files in this compilation, including the compilation
    directory of the compilation. (The paths include those directories
    specified by the user for the compiler to search and those the
    compiler searches without explicit direction.)

    The first directory entry is the current directory of the
    compilation. Each additional path entry is either a full path name
    or is relative to the current directory of the compilation.

    The line number program assigns a number (index) to each of the
    directory entries in order, beginning with 0.

17. file_names_count (ULEB128)

    A count of the number of file name entries in the file names
    table.

18. file_names (sequence of file name entries)

    A sequence of file name entries. Each entry consists of:

    * An abbreviation code, a ULEB128 value.
    * A sequence of values as described by the abbreviation declaration
      corresponding to the abbreviation code.

    Each file name entry describes a source file that contributes to the
    line number information for this compilation unit, or is used in
    other contexts, such as in a declaration coordinate or a macro file
    inclusion.

    The first file name entry is the primary source file, whose file
    name exactly matches that given in the DW_AT_name attribute in the
    compilation unit debugging information entry.

    The line number program references file names in this sequence
    beginning with 0, and uses those numbers instead of file names in
    the line number program that follows.
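
For the producer side, the following Python sketch emits proposed
fields 13 through 18 as a byte string. It is an illustration under
stated assumptions, not a reference encoder: the function names
(encode_uleb128, encode_value, encode_tables), the input data
structures, and the sample paths are all hypothetical, and only three
forms are handled. The DW_LNCT_* and DW_FORM_* values are those
assigned in DWARF 5.

```python
DW_LNCT_path, DW_LNCT_directory_index, DW_LNCT_MD5 = 0x1, 0x2, 0x5
DW_FORM_string, DW_FORM_udata, DW_FORM_data16 = 0x08, 0x0F, 0x1E


def encode_uleb128(value):
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_value(form, value):
    """Encode one component value in the given form."""
    if form == DW_FORM_udata:
        return encode_uleb128(value)
    if form == DW_FORM_string:
        return value.encode("utf-8") + b"\x00"
    if form == DW_FORM_data16:
        if len(value) != 16:
            raise ValueError("DW_FORM_data16 requires exactly 16 bytes")
        return bytes(value)
    raise ValueError(f"form 0x{form:x} is not handled in this sketch")


def encode_tables(abbrevs, directories, files):
    """Emit fields 13-18.

    abbrevs maps an abbreviation code to a list of (content type, form)
    pairs; directories and files are lists of (code, values) tuples.
    """
    # 13. abbrev_count is a ubyte, so bytearray() rejects counts > 255.
    out = bytearray([len(abbrevs)])
    for code, pairs in abbrevs.items():      # 14. abbrev_table
        out += encode_uleb128(code)
        for content_type, form in pairs:
            out += encode_uleb128(content_type) + encode_uleb128(form)
        out += b"\x00\x00"                   # (0,0) terminator
    for entries in (directories, files):     # 15-16, then 17-18
        out += encode_uleb128(len(entries))
        for code, values in entries:
            pairs = abbrevs[code]
            if len(values) != len(pairs):
                raise ValueError("entry does not match its abbreviation")
            out += encode_uleb128(code)
            for (_, form), value in zip(pairs, values):
                out += encode_value(form, value)
    return bytes(out)


abbrevs = {
    1: [(DW_LNCT_path, DW_FORM_string)],
    2: [(DW_LNCT_path, DW_FORM_string),
        (DW_LNCT_directory_index, DW_FORM_udata)],
    3: [(DW_LNCT_path, DW_FORM_string),
        (DW_LNCT_directory_index, DW_FORM_udata),
        (DW_LNCT_MD5, DW_FORM_data16)],
}
directories = [(1, ["/src"]), (1, ["include"])]
files = [(2, ["main.c", 0]),
         (3, ["util.h", 1, bytes(16)])]  # all-zero placeholder digest
blob = encode_tables(abbrevs, directories, files)
print(blob.hex())
```

In this sketch one abbreviation table serves both the directory table
and the file name table, so a file with a checksum costs one extra
declaration rather than a checksum field on every entry.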



