|
DWARF Standard
|
211108.2 |
Cary Coutant |
Allow Non-Uniform Record Formats in the File Name Table |
Enhancement |
Accepted with modifications |
Cary Coutant |
Section 6.2.4, pg 156
Proposal to Allow Non-Uniform Record Formats in the File Name Table
Background
----------
In Issue 180201.1 ("DWARF and source text embedding"), a mechanism was
proposed for embedding source text in the line table, and for making the
MD5 component optional, so that a line table may have a mixture of file
name entries with and without MD5 checksums. The committee decided to
split the second part of that proposal into a separate issue, noting
that the use of a separate boolean field (DW_LNCT_is_MD5) was perhaps
not the best approach to the problem.
Alternatives suggested were:
(1) Reserve the value 0 for an absent checksum, so that no separate flag
is required. The odds of a file having a 0 checksum are no greater than
that of a file's checksum colliding with that of another. We could
simply designate that a file whose checksum is 0 would be written with a
checksum of 1.
(2) Generalize the problem so that each file name entry in the line
number program header could have a custom format rather than impose a
uniform format on all file name entries.
This proposal describes the second alternative.
New Directory and File Name Entry Format
----------------------------------------
The line number program header contains two format tables, which
describe the record formats used by the directory and file name tables.
Each format table contains one or more format descriptors. The number of
format descriptors in each table is given by a separate field in the
program header.
Each format descriptor consists of a sequence of pairs, each consisting
of a content type code and a form, both represented as unsigned LEB128
values, terminated by a (0,0) pair.
Each directory and file name entry begins with a format code, which is
the index of a format descriptor in the corresponding format table. The
format code is followed by a sequence of fields as described by the
indicated format descriptor.
The content type codes, and the form codes that may be paired with each
content type code, are as given in Section 6.2.4.1 ("Standard Content
Descriptions").
Proposed Changes to the DWARF Specification
-------------------------------------------
The line number program header contains the following fields, which
replace items 13 through 20 in Section 6.2.4 of the DWARF 5
specification:
13. directory_format_count (ubyte)
A 1-byte unsigned integer containing the number of format
descriptors in the directory format table.
14. directory_format_table (sequence of record format descriptors)
A sequence of record format descriptors. Each descriptor consists
of the following:
* A sequence of field descriptors. Each field descriptor consists
of a pair of unsigned LEB128 values: (a) a content type code (see
Sections 6.2.4.1 and 6.2.4.2), and (b) a form code (using the
attribute form codes).
* A pair of zero bytes to terminate the declaration.
The line number program numbers the record format descriptors
sequentially, beginning with 0.
The format declarations describe the layout of the entries
in the directories table, below.
15. directories_count (ULEB128)
A count of the number of directory entries in the directory table.
16. directories (sequence of directory entries)
A sequence of directory entries. Each entry consists of:
* A format code (ubyte), which selects a record format descriptor
from the directory format table, above, by its index.
* A sequence of fields as described by the selected record format
descriptor.
Each directory entry describes a path that was searched for included
source files in this compilation, including the compilation
directory of the compilation. (The paths include those directories
specified by the user for the compiler to search and those the
compiler searches without explicit direction.)
The first directory entry is the current directory of the
compilation. Each additional path entry is either a full path name
or is relative to the current directory of the compilation.
The line number program assigns a number (index) to each of the
directory entries in order, beginning with 0.
[Keep the non-normative text from the DWARF 5 document.]
17. file_name_format_count (ubyte)
A 1-byte unsigned integer containing the number of format
descriptors in the file name format table.
18. file_name_format_table (sequence of record format descriptors)
A sequence of record format descriptors. Each descriptor consists
of the following:
* A sequence of field descriptors. Each field descriptor consists
of a pair of unsigned LEB128 values: (a) a content type code (see
Sections 6.2.4.1 and 6.2.4.2), and (b) a form code (using the
attribute form codes).
* A pair of zero bytes to terminate the declaration.
The line number program numbers the record format descriptors
sequentially, beginning with 0.
The format declarations describe the layout of the entries
in the file names table, below.
19. file_names_count (ULEB128)
A count of the number of file name entries in the file name table.
20. file_names (sequence of file name entries)
A sequence of file name entries. Each entry consists of:
* A format code (ubyte), which selects a record format descriptor
from the file name format table, above, by its index.
* A sequence of fields as described by the selected record format
descriptor.
Each file name entry describes a source file that contributes to the
line number information for this compilation unit, or is used in
other contexts, such as in a declaration coordinate or a macro file
inclusion.
The first file name entry is the primary source file, whose file
name exactly matches that given in the DW_AT_name attribute in the
compilation unit debugging information entry.
The line number program references file names in this sequence
beginning with 0, and uses those numbers instead of file names in
the line number program that follows.
[Keep the non-normative text from the DWARF 5 document.]
If the directory format table or the file name format table contains
exactly one item, and all the field descriptors use form codes with a
fixed-length representation, a consumer may be able to optimize its
access to the directory table or file name table, respectively, by
computing the entry offset from the entry index.
--
In Section 7.22, on page 236, change "The version number in the line
number program header is 5" to "... 6".
In Appendix G, Table G.1, on page 416, enter "6" for ".debug_line", in
the new column for DWARF V6.
--
2022-07-05: Revised: Previous version: http://dwarfstd.org/issues/211108.2-1.html
2022-07-25: Accepted with modifications:
Change ubyte to ULEB in 13, 16, 17, and 20.
Remove non-normative paragraph after 20.