DWARF Standard


161018.1 Simon Brand DWARF-embedded source for online-compiled programs Enhancement Deferred to Ver. 6

Programming models such as OpenCL can often have source generated at runtime,
which is compiled online, with its output not written to file. This raises an
issue for the compiler: in the generated DWARF, what should it put as the file
name of the compile unit and associated line table information?

Common solutions to this problem include generating some temporary source file
name and having a contract with the debugger to get the source somehow and write
it out to that file. Since OpenCL and friends generally have quite small source
files, it's quite reasonable to embed the entire source in the binary, then have
the debugger look in a known section or address to extract the source. If there
were a way to express this in DWARF, then runtime-generated source files could
work without an additional contract between the compiler and debugger. This is
particularly important when dealing with platforms where the filesystem is not 
writable, which is a common situation in mobile computing.

Here is a proposal for embedding the source file in the DWARF string tables and
referencing it from the compile unit DIE and line table. This proposal has been
discussed in the Dwarf-Discuss thread entitled "DWARF and online-compiled programs".

Changes to compile unit sections:
I added a DW_AT_source attribute, which is a string attribute identifying the 
source code. The intention is for implementations to use DW_FORM_strp so that 
the string is held in the .debug_str section and referenced from both the 
compile unit DIE and line table.
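
To illustrate the sharing described above, here is a minimal sketch of a
deduplicating string table. The class DebugStrTable and its methods are
hypothetical names made up for this example, not part of DWARF or of any
existing tool; it only models the idea that one copy of the source text
gets one offset that both consumers can reference.

```python
class DebugStrTable:
    """Toy model of a .debug_str section: null-terminated, deduplicated strings."""

    def __init__(self):
        self.data = bytearray()
        self._offsets = {}  # encoded string -> offset of its first byte

    def add(self, text: str) -> int:
        """Return the offset of `text`, appending it only if it is new."""
        encoded = text.encode("utf-8")
        if b"\x00" in encoded:
            raise ValueError("a null-terminated string cannot contain NUL")
        if encoded not in self._offsets:
            self._offsets[encoded] = len(self.data)
            self.data += encoded + b"\x00"
        return self._offsets[encoded]

    def get(self, offset: int) -> str:
        """Read the null-terminated string that starts at `offset`."""
        end = self.data.index(b"\x00", offset)
        return self.data[offset:end].decode("utf-8")


table = DebugStrTable()
kernel = "__kernel void k() {}\n"
cu_offset = table.add(kernel)    # offset used by DW_AT_source in the CU DIE
line_offset = table.add(kernel)  # offset used by the line table entry
```

Both calls yield the same offset, so the source text is stored once.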

Section 3.1.1:
Replace bullet 2 with this:
A DW_AT_name or DW_AT_source attribute identifying the primary source from 
which the compilation unit was derived. If a DW_AT_name attribute is used, 
its value is a null-terminated string containing the full or relative path 
name of the source file. If a DW_AT_source attribute is used, its value is 
a null-terminated string containing the full contents of the source code 
from which the compilation unit was derived. The source code string is 
UTF-8 encoded and encodes line endings with `\n`.
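
As a rough sketch of what a producer might do to satisfy the encoding
rules above, the following helper normalizes line endings and appends the
terminator. The function name normalize_embedded_source is an assumption
for illustration only. Rejecting embedded NUL bytes is also an assumption
of this sketch; the proposal does not say how they should be handled.

```python
def normalize_embedded_source(text: str) -> bytes:
    """Return `text` as UTF-8 with '\\n' line endings and a trailing NUL."""
    # Convert Windows (\r\n) and old Mac (\r) line endings to \n.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    encoded = normalized.encode("utf-8")
    if b"\x00" in encoded:
        # Assumption: an embedded NUL would truncate the string, so refuse it.
        raise ValueError("source text contains a NUL byte")
    return encoded + b"\x00"


payload = normalize_embedded_source("int a;\r\nint b;\rint c;\n")
```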

Figure 2:
Add DW_AT_source, which identifies "Embedded source code".

Figure 20:
Add DW_AT_source, whose class is "string"

Figure 42:
Add DW_AT_source to DW_TAG_compile_unit and DW_TAG_partial_unit entries.


Changes to line table sections:
I have based my modifications on issue 140724.1. I don't know if that 
issue has since been modified, so there may be some inconsistencies.

These changes are a bit more complex, as there is currently the assumption 
that a given .debug_line section will only have a single file_name_entry_format.
This would not support having a mix of usual source files and source-in-memory
in the same program.

One solution would be to add the concept of a file name entry set, of which 
there can be more than one in a given header, and each can have its own 
file_name_entry_format. The header would contain a field specifying the 
number of file_name_entry_sets, then fields 17-21 would be repeated for 
each set. Another possibility would be to encode the sets in the same 
file_name_entry_format and file_names fields, but specify the sizes of 
each set. This is not quite as clear, but it seems desirable to avoid 
repeating the fields. I've sketched out the second option below.

    Field  Field Name                          Value(s)
    1      Same as in Version 4                ...
    2      version                             5
    3      Not present in Version 4            -
    4      Not present in Version 4            -
    5-12   Same as in Version 4                ...
    13     directory_entry_format_count        1
    14     directory_entry_format              DW_LNCT_path, DW_FORM_string
    15     directories_count                   <n+1>
    16     directories                         <n+1>*<null terminated string>
    17     file_name_entry_set_count           2
    18     file_name_entry_format_set_counts   4, 2
    19     file_name_entry_format              DW_LNCT_path, DW_FORM_string,
                                               DW_LNCT_directory_index, DW_FORM_udata,
                                               DW_LNCT_timestamp, DW_FORM_udata,
                                               DW_LNCT_size, DW_FORM_udata,
                                               DW_LNCT_source, DW_FORM_strp,
                                               DW_LNCT_size, DW_FORM_udata
    20     file_name_set_count                 <m>, <n>
    21     file_names                          <m>*{<null terminated string>,
                                               <index>, <timestamp>, <size>},
                                               <n>*{<source offset>, <size>}
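
To make the second option more concrete, here is a sketch of how a
consumer might walk fields 17-21 as laid out in the table. It rests on
several assumptions that this proposal does not fix: the format pairs and
the per-set entry counts are read as ULEB128 values, DW_FORM_strp is a
4-byte little-endian offset (32-bit DWARF), and DW_LNCT_source has the
proposed value 0x6. The function names are made up for the example.

```python
import struct

DW_LNCT_path, DW_LNCT_directory_index = 0x1, 0x2
DW_LNCT_timestamp, DW_LNCT_size = 0x3, 0x4
DW_LNCT_source = 0x6  # value proposed in this issue, not an assigned code
DW_FORM_string, DW_FORM_strp, DW_FORM_udata = 0x08, 0x0E, 0x0F


def read_uleb128(buf, pos):
    """Decode one unsigned LEB128 value; return (value, next position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def read_form(buf, pos, form):
    """Decode one value of the given form; return (value, next position)."""
    if form == DW_FORM_string:
        end = buf.index(b"\x00", pos)
        return buf[pos:end].decode("utf-8"), end + 1
    if form == DW_FORM_udata:
        return read_uleb128(buf, pos)
    if form == DW_FORM_strp:  # assumption: 32-bit DWARF, little-endian
        (offset,) = struct.unpack_from("<I", buf, pos)
        return offset, pos + 4
    raise ValueError(f"unsupported form {form:#x}")


def parse_file_name_sets(buf, pos=0):
    """Parse proposed fields 17-21; return (list of entry sets, next position)."""
    set_count = buf[pos]                               # field 17 (ubyte)
    pos += 1
    format_counts = list(buf[pos:pos + set_count])     # field 18 (ubytes)
    pos += set_count
    formats = []                                       # field 19
    for count in format_counts:
        pairs = []
        for _ in range(count):
            code, pos = read_uleb128(buf, pos)
            form, pos = read_uleb128(buf, pos)
            pairs.append((code, form))
        formats.append(pairs)
    entry_counts = []                                  # field 20
    for _ in range(set_count):
        count, pos = read_uleb128(buf, pos)
        entry_counts.append(count)
    sets = []                                          # field 21
    for pairs, count in zip(formats, entry_counts):
        entries = []
        for _ in range(count):
            entry = {}
            for code, form in pairs:
                entry[code], pos = read_form(buf, pos, form)
            entries.append(entry)
        sets.append(entries)
    return sets, pos


# One ordinary file ("a.cl", 10 bytes) and one embedded source (42 bytes
# at .debug_str offset 0x20), matching the 4, 2 layout in the table.
example = (
    bytes([2, 4, 2])
    + bytes([0x1, 0x08, 0x2, 0x0F, 0x3, 0x0F, 0x4, 0x0F])
    + bytes([0x6, 0x0E, 0x4, 0x0F])
    + bytes([1, 1])
    + b"a.cl\x00" + bytes([0, 0, 10])
    + struct.pack("<I", 0x20) + bytes([42])
)
sets, end = parse_file_name_sets(example)
```

Each entry comes back as a dictionary keyed by content code, so a
consumer can tell the two kinds of entry apart by checking whether
DW_LNCT_path or DW_LNCT_source is present.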

Section 6.2.4:
Add bullets after "16. directories"
17. file_name_entry_set_count (ubyte)
A count of the number of file name entry sets that occur in the following 
fields. If this field is zero, then the file_name_entry_format_set_counts 
field (see below) must also be empty.

18. file_name_entry_format_set_counts (sequence of ubytes)
A sequence of counts of the number of entry formats for each file name 
entry set.

Add bullet after "5. DW_LNCT_MD5"
6. DW_LNCT_source
The component is an offset into the .debug_str section where a 
null-terminated string contains the source code from which the 
compilation unit was derived. The string is UTF-8 encoded and encodes 
line endings using `\n`. Only one of DW_LNCT_path and DW_LNCT_source 
will be specified for a given file_name_entry_format. This content 
code is paired with the form DW_FORM_strp.
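
As an illustration of how a debugger might use such an offset, the sketch
below reads the string out of a .debug_str image and returns a single
source line. The function names are hypothetical, and the 1-based line
numbering is an assumption chosen to match line table conventions.

```python
def read_embedded_source(debug_str: bytes, offset: int) -> str:
    """Return the null-terminated UTF-8 string starting at `offset`."""
    end = debug_str.index(b"\x00", offset)
    return debug_str[offset:end].decode("utf-8")


def source_line(debug_str: bytes, offset: int, line_number: int) -> str:
    """Return line `line_number` (1-based) of the embedded source."""
    # Split on '\n' only, since that is the line ending the text specifies.
    lines = read_embedded_source(debug_str, offset).split("\n")
    return lines[line_number - 1]


debug_str = b"unrelated\x00__kernel void k() {\n}\n\x00"
source_offset = len(b"unrelated\x00")
first = source_line(debug_str, source_offset, 1)
```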

Append paragraph to bullet 1:
Only one of DW_LNCT_path and DW_LNCT_source will be specified for a 
given file_name_entry_format.
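
The exclusivity rule could be checked by a producer or validator roughly
as follows. This is only a sketch: check_entry_format is a made-up name,
and DW_LNCT_source uses the value 0x6 proposed in this issue.

```python
DW_LNCT_path = 0x1
DW_LNCT_source = 0x6  # value proposed in this issue, not an assigned code


def check_entry_format(content_codes):
    """Raise ValueError if a format names both a path and embedded source."""
    codes = set(content_codes)
    if DW_LNCT_path in codes and DW_LNCT_source in codes:
        raise ValueError(
            "file_name_entry_format has both DW_LNCT_path and DW_LNCT_source"
        )
    return True


ordinary_ok = check_entry_format([0x1, 0x2, 0x3, 0x4])  # path-based set
embedded_ok = check_entry_format([0x6, 0x4])            # source-based set
```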

Add paragraph after the first paragraph of bullet 2:
The index is 0 if the source is identified by a memory location.

Table 7.25:
Add DW_LNCT_source 0x6 to the table

The description for DW_LNE_define_file will also need updating with 
similar text.

