Issue 090402.1: DWARF support for overlays

Author: John DelSignore
Champion: John DelSignore
Date submitted: 2009-04-02
Date revised:
Date closed:
Type: Clarification
Status: Rejected
DWARF Version: 4
Does DWARF have support for overlays?

I think the answer is no, but I wanted to double check and solicit 
suggestions for workarounds or changes to the DWARF standard to support 
overlays.

Overlays occur in the SPU processor of the Cell Broadband Engine (CBE).
Wikipedia has a overview of overlays here <a href="http://en.wikipedia.org/wiki/Overlay_(programming)">en.wikipedia.org/wiki/Overlay_(programming)</a> 
and the binutils ld documentation talks about overlays here 
<a href="http://sourceware.org/binutils/docs/ld/Overlay-Description.html">sourceware.org/binutils/docs/ld/Overlay-Description.html</a>.

So, the question is this... Is there a way to determine the program 
segment to which a particular DWARF DIE or CIE/FDE belongs, when 
multiple program segments are linked at the same address due to SPU 
overlays?

Here is the problem, as seen with a simple SPU overlay example, which 
I took from the Cell SDK and changed slightly. The source code inside
in the overlaid sections looks like this:

------------------------------cut-here------------------------------
% more olay*/test*.c
::::::::::::::
olay1/test1.c
::::::::::::::
/* --------------------------------------------------------------- */
/* (C)Copyright 2006                                               */
/* International Business Machines Corporation,                    */
/* All Rights Reserved.                                            */
/*                                                                 */
/* This program is made available under the terms of the           */
/* Common Public License v1.0 which accompanies this distribution. */
/* --------------------------------------------------------------- */
/* PROLOG END TAG zYx                                              */
#include &lt;stdio.h&gt;

int o1_test1(int p)
{
        printf("o1_test1 prints %d\n", p);
        return 1;
}

int o1_test2(int p)
{
        printf("o1_test2 prints %d\n", p);
        return 2;
}
::::::::::::::
olay2/test2.c
::::::::::::::
/* --------------------------------------------------------------- */
/* (C)Copyright 2006                                               */
/* International Business Machines Corporation,                    */
/* All Rights Reserved.                                            */
/*                                                                 */
/* This program is made available under the terms of the           */
/* Common Public License v1.0 which accompanies this distribution. */
/* --------------------------------------------------------------- */
/* PROLOG END TAG zYx                                              */
#include &lt;stdio.h&gt;

int o2_test1(int p)
{
        printf("o2_test1 prints %d\n", p);
        return 11;
}

int o2_test2(int p)
{
        printf("o2_test2 prints %d\n", p);
        return 12;
}
------------------------------cut-here------------------------------

When the above SPU code is linked, multiple program segments are linked 
at the same virtual address (overlaid). The ELF program header for a 
sample program looks like this for the SPU image:

------------------------------cut-here------------------------------
% spu-readelf -lS spu_main
There are 26 section headers, starting at offset 0x792c:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .init             PROGBITS        00000000 000100 000024 00  AX  0   0  4
  [ 2] .text             PROGBITS        00000030 000130 000ad0 00 WAX  0   0 16
  [ 3] .segment1         PROGBITS        00000b00 000c00 000080 00  AX  0   0 16
  [ 4] .segment2         PROGBITS        00000b00 000c80 000080 00  AX  0   0 16
  [ 5] .fini             PROGBITS        00000b80 000d00 00001c 00  AX  0   0  4
  [ 6] .rodata           PROGBITS        00000ba0 000d20 0000d0 00   A  0   0 16
  [ 7] .ctors            PROGBITS        00000c80 000e00 000008 00  WA  0   0  4
  [ 8] .dtors            PROGBITS        00000c88 000e08 000008 00  WA  0   0  4
  [ 9] .jcr              PROGBITS        00000c90 000e10 000004 00  WA  0   0  4
  [10] .data             PROGBITS        00000ca0 000e20 000454 00  WA  0   0 16
  [11] .bss              NOBITS          00001100 001274 000010 00  WA  0   0 16
  [12] .comment          PROGBITS        00000000 001274 0000fc 00      0   0  1
  [13] .debug_aranges    PROGBITS        00000000 001370 000190 00      0   0  1
  [14] .debug_pubnames   PROGBITS        00000000 001500 0001b0 00      0   0  1
  [15] .debug_info       PROGBITS        00000000 0016b0 003eb5 00      0   0  1
  [16] .debug_abbrev     PROGBITS        00000000 005565 000b79 00      0   0  1
  [17] .debug_line       PROGBITS        00000000 0060de 000954 00      0   0  1
  [18] .debug_frame      PROGBITS        00000000 006a34 0001f0 00      0   0  4
  [19] .debug_str        PROGBITS        00000000 006c24 0008a6 00      0   0  1
  [20] .debug_loc        PROGBITS        00000000 0074ca 000351 00      0   0  1
  [21] .note.spu_name    PROGBITS        00000000 007820 000020 00      0   0 16
  [22] .toe              NOBITS          00001180 001270 000010 00  WA  0   0 16
  [23] .shstrtab         STRTAB          00000000 007840 0000ec 00      0   0  1
  [24] .symtab           SYMTAB          00000000 007d3c 000670 10     25  66  4
  [25] .strtab           STRTAB          00000000 0083ac 000472 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings)
  I (info), L (link order), G (group), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)

Elf file type is EXEC (Executable file)
Entry point 0x30
There are 6 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000100 0x00000000 0x00000000 0x00b00 0x00b00 RWE 0x10
  LOAD           0x000c00 0x00000b00 0x00000b00 0x00080 0x00080 R E 0x10
  LOAD           0x000c80 0x00000b00 0x00000b80 0x00080 0x00080 R E 0x10
  LOAD           0x000d00 0x00000b80 0x00000c00 0x00580 0x00590 RWE 0x10
  LOAD           0x001270 0x00001180 0x00001200 0x00000 0x00010 RW  0x10
  NOTE           0x007820 0x00000000 0x00000000 0x00020 0x00020 R   0x10

 Section to Segment mapping:
  Segment Sections...
   00     .init .text 
   01     .segment1 .segment2 
   02     .segment1 .segment2 
   03     .fini .rodata .ctors .dtors .jcr .data .bss 
   04     .toe 
   05     .note.spu_name 
------------------------------cut-here------------------------------

Notice that .segment1 and .segment2 are both at virtual address
0x00000b00, but their file offsets are different (0x000c00 and 
0x000c80, respectively). At runtime, the overlay manager arranges 
to load the correct segment into memory as the application executes.

Loader symbols that are associated with a particular program segment 
are defined together in a single file section, and the ELF file 
section header includes a "phoff" field, which, as we understand 
it, indicates the offset within the executable file of the associated
program segment.  For example:

------------------------------cut-here------------------------------
% readelf -s spu_main | egrep Num\|_test
   Num:    Value  Size Type    Bind   Vis      Ndx Name
    66: 00000b00    60 FUNC    GLOBAL DEFAULT    3 o1_test1
    70: 00000b40    60 FUNC    GLOBAL DEFAULT    4 o2_test2
    86: 00000b40    60 FUNC    GLOBAL DEFAULT    3 o1_test2
    92: 00000b00    60 FUNC    GLOBAL DEFAULT    4 o2_test1
% 
------------------------------cut-here------------------------------

We can take the section index (in the Ndx column), get the section 
file offset, and then lookup the program header. That makes it 
straightforward to determine the overlay segment to which a particular
loader symbol belongs.

However, the DWARF information does not seem to contain any indication
of the program segment to which a particular symbol belongs. Only the
link address of the symbol is provided in the DWARF information, and
when multiple segments are linked at the same address, that's ambiguous.
That seems to make it difficult or impossible to determine which overlay
a particular DWARF symbol belongs to. For example, here are some DWARF 
DIEs for the sample program functions o1_test1() and o2_test1(), both of
which are at virtual address 0x00000b00:

------------------------------cut-here------------------------------
  Compilation Unit @ offset 0x18d:
   Length:        399
   Version:       2
   Abbrev Offset: 95
   Pointer Size:  4
 &lt;0&gt;&lt;198&gt;: Abbrev Number: 1 (DW_TAG_compile_unit)
  &lt;199&gt;     DW_AT_stmt_list   : 0x5c  
  &lt;19d&gt;     DW_AT_high_pc     : 0xb7c 
  &lt;1a1&gt;     DW_AT_low_pc      : 0xb00 
  &lt;1a5&gt;     DW_AT_producer    : GNU C 4.1.1   
  &lt;1b1&gt;     DW_AT_language    : 1 (ANSI C)
  &lt;1b2&gt;     DW_AT_name        : test1.c   
  &lt;1ba&gt;     DW_AT_comp_dir    : /.../overlay/simple/spu_olay/olay1    
...
 &lt;1&gt;&lt;2c5&gt;: Abbrev Number: 4 (DW_TAG_subprogram)
  &lt;2c6&gt;     DW_AT_sibling     : &lt;2f4&gt;   
  &lt;2ca&gt;     DW_AT_external    : 1 
  &lt;2cb&gt;     DW_AT_name        : o1_test1  
  &lt;2d4&gt;     DW_AT_decl_file   : 1 
  &lt;2d5&gt;     DW_AT_decl_line   : 13    
  &lt;2d6&gt;     DW_AT_prototyped  : 1 
  &lt;2d7&gt;     DW_AT_type        : &lt;23b&gt;   
  &lt;2db&gt;     DW_AT_low_pc      : 0xb00 
  &lt;2df&gt;     DW_AT_high_pc     : 0xb3c 
  &lt;2e3&gt;     DW_AT_frame_base  : 0x20  (location list)
...
  Compilation Unit @ offset 0x320:
   Length:        399
   Version:       2
   Abbrev Offset: 200
   Pointer Size:  4
 &lt;0&gt;&lt;32b&gt;: Abbrev Number: 1 (DW_TAG_compile_unit)
  &lt;32c&gt;     DW_AT_stmt_list   : 0x9d  
  &lt;330&gt;     DW_AT_high_pc     : 0xb7c 
  &lt;334&gt;     DW_AT_low_pc      : 0xb00 
  &lt;338&gt;     DW_AT_producer    : GNU C 4.1.1   
  &lt;344&gt;     DW_AT_language    : 1 (ANSI C)
  &lt;345&gt;     DW_AT_name        : test2.c   
  &lt;34d&gt;     DW_AT_comp_dir    : /.../overlay/simple/spu_olay/olay2    
...
 &lt;1&gt;&lt;458&gt;: Abbrev Number: 4 (DW_TAG_subprogram)
  &lt;459&gt;     DW_AT_sibling     : &lt;487&gt;   
  &lt;45d&gt;     DW_AT_external    : 1 
  &lt;45e&gt;     DW_AT_name        : o2_test1  
  &lt;467&gt;     DW_AT_decl_file   : 1 
  &lt;468&gt;     DW_AT_decl_line   : 13    
  &lt;469&gt;     DW_AT_prototyped  : 1 
  &lt;46a&gt;     DW_AT_type        : &lt;3ce&gt;   
  &lt;46e&gt;     DW_AT_low_pc      : 0xb00 
  &lt;472&gt;     DW_AT_high_pc     : 0xb3c 
  &lt;476&gt;     DW_AT_frame_base  : 0x5e  (location list)
------------------------------cut-here------------------------------

The DWARF frame information has a similar problem:

------------------------------cut-here------------------------------
...
00000028 0000000c ffffffff CIE
  Version:               1
  Augmentation:          ""
  Code alignment factor: 1
  Data alignment factor: -16
  Return address column: 0

  DW_CFA_def_cfa: r1 ofs 0

00000038 00000014 00000028 FDE cie=00000028 pc=00000b00..00000b3c
  DW_CFA_advance_loc: 12 to 00000b0c
  DW_CFA_def_cfa_offset: 48
  DW_CFA_offset_extended_sf: r0 at cfa+16
  DW_CFA_nop
  DW_CFA_nop

00000050 00000014 00000028 FDE cie=00000028 pc=00000b40..00000b7c
  DW_CFA_advance_loc: 12 to 00000b4c
  DW_CFA_def_cfa_offset: 48
  DW_CFA_offset_extended_sf: r0 at cfa+16
  DW_CFA_nop
  DW_CFA_nop

00000068 0000000c ffffffff CIE
  Version:               1
  Augmentation:          ""
  Code alignment factor: 1
  Data alignment factor: -16
  Return address column: 0

  DW_CFA_def_cfa: r1 ofs 0

00000078 00000014 00000068 FDE cie=00000068 pc=00000b00..00000b3c
  DW_CFA_advance_loc: 12 to 00000b0c
  DW_CFA_def_cfa_offset: 48
  DW_CFA_offset_extended_sf: r0 at cfa+16
  DW_CFA_nop
  DW_CFA_nop

00000090 00000014 00000068 FDE cie=00000068 pc=00000b40..00000b7c
  DW_CFA_advance_loc: 12 to 00000b4c
  DW_CFA_def_cfa_offset: 48
  DW_CFA_offset_extended_sf: r0 at cfa+16
  DW_CFA_nop
  DW_CFA_nop
------------------------------cut-here------------------------------

Notice that the FDE pc ranges overlap

Without the connection between the DWARF symbol and the segment, it's 
difficult or impossible to manage breakpoints within and properly back 
trace out of code which is in an overlay. The reason for this is that
we want to be able to take each link address in the executable file
and relocate it into separate provisional address space. Each program
object (subroutine, variable, etc.) lives at a unique location in the
provisional address space, which allows a unique mapping between
provisional addresses and symbols.

What we need is a way to map an address in a DWARF DIE or CIE/FDE to
a particular section or program segment so that we can determine the
proper provisional relocation value.

Does such a mapping exist?

When I posed these question to the CBE OSS DEV mailing list
&lt;cbe-oss-dev@ozlabs.org&gt;, Ulrich Weigand was kind enough to reply,
and described GDB does for overlays:

Regarding the DIE mapping, he replied:

&gt; &gt; That's true.  What GDB does is to use the symbol name from DWARF
&gt; &gt; symbol information, and go back to the loader symbol data for the
&gt; &gt; symbol with the same name, and use the section index from there
&gt; &gt; to determine the overlay where the symbol resides.

Regarding the FDE mapping, he replied:

&gt; &gt; That's also correct.  What GDB does is to fall back to prologue
&gt; &gt; analysis for functions in overlay sections ... this is clearly not
&gt; &gt; ideal, but probably better than nothing.

Regarding the existence of a DWARF address to segment map, he replied:

&gt; &gt; Unfortunately, there is no easy way to do that right now. If special 
&gt; &gt; builds are acceptable, you might build your SPU executable with the
&gt; &gt; --emit-relocs linker option. This will keep relocation information in
&gt; &gt; the final executable.
&gt; &gt; 
&gt; &gt; You could then examine the reloc data to find out the symbols that
&gt; &gt; were used to detemine the final addresses in DWARF DIE and CIE/FDE
&gt; &gt; sections. Once you have the symbol, you can go back to the section
&gt; &gt; index that is kept in the symbol table to determine the overlay
&gt; &gt; region to use.
&gt; &gt; 
&gt; &gt; Otherwise, I'm not sure there's anything better than the tricks GDB 
&gt; &gt; does use to handle overlays ...

So, can we do better?

--

Recommend that support for overlays in DWARF be explored as a vendor
extension for later submission as a proposal.