Issue 060615.1: Address size clarification

Stephane Chauveau

Section: 7.21
Page: 155

I am implementing DWARF2/3 for some architectures with segmented
memories (in fact some single/dual harward architectures and we
use the segment to encode the memory and paging characteristics
of each address).
The part I find unclear is how to figure out the exact byte size
of each segmented address (in a tupple). This information is
required when writting in a generic DWARF2/3 parser that does
not know all the details of each architecure.

On page 155 of the DWARF3 document:

> >  4. address_size (ubyte)
> >       A 1-byte unsigned integer containing the size in bytes of an
> >       address (or the offset portion of an address for segmented
> >       addressing) on the target system.
> >  5. segment_size (ubyte)
> >       A 1-byte unsigned integer containing the size in bytes of a
> >       segment descriptor on the target system. This header is
> >       followed by a series of tuples.
> >
> > Each tuple consists of an address and a length, each in the size
> > appropriate for an address on the target architecture. The first
> > tuple following the header in each set begins at an offset that
> > is a multiple of the size of a single tuple (that is, twice the
> > size of an address).

The definition of 'address_size' and 'segment_size' seems to imply
that the byte size of an address is 'address_size+segment_size'.

However, the next paragraph defines it as 'the size appropriate
for an address on the target architecture' which would make it
architecture dependant (and potentially, higher
than address_size+segment_size).

A typical example, could be an architecture using 16bit addresses
with a 8bit segment but storing a segmented address in 32bit.

It also make sense to enforce that that the physical size of
an address in the tupple should be a power of 2. This is
likely to be required anyways because of the alignment
constraints of the .debug_aranges sections (at least in ELF).

With that requirement, the ambiguity about the size of an
address in a tupple could be lifted by stating that

 (proposal 1) Each byte size of an address in a tupple is exactly
   'address_size+segment_size' bytes. That number should be a
   power of two.

 (proposal 2) The byte size of an address in a tupple is
   'address_size+segment_size' bytes rounded up to the closest
   power of two.

It should be noted that the exact encoding of a segmented
address is not fully specified. In practice, that could
mean that the definition of '0' for the end markers is also
not fully specified. For example, there are multiple ways to represent
the flat address 0 using an intel 16bit+16bit segmented addresses.
A safer definition for the end marker could be "a tupple with all
bytes set to 0".

------------------

The intent of the DWARF standard is that the tuple quoted from page
155 should be <segment, address, length> to be consistent with
usage elsewhere in the document.  The use of "address" on a segmented
architecture indicates the offset within the segment.  There are no
places where "address" is intended to mean <segment,offset>.  

The size of the tuples of <segment, address, length> do not need to
be a power of two. 

Author:	Stephane Chauveau
Champion:
Date submitted:	2006-06-15
Date revised:
Date closed:
Type:	Clarification
Status:	Closed
DWARF Version:	4