Issue 230524.1: Location Descriptions on the DWARF Stack

Author: Tony Tye, Cary Coutant
Champion: Cary Coutant
Date submitted: 2023-05-24
Date revised:
Date closed:
Type: Enhancement
Status: Open
DWARF Version: 6

Background

The DWARF 5 concept of location descriptions (Section 2.6) limits their use to cases where the location described is final, and not subject to some further modification, with two exceptions. First, if the location description is a memory location description, it is a simple DWARF expression (Section 2.5) that can be modified by further DWARF expression operators. Second, for any form of location description, it can be offset by a fixed number of bits by using a DW_OP_bit_piece composition operator.

Where do these limitations matter?

Consider the case of a Fortran array (as shown in Appendix D, Section D.2.1) that has been partially promoted to a register or registers. The evaluation of its lower and upper bounds depends on the location of the array as provided by DW_OP_push_object_address. If the array is not entirely in memory, this operation is not able to provide the address of the object, as it can only provide a memory address. If DW_OP_push_object_address were allowed to push a composite location description on the stack, we could apply further operations to locate the bounds of the array.

Similarly, consider the case of a pointer-to-member type in C++, where the object (or part of the object) has been promoted to a register. In this case, DW_AT_use_location is not able to provide the address of the object. In optimized code, it sometimes would need to provide a register location description or a composite location description, but these cannot be pushed onto the DWARF stack. If DW_AT_use_location were allowed to push a composite location description on the stack, we could apply further operations to determine the register location of the member being referenced.

Also consider the case where a DW_OP_call* operator is used to get the location of a variable. If the variable happens to be in a register at the current PC, the call operator cannot succeed, as it cannot push anything but a memory location on the stack.

All of these cases have a common limiting factor: location descriptions cannot be pushed onto the stack and subsequently operated on to produce derived location descriptions.

Overview

This proposal removes that limitation. The DWARF stack is extended so that it can hold elements that are either (typed) values or (single) location descriptions. The operators in Section 2.6 that previously defined register and implicit location descriptions are now considered part of a DWARF expression, and are no longer "terminal": they may now appear as part of a larger expression.

Memory location descriptions and values of the generic type are considered equivalent and interchangeable.
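As an informal illustration of this model (not part of the proposed wording), the following Python sketch represents stack elements as either typed values or location descriptions, and shows the interchange between generic values and memory location descriptions. All class and function names are hypothetical. Only three simplified operations are modeled, and DW_OP_lit and DW_OP_reg take their number as an argument here rather than encoding it in the opcode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Value:
    type_name: str  # "generic" or the name of a base type
    bits: int


@dataclass(frozen=True)
class MemoryLoc:
    address: int


@dataclass(frozen=True)
class RegisterLoc:
    regnum: int


def evaluate(ops):
    """Evaluate a list of (opcode, *args) tuples; return the top of the stack."""
    stack = []
    for op, *args in ops:
        if op == "DW_OP_lit":      # pushes a value of the generic type
            stack.append(Value("generic", args[0]))
        elif op == "DW_OP_addr":   # pushes a memory location description
            stack.append(MemoryLoc(args[0]))
        elif op == "DW_OP_reg":    # pushes a register location description
            stack.append(RegisterLoc(args[0]))
        else:
            raise ValueError(f"operation not modeled in this sketch: {op}")
    return stack[-1]


def as_memory_location(elem):
    """Treat a generic value as a memory location description."""
    if isinstance(elem, MemoryLoc):
        return elem
    if isinstance(elem, Value) and elem.type_name == "generic":
        return MemoryLoc(elem.bits)
    raise TypeError("element cannot be treated as a memory location")


def as_generic_value(elem):
    """Treat a memory location description as a value of the generic type."""
    if isinstance(elem, Value) and elem.type_name == "generic":
        return elem
    if isinstance(elem, MemoryLoc):
        return Value("generic", elem.address)
    raise TypeError("element cannot be treated as a generic value")
```

In this sketch a register location description has no generic-value equivalent, so as_generic_value rejects it; only memory location descriptions convert in both directions.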

Most existing expression operators defined in Section 2.5 continue to be limited to operating on values only.

The DW_OP_deref* and DW_OP_xderef* operators are extended to operate on any location description, and provide the value contained at that location, whether it is in memory, in a register, in implicit storage, or in a composite of these.
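A minimal sketch of how a consumer might implement the extended dereference, for illustration only. It assumes a little-endian target, models memory as a bytes object and each register as its little-endian byte image, and omits composite location descriptions. The names MemoryLoc, RegisterLoc, ImplicitLoc, and deref are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryLoc:
    address: int


@dataclass(frozen=True)
class RegisterLoc:
    regnum: int


@dataclass(frozen=True)
class ImplicitLoc:
    data: bytes  # bytes carried in the expression itself


def deref(loc, memory, registers, size):
    """Read `size` bytes from `loc` and return them as an unsigned integer."""
    if isinstance(loc, MemoryLoc):
        raw = memory[loc.address:loc.address + size]
    elif isinstance(loc, RegisterLoc):
        raw = registers[loc.regnum][:size]
    elif isinstance(loc, ImplicitLoc):
        raw = loc.data[:size]
    else:
        raise TypeError("location kind not modeled in this sketch")
    if len(raw) != size:
        raise ValueError("location is smaller than the requested size")
    return int.from_bytes(raw, "little")  # assumption: little-endian target
```

The same function serves all three location kinds, which is the point of the extension: the expression that follows a dereference does not need to know where the storage lives.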

The DW_OP_push_object_address operator pushes a location description, which may be a memory address (as before), or a register, implicit storage, or a composite.

The DW_AT_use_location attribute provides an expression used to compute the address of a member for a pointer-to-member type, and expects the evaluation mechanism to provide the value of the pointer and the location of the object as implicitly-pushed elements on the stack. The latter element is now allowed to be any location description.
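The following sketch illustrates the implicit pushes, under several assumptions: the pointer-to-member value is taken to be a byte offset within the object, a location description is modeled as a hypothetical (kind, where, byte_offset) tuple, and the DW_AT_use_location expression is assumed to be DW_OP_swap followed by DW_OP_offset (the operator introduced by this proposal). None of this is prescribed wording.

```python
def evaluate_use_location(expr, pointer_value, object_loc):
    """Evaluate a DW_AT_use_location expression given as a list of opcodes."""
    # The evaluation mechanism pushes the pointer-to-member value first,
    # then the location description of the containing object.
    stack = [pointer_value, object_loc]
    for op in expr:
        if op == "DW_OP_swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == "DW_OP_offset":
            displacement = stack.pop()               # top: integral value
            kind, where, byte_offset = stack.pop()   # second: location
            stack.append((kind, where, byte_offset + displacement))
        else:
            raise ValueError(f"operation not modeled in this sketch: {op}")
    return stack[-1]


# Object promoted to (hypothetical) register 3; member at byte offset 8.
member = evaluate_use_location(["DW_OP_swap", "DW_OP_offset"], 8, ("reg", 3, 0))
```

The swap is needed in this sketch because the object location is pushed last, while DW_OP_offset expects the displacement on top of the stack. The same expression works unchanged when the object is in memory.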

Two new operators, DW_OP_offset and DW_OP_bit_offset, are introduced that allow a location description on the stack to be modified by a byte or a bit offset.
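The sketch below illustrates the two operators using the semantics proposed for Section 2.6.1.3 later in this document. It is an informal model: a location is reduced to a hypothetical storage label plus an accumulated bit displacement, displacements are plain integers, and read_field shows how the displacement would be interpreted for a register on little-endian and big-endian targets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Loc:
    storage: str         # hypothetical label, e.g. "reg3"
    bit_offset: int = 0  # accumulated displacement, in bits


def op_offset(stack):
    """DW_OP_offset: pop a byte displacement and a location; push the result."""
    displacement = stack.pop()
    loc = stack.pop()
    stack.append(Loc(loc.storage, loc.bit_offset + 8 * displacement))


def op_bit_offset(stack):
    """DW_OP_bit_offset: pop a bit displacement and a location; push the result."""
    displacement = stack.pop()
    loc = stack.pop()
    stack.append(Loc(loc.storage, loc.bit_offset + displacement))


def read_field(reg_value, reg_bits, loc, width, little_endian):
    """Extract `width` bits of a register at the location's displacement."""
    if little_endian:
        # Displacement counts from the least-significant bit, moving left.
        shift = loc.bit_offset
    else:
        # Displacement counts from the most-significant bit, moving right.
        shift = reg_bits - loc.bit_offset - width
    return (reg_value >> shift) & ((1 << width) - 1)
```

With a 32-bit register holding 0x11223344, a displacement of one byte selects 0x33 on a little-endian target and 0x22 on a big-endian target. A bit displacement of 16 and a byte displacement of 2 produce the same location.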

The composite location description operators, DW_OP_piece and DW_OP_bit_piece, are redefined to build up a composite location description, which is held in the top element of the stack. A new operator, DW_OP_piece_end, is defined for use when a composite location description is complete, and there is a need to continue the expression.
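As a rough illustration of the redefined composition operators, the sketch below follows the stack behavior proposed for Section 2.6.1.2: each piece operation pops a location description, wraps it in a partial composite, and merges it with a partial composite immediately beneath it. The class and function names are hypothetical, piece sizes are in bytes, and pieces with no preceding location (undefined pieces) are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegisterLoc:
    regnum: int


@dataclass(frozen=True)
class MemoryLoc:
    address: int


@dataclass(frozen=True)
class PartialComposite:
    pieces: tuple  # tuple of (location, size_in_bytes)


@dataclass(frozen=True)
class Composite:
    pieces: tuple  # same layout, but complete


def op_piece(stack, size):
    """DW_OP_piece: fold the top location into a partial composite."""
    loc = stack.pop()
    new = PartialComposite(((loc, size),))
    if stack and isinstance(stack[-1], PartialComposite):
        previous = stack.pop()
        new = PartialComposite(previous.pieces + new.pieces)
    stack.append(new)


def op_piece_end(stack):
    """DW_OP_piece_end: convert the partial composite on top to a complete one."""
    partial = stack.pop()
    if not isinstance(partial, PartialComposite):
        raise TypeError("DW_OP_piece_end requires a partial composite")
    stack.append(Composite(partial.pieces))


# DW_OP_reg3, DW_OP_piece 4, DW_OP_addr 0x1000, DW_OP_piece 4, DW_OP_piece_end
stack = [RegisterLoc(3)]
op_piece(stack, 4)
stack.append(MemoryLoc(0x1000))
op_piece(stack, 4)
op_piece_end(stack)
```

After DW_OP_piece_end the stack holds a single complete composite, so a later DW_OP_piece would start a new composite instead of extending this one.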

The DW_OP_call* operators are now allowed to leave a location description on the stack.
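A small sketch of the resulting call behavior, for illustration only. It models DW_OP_call2, DW_OP_call4, and DW_OP_call_ref as a single hypothetical "DW_OP_call" opcode, represents the debugging information as a dictionary from made-up DIE offsets to DW_AT_location expressions, and represents stack elements as (kind, number) tuples.

```python
def evaluate(ops, dies, stack=None):
    """Evaluate (opcode, *args) tuples; callee and caller share one stack."""
    stack = [] if stack is None else stack
    for op, *args in ops:
        if op == "DW_OP_reg":      # register location description
            stack.append(("reg", args[0]))
        elif op == "DW_OP_addr":   # memory location description
            stack.append(("mem", args[0]))
        elif op == "DW_OP_call":
            # Run the called DW_AT_location expression on the same stack;
            # whatever it leaves behind is visible to the caller.
            evaluate(dies[args[0]], dies, stack)
        else:
            raise ValueError(f"operation not modeled in this sketch: {op}")
    return stack


# Hypothetical DIE offsets: one variable in a register, one in memory.
dies = {
    0x40: [("DW_OP_reg", 5)],
    0x48: [("DW_OP_addr", 0x2000)],
}
in_register = evaluate([("DW_OP_call", 0x40)], dies)[-1]
in_memory = evaluate([("DW_OP_call", 0x48)], dies)[-1]
```

The calling expression is identical in both cases; only the called expression determines whether the caller receives a register or a memory location description.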

Proposed Changes

Section 2.5 DWARF Expressions

[Page 26] Change the first paragraph as follows:

They are expressed in terms of DWARF operations that operate on a stack of elements. Each element in the stack may be either a value or a location description. Values on the stack are typed, and can represent a value of any supported base type of the target machine. Location descriptions on the stack can represent any of the single location descriptions described in Section 2.6.1.

After the second paragraph, add:

The result of a DWARF expression is the value or location description on the top of the stack after evaluating the operations.

Section 2.5.1 General Operations

[Page 26] Change the first paragraph as follows:

Each general operation represents a postfix operation on a simple stack machine. Each element of the stack has a type and a value, and can represent a value of any supported base type of the target machine.

Section 2.5.1.3 Stack Operations

[Page 30] Under DW_OP_deref, change:

The DW_OP_deref operation pops the top stack entry and treats it as a location description. The popped value must have an integral type.

Under DW_OP_deref_size and DW_OP_deref_type, make the same changes.

[Page 32] Under DW_OP_push_object_address, change:

The DW_OP_push_object_address operation pushes the location description of the object currently being evaluated...

Section 2.5.1.5 Control Flow Operations

[Page 36] Under DW_OP_call2, etc., change:

Execution of the DWARF expression of a DW_AT_location attribute may pop elements from the stack and/or push values or location descriptions onto the stack. Execution returns to the point following the call when the end of the attribute is reached. Values and location descriptions on the stack at the time of the call may be used as parameters by the called expression, and values and location descriptions left on the stack by the called expression may be used as return values by prior agreement between the calling and called expressions.

Section 2.6.1 Single Location Descriptions

[Page 39] Replace:

2. A composite location description, formed from simple location descriptions by the composition operations described in Section 2.6.1.2. Each simple location description describes the location of one piece of the object; each composition operation describes which part of the object is located there. Each simple location description that is a DWARF expression is evaluated independently of any others.

Section 2.6.1.1.2 Memory Location Descriptions

[Page 39] Change:

A memory location description consists of a non-empty DWARF expression (see Section 2.5 on page 26), whose result is the address of a piece or all of an object or other entity in memory.

Add:

A value of integral type may be treated as a memory location description. A memory location description may also be treated as a value of the generic type.

The following DWARF operations can also be used to specify a memory location:

1. DW_OP_addr... [move here from Section 2.5.1.1]

2. DW_OP_addrx... [move here from Section 2.5.1.1]

3. DW_OP_push_object_address... [move here from Section 2.5.1.3]

4. DW_OP_form_tls_address... [move here from Section 2.5.1.3]

5. DW_OP_call_frame_cfa... [move here from Section 2.5.1.3]

Section 2.6.1.1.3 Register Location Descriptions

[Page 39] Remove the non-normative text:

A register location description must stand alone as the entire description of an object or a piece of an object.

Section 2.6.1.2 Composite Location Descriptions

[Page 42] Change:

A composite location description describes an object or value which may be contained in part of a register or stored in more than one location. Each piece is described by a composition operation. There may be one or more composition operations in a single composite location description. A series of such operations describes the parts of a value in memory address order. Each composition operation pops a location description from the stack and replaces it with a new partial composite location description on the DWARF stack. If the immediately preceding element on the stack is also a partial composite location description (i.e., it is not the first piece in the series), the two partial composite location descriptions are combined into a single partial composite location description.

Add:

3. DW_OP_piece_end

The DW_OP_piece_end operation terminates a composition operation by converting the partial composite location description on top of the stack to a complete composite location description. This operation is necessary only if the location description is not at the end of the DWARF expression; otherwise, the conversion is implicit.

Section 2.6.1.3 Location Description Operations [new section]

Add:

In addition to the composite operations, location descriptions may be modified by the following operations:

1. DW_OP_offset

DW_OP_offset pops two stack entries. The first (top of stack) must be an integral type value, which represents a byte displacement. The second must be a location description. It forms a new location description that describes a location at the given byte displacement from the original location. For a register location, the byte displacement is relative to the least-significant byte on a little-endian architecture, and to the most-significant byte on a big-endian architecture.

2. DW_OP_bit_offset

DW_OP_bit_offset pops two stack entries. The first (top of stack) must be an integral type value, which represents a bit displacement. The second must be a location description. It forms a new location description that describes a location at the given bit displacement from the original location.

On a little-endian architecture, the bit offset is relative to the least-significant bit of the location, and indicates that the new location is offset to the left by the given number of bits.

On a big-endian architecture, the bit offset is relative to the most-significant bit of the location, and indicates that the new location is offset to the right by the given number of bits.

A bit offset of n*8 is equivalent to a byte offset of n.

Section 5.7.6 Data Member Entries

[Page 118] In the description for DW_AT_data_member_location, change:

2. Otherwise, the value must be a location description. In this case, the beginning of the containing entity must be byte aligned. The location description of the containing entity is pushed on the DWARF stack before the location description is evaluated; the result of the evaluation is a location description for the member entry.

Section 5.14 Pointer to Member Type Entries

[Page 131] Change:

The DW_AT_use_location description is used in conjunction with the location descriptions for a particular object of the given pointer to member type and for a particular structure or class instance. The DW_AT_use_location attribute expects two values to be pushed onto the DWARF expression stack before the DW_AT_use_location description is evaluated. The first value pushed is the value of the pointer to member object itself. The second value pushed is the location description of the entire structure or union instance containing the member whose address is being calculated.

Section 7.7.1 DWARF Expressions

[Page 226] Add to Table 7.9:

                            No. of
Operation             Code  Operands  Notes
--------------------  ----  --------  -----
DW_OP_offset          TBA      0
DW_OP_bit_offset      TBA      0

Section D.2.1 Fortran Simple Array Example

[Page 296] In Figure D.4, change DW_OP_plus to DW_OP_offset in the following places:

Section D.2.3 Fortran 2008 Assumed-rank Array Example

[Page 302] In Figure D.13, change DW_OP_plus to DW_OP_offset in the following places: