# Issue 140425.1: Typed DWARF stack

| Author: | Jakub Jelinek |
|---|---|
| Champion: | Jakub Jelinek |
| Date submitted: | 2014-04-25 |
| Date revised: | |
| Date closed: | |
| Type: | Enhancement |
| Status: | Accepted |
| DWARF Version: | 5 |

Section 2.5, pg

Typed DWARF stack
=================

Overview
--------

The addition of the DW_OP_stack_value operation in DWARF4 allowed the debugging information to describe values of variables that are partially or completely optimized away, either in some ranges or everywhere. But the values are still computed on the DWARF stack, which has only integral types with the size of the target machine's address, so it is still very hard to represent non-integral variables or variables with integral types larger than the size of an address.

Consider a target with 32-bit addresses and 64-bit long long:

    void foo (unsigned long long x, unsigned long long y, double z,
              volatile int *p)
    {
      unsigned long long a = x + y;
      double b = z * 2.5;
      (*p)++;
    }

On a 64-bit architecture, a could be described as e.g. (x86_64):

    DW_OP_breg5 0 DW_OP_breg4 0 DW_OP_plus DW_OP_stack_value

because the addresses are 64-bit, but when they are 32-bit, one would have to resort to something like (i?86 little endian):

    DW_OP_fbreg 0 DW_OP_deref DW_OP_fbreg 8 DW_OP_deref DW_OP_plus \
    DW_OP_stack_value DW_OP_piece 4 \
    DW_OP_fbreg 4 DW_OP_deref DW_OP_fbreg 12 DW_OP_deref DW_OP_plus \
    DW_OP_fbreg 0 DW_OP_deref DW_OP_plus_uconst 0x80000000 DW_OP_dup \
    DW_OP_fbreg 8 DW_OP_deref DW_OP_plus DW_OP_gt DW_OP_plus \
    DW_OP_stack_value DW_OP_piece 4

(i.e. the low 32 bits are (unsigned) x + (unsigned) y, the high 32 bits are (unsigned) (x >> 32) + (unsigned) (y >> 32) + ((unsigned) x > (unsigned) x + (unsigned) y)). For multiplication of values twice the address size the expression would already be much larger, and for IEEE double, while in theory implementable, it would basically require writing a software IEEE floating point emulation library in DWARF expressions that one would use with DW_OP_call*.
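To make the two-piece expression above easier to follow, here is a minimal C sketch of the same arithmetic. It is only an illustration, not part of the proposal: the names `add64_in_pieces` and `as_signed32` are hypothetical. The carry is obtained the same way the expression does it, by biasing both values with 0x80000000 so that a signed comparison (as DW_OP_gt performs on a 32-bit stack) behaves like an unsigned one.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a signed value, standing in for the
   signed view DW_OP_gt takes of a 32-bit stack entry. */
static int32_t as_signed32(uint32_t bits)
{
    int32_t value;
    memcpy(&value, &bits, sizeof value);
    return value;
}

/* Hypothetical helper: add two 64-bit values using only 32-bit wrapping
   arithmetic and one signed comparison, mirroring the two DW_OP_piece
   halves of the expression. */
uint64_t add64_in_pieces(uint64_t x, uint64_t y)
{
    uint32_t x_lo = (uint32_t)x, x_hi = (uint32_t)(x >> 32);
    uint32_t y_lo = (uint32_t)y, y_hi = (uint32_t)(y >> 32);

    /* First piece: low halves added modulo 2^32. */
    uint32_t lo = (uint32_t)(x_lo + y_lo);

    /* Second piece: high halves added modulo 2^32 ... */
    uint32_t hi = (uint32_t)(x_hi + y_hi);

    /* ... plus the carry out of the low half. Adding 0x80000000 to both
       sides turns the unsigned test "x_lo > x_lo + y_lo" into a signed one. */
    uint32_t biased_x = (uint32_t)(x_lo + 0x80000000u);
    uint32_t biased_sum = (uint32_t)(biased_x + y_lo);
    uint32_t carry = as_signed32(biased_x) > as_signed32(biased_sum);
    hi = (uint32_t)(hi + carry);

    uint64_t result = ((uint64_t)hi << 32) | lo;
    assert(result == x + y); /* sanity check against native 64-bit addition */
    return result;
}
```

Even for a plain addition the address-sized stack forces this carry bookkeeping, which is the overhead the typed stack is meant to remove.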
With the following proposal, a can use:

    DW_OP_fbreg 0 DW_OP_deref_type 8 <unsigned long long> \
    DW_OP_fbreg 8 DW_OP_deref_type 8 <unsigned long long> \
    DW_OP_plus DW_OP_stack_value

and for b:

    DW_OP_fbreg 16 DW_OP_deref_type 8 <double> \
    DW_OP_const_type <double> 8 <0x4004000000000000ULL> \
    DW_OP_mul DW_OP_stack_value

The extension (with DW_OP_GNU_ rather than DW_OP_ and without DW_OP_xderef_type) has been implemented for 3 years now in GCC/GDB.

The DWARF stack is enhanced so that, instead of a stack element being just an address-sized integer, the stack element is a pair of a type identifier and a union which contains the address-sized integer and various other integral and floating point (perhaps _Decimal/fixed point etc.) types. So, something along the lines of:

    struct DWARF_stack_element
    {
      int type_id;
      union
      {
        intptr_t address;
        long long llong;
        unsigned long long ullong;
        __int128_t i128t;
        __uint128_t u128t;
        float flt;
        double dbl;
        long double ldbl;
        ...
      };
    };

For compatibility reasons, because most of the values manipulated on the DWARF stack are still address-sized integers, and in order to give more freedom to debug information consumers, the extension is proposed as follows. Most operations (other than the newly added ones) that do not pop anything from the stack and just push a new value push it with a special address type; DW_OP_{convert,reinterpret} refer to this type as the type with offset 0. Operations that consume stack elements as operands are typically overloaded on the operand type; if they take more than one operand, they require all operands to have the same type, and they usually push a result value of that same type. Debug info consumers must handle, at a minimum (as before), the special address type, plus whatever other base types they choose to support.
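As a rough illustration of the typed stack element and of an operation overloaded on the operand type, the following C sketch evaluates the tail of the expression for b (a typed constant followed by DW_OP_mul). Everything here is an assumption made for the example: the type identifiers, the helper names, and the use of the host's `double` (assumed to be IEEE 754 binary64) are not taken from the proposal or from any real consumer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical type identifiers; 0 stands for the special address type. */
enum { TYPE_ADDRESS = 0, TYPE_U64 = 1, TYPE_F64 = 2 };

struct stack_element {
    int type_id;
    union {
        uintptr_t address;
        uint64_t u64;
        double f64;
    } value;
};

struct stack_element make_address(uintptr_t a)
{
    struct stack_element e = { TYPE_ADDRESS, { .address = a } };
    return e;
}

struct stack_element make_f64(double d)
{
    struct stack_element e = { TYPE_F64, { .f64 = d } };
    return e;
}

/* DW_OP_const_type-like push of a double given as its 8-byte bit pattern
   (assumes the host double is IEEE 754 binary64). */
struct stack_element const_f64_from_bits(uint64_t bits)
{
    struct stack_element e = { TYPE_F64, { .f64 = 0.0 } };
    memcpy(&e.value.f64, &bits, sizeof e.value.f64);
    return e;
}

/* DW_OP_mul-like operation: both operands must have the same type and the
   result keeps that type. Returns 0 on success, -1 when the expression
   should be abandoned (type mismatch or unsupported type). */
int typed_mul(const struct stack_element *a, const struct stack_element *b,
              struct stack_element *result)
{
    assert(a != NULL && b != NULL && result != NULL);
    if (a->type_id != b->type_id)
        return -1;
    result->type_id = a->type_id;
    switch (a->type_id) {
    case TYPE_ADDRESS:
        result->value.address = a->value.address * b->value.address;
        return 0;
    case TYPE_U64:
        result->value.u64 = a->value.u64 * b->value.u64;
        return 0;
    case TYPE_F64:
        result->value.f64 = a->value.f64 * b->value.f64;
        return 0;
    default:
        return -1;
    }
}
```

With these helpers, multiplying `make_f64(4.0)` by `const_f64_from_bits(0x4004000000000000ULL)` yields a TYPE_F64 element, while mixing an address element with a double is rejected rather than silently coerced.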
When debug info consumers see any of the new DW_OP_{{regval,{,x}deref,const}_type,convert,reinterpret} operations referring to a base type that they don't support, they should just give up on trying to evaluate the whole expression and suggest to the user that the value is optimized away/can't be computed. So, debug info consumers can as well choose not to support any of the typed DWARF stack at all, as long as they are able to just parse the 6 new operations; anything above that is a QoI issue.

Proposed changes to DWARF
-------------------------

2.5.1

Change:

Each element of the stack is the size of an address on the target machine.

to:

Each element of the stack has a type and a value, and can represent a value of any supported base type of the target machine. Instead of a base type, elements can have a special address type, which is an integral type that has the size of an address on the target machine and unspecified signedness.

Add after the paragraph:

*While the abstract definition of the stack calls for variable-size entries able to hold any supported base type, in practice it is expected that each element of the stack can be represented as a fixed-size element large enough to hold a value of any type supported by the DWARF consumer for that target, plus a small identifier sufficient to encode the type of that element. Support for base types other than what is required to do address arithmetic is intended only for debugging of optimized code, and the completeness of the DWARF consumer's support for the full set of base types is a quality-of-implementation issue. If a consumer encounters a DWARF expression that uses a type it does not support, it should ignore the entire expression and report its inability to provide the requested information. It should also be noted that floating-point arithmetic is highly dependent on the computational environment.
It is not the intention of this expression evaluation facility to produce identical results to those produced by the program being debugged while executing on the target machine. Floating-point computations in this stack machine will be done with precision control and rounding modes as defined by the implementation.*

2.5.1.1

Change:

If the value of a constant in one of these operations is larger than can be stored in a single stack element, the value is truncated to the element size and the low-order bits are pushed on the stack.

to:

Operations other than DW_OP_const_type push a value with the special address type, and if the value of a constant in one of these operations is larger than can be stored in a single stack element of the special address type, the value is truncated to the element size and the low-order bits are pushed on the stack.

Add at the end of section:

9. DW_OP_const_type

   The DW_OP_const_type operation takes three operands. The first operand is an unsigned LEB128 integer that represents the offset of a debugging information entry in the current compilation unit, which must be a DW_TAG_base_type entry that provides the type of the constant provided. The second operand is a 1-byte unsigned integer that represents the size n of the constant, which may not be larger than the size of the largest supported base type of the target machine. The third operand is a block of n bytes to be interpreted as a value of the referenced type.

   *While the size of the constant could be inferred from the base type definition, it is encoded explicitly into the expression so that the expression can be parsed easily without reference to the .debug_info section.*

2.5.1.2

Change subsection title to:

Register Values

Change first sentence to:

The following operations push a value onto the stack that is either the contents of a register or the result of adding the contents of a register to a given signed offset.
DW_OP_regval_type pushes just the content of the register, with the given base type, while the other operations push a value of the register with the special address type plus given signed offset.

Add as a new operation at the end of the list:

4. DW_OP_regval_type

   The DW_OP_regval_type operation takes two parameters. The first parameter is an unsigned LEB128 number, which identifies a register whose contents is to be pushed onto the stack. The second parameter is an unsigned LEB128 number that represents the offset of a debugging information entry in the current compilation unit, which must be a DW_TAG_base_type entry that provides the type of the value contained in the specified register.

2.5.1.3

Add after the first two sentences:

The DW_OP_dup, DW_OP_drop, DW_OP_pick, DW_OP_over, DW_OP_swap and DW_OP_rot operations manipulate the elements of the stack as full pairs of type identifier and corresponding value. The DW_OP_deref, DW_OP_deref_size, DW_OP_xderef, DW_OP_xderef_size and DW_OP_form_tls_address operations require the popped values to have integral type, either special address type or some integral base type, and push a value with the special address type. DW_OP_deref_type and DW_OP_xderef_type operations have the same requirement on the popped values, but push a value with the requested type. All other operations push a value with the special address type.

After DW_OP_deref_size description add new operation (and renumber all the following operations):

9. DW_OP_deref_type

   The DW_OP_deref_type operation behaves like the DW_OP_deref_size operation: it pops the top stack entry and treats it as an address. The value retrieved from that address is pushed. In the DW_OP_deref_type operation, the size in bytes of the data retrieved from the dereferenced address is specified by the first operand. This operand is a 1-byte unsigned integral constant whose value may not be larger than the size of the largest supported base type on the target machine.
   The second operand is an unsigned LEB128 integer that represents the offset of a debugging information entry in the current compilation unit, which must be a DW_TAG_base_type entry that provides the type of the data retrieved.

After DW_OP_xderef_size description add new operation (and again renumber all the following operations):

12. DW_OP_xderef_type

    The DW_OP_xderef_type operation behaves like the DW_OP_xderef_size operation: it pops the top two stack entries, treats them as an address and an address space identifier, and pushes the value retrieved. In the DW_OP_xderef_type operation, the size in bytes of the data retrieved from the dereferenced address is specified by the first operand. This operand is a 1-byte unsigned integral constant whose value may not be larger than the size of the largest supported base type on the target machine. The second operand is an unsigned LEB128 integer that represents the offset of a debugging information entry in the current compilation unit, which must be a DW_TAG_base_type entry that provides the type of the data retrieved.

2.5.1.4

Rewrite first paragraph to:

The following provide arithmetic and logical operations. If an operation pops two values from the stack, both values should have the same type, either the same base type or both should have the special address type. The result of the operation which is pushed back should have the same type as the type of the operands. If the type of the operands is the special address type, except as otherwise specified, the arithmetic operations perform addressing arithmetic, that is, unsigned arithmetic that is performed modulo one plus the largest representable address (for example, 0x100000000 when the size of an address is 32 bits). Operations other than DW_OP_abs, DW_OP_div, DW_OP_minus, DW_OP_mul, DW_OP_neg and DW_OP_plus require integral types of the operand (either integral base type or the special address type). Operations do not cause an exception on overflow.
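The addressing arithmetic described in 2.5.1.4 can be sketched in C as follows. This is a minimal sketch under the assumption that a consumer keeps address-typed values in a 64-bit host integer and knows the target address size in bits; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Reduce a value modulo one plus the largest representable address,
   i.e. modulo 2^address_bits. */
static uint64_t addr_wrap(uint64_t value, unsigned address_bits)
{
    assert(address_bits >= 1 && address_bits <= 64);
    if (address_bits == 64)
        return value; /* uint64_t arithmetic already wraps modulo 2^64 */
    return value & ((UINT64_C(1) << address_bits) - 1);
}

/* DW_OP_plus on two operands of the special address type. */
uint64_t addr_plus(uint64_t a, uint64_t b, unsigned address_bits)
{
    return addr_wrap(a + b, address_bits);
}

/* DW_OP_minus on two operands of the special address type. */
uint64_t addr_minus(uint64_t a, uint64_t b, unsigned address_bits)
{
    return addr_wrap(a - b, address_bits);
}

/* DW_OP_mul on two operands of the special address type. */
uint64_t addr_mul(uint64_t a, uint64_t b, unsigned address_bits)
{
    return addr_wrap(a * b, address_bits);
}
```

For a 32-bit target, `addr_plus(0xFFFFFFFF, 1, 32)` wraps to 0 and `addr_minus(0, 1, 32)` wraps to 0xFFFFFFFF; neither case raises an error, matching the statement that operations do not cause an exception on overflow.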
2.5.1.5

Change:

pop the top two stack values,

to:

pop the top two stack values, which should both have the same type, either same base type or both the special address type.

After:

push the constant value 1 onto the stack if the result of the operation is true or the constant value 0 if the result of the operation is false.

add:

The pushed constant value has the special address type.

Change:

Comparisons are performed as signed operations.

to:

If the operands have the special address type, the comparisons are performed as signed operations.

2.5.1.6

Renumber to 2.5.1.7, insert before that:

2.5.1.6 Type Conversions

The following operations provide for explicit type conversion.

1. DW_OP_convert

   The DW_OP_convert operation pops the top stack entry, converts it to a different type, then pushes the result. It takes one operand, which is an unsigned LEB128 integer that represents the offset of a debugging information entry in the current compilation unit, or value 0 which represents the special address type. If the operand is non-zero, the referenced entry must be a DW_TAG_base_type entry that provides the type to which the value is converted.

2. DW_OP_reinterpret

   The DW_OP_reinterpret operation pops the top stack entry, reinterprets the bits in its value as a value of a different type, then pushes the result. It takes one operand, which is an unsigned LEB128 integer that represents the offset of a debugging information entry in the current compilation unit, or value 0 which represents the special address type. If the operand is non-zero, the referenced entry must be a DW_TAG_base_type entry that provides the type to which the value is converted. The type of the operand and result type should have the same size in bits.
   *The semantics of the reinterpretation of a value are as if, in C or C++, there were two variables: one with the type of the operand, into which the popped value is stored, and another with the type of the result, into which the first is copied using memcpy. The pushed result is the value of the second variable after the memcpy.*

-- 2014-07-15: Accepted.
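The difference between DW_OP_convert and DW_OP_reinterpret in 2.5.1.6 can be illustrated with a short C sketch for one pair of types. The function names are hypothetical, and the example assumes the host `double` is an 8-byte IEEE 754 binary64 value; it only shows the value-preserving versus bit-preserving behaviour and is not a consumer implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* DW_OP_convert-like behaviour: the numeric value is kept (as far as the
   target type allows) and the bit pattern changes. */
double convert_u64_to_f64(uint64_t value)
{
    return (double)value;
}

/* DW_OP_reinterpret-like behaviour: the bit pattern is kept and the value
   changes, using the memcpy model from the note on DW_OP_reinterpret. */
double reinterpret_u64_as_f64(uint64_t value)
{
    double result;
    assert(sizeof result == sizeof value); /* sizes must match in bits */
    memcpy(&result, &value, sizeof result);
    return result;
}

/* The same reinterpretation in the opposite direction. */
uint64_t reinterpret_f64_as_u64(double value)
{
    uint64_t result;
    assert(sizeof result == sizeof value);
    memcpy(&result, &value, sizeof result);
    return result;
}
```

Given the bit pattern 0x4004000000000000 used for the constant 2.5 in the overview, reinterpretation yields 2.5, whereas conversion yields the large number whose integer value is that pattern.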