Issue 230324.1: Expression Operation Vendor Extensibility Opcode

Author: Scott Linder
Champion: Mark Wielaard
Date submitted: 2023-03-24
Date revised:
Date closed: 2024-01-08
Type: Enhancement
Status: Accepted with changes
DWARF Version: 6

Background

The vendor extension encoding space for DWARF expression operations accommodates only 32 unique operations. In practice, the lack of a central registry and a desire for backwards compatibility means vendor extensions are never retired, even when standard versions are accepted into DWARF proper. This has produced a situation where the effective encoding space available for new vendor extensions is miniscule today.

To expand this encoding space we propose defining one DWARF operation in the official encoding space which acts as a "prefix" for vendor extensions. It is followed by a ULEB128 encoded vendor extension opcode, which is then followed by the operands of the corresponding vendor extension operation.

This scheme opens up an infinite encoding space for arbitrary vendor extensions, and in practical terms is no less compact than if a fixed-size encoding were chosen, as was done for DW_LNS_extended_op. That is to say, when compared with an alternative scheme which encodes the opcode with a single unsigned byte: for the first 127 opcodes our approach is indistinguishable from the alternative scheme; for the next 128 opcodes it requires one more byte than that alternative scheme; and after 255 opcodes the alternative scheme is exhausted.

Since vendor extension operations can have arbitrary semantics, the consumer must understand them to be able to continue evaluating the expression. The only use for a size operand would be for a consumer that only needs to print the expression. Omitting a size operand makes the operation encoding more compact, and this was deemed more important than the limited printing use case. Therefore no ULEB128 size operand is present to provide the number of bytes of following operands, unlike DW_LNS_extended_op.

A centralized registry of vendor extension opcodes which are in use, maintained on the dwarfstd.org website or another suitable location, could also be implemented as a part of this proposal. This would remove the need for vendors to coordinate allocation themselves, and make it simpler to use more than one vendor extension at a time. As there is support for an infinite number of opcodes, the registration process could involve very limited review, and would therefore pose a minimal burden to the maintainer of such a registry.

Proposal

In Section 2.5.1.7, p38, add a new code at the end of the list:

3. DW_OP_user

   The DW_OP_user opcode encodes a vendor extension operation. It has at
   least one operand: a ULEB128 constant identifying a vendor extension
   operation. The remaining operands are defined by the vendor extension.
   The vendor extension opcode 0 is reserved and cannot be used by any
   vendor extension.

   <i>The DW_OP_user encoding space can be understood to supplement the
   space defined by DW_OP_lo_user and DW_OP_hi_user that is allocated by
   the standard for the same purpose.</i>

In Section 7.7.1, p226, add a new row to table 7.9:

DW_OP_user  |  TBD  |  1+  | ULEB128 vendor extension opcode, followed by
            |       |      | vendor-extension-defined operands

2024-01-08: Accepted, with change from DW_OP_user to DW_OP_user_extended.