Issue 240318.1: Describe prologue and epilogue ranges

Author:	Paul Robinson
Champion:
Date submitted:	2024-03-18
Date revised:	2024-04-17
Date closed:
Type:	Enhancement
Status:	Open
DWARF Version:	6

Background

Stopping Points

Ordinarily, a source-level debugger will prefer to pause execution of a program at instructions identified by the compiler as good places to do so. These include instructions flagged as is_stmt, prologue_end, or epilogue_begin. A user expects debug info such as source coordinates and variable locations to be sensible and useful at those points.

It is entirely possible for execution to pause at other instructions. There are a number of possible reasons for this.

The user has chosen to single-step instructions rather than statements.
The user has requested a breakpoint at a specific instruction that happens not to have any of the above flags.
An asynchronous exception has occurred and the debugger intercepted it.
The program has crashed and the user is looking at a core dump.

This list is not exhaustive.

Let's call the instruction where a debugger has paused execution (or the instruction where a crash was triggered) a "stopping point."

Prologue/Epilogue Ranges

In DWARF v3 through v5, a subprogram's prologue(s) and epilogue(s) are described indirectly by the line table. A prologue generally consists of all instructions from an entry point up to the first executed instruction that is flagged as prologue_end. An epilogue generally consists of all instructions from an instruction flagged as epilogue_begin to where the subprogram returns to its caller. These groups of instructions implicitly form ranges. (These ranges might be empty.)
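As a rough illustration of how a consumer might recover the implicit prologue range from existing line-table flags, consider the Python sketch below. The Row class and implicit_prologue function are hypothetical names invented for this example. The sketch approximates "first executed instruction" by address order, which is an assumption that only holds for straight-line prologues.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Row:
    """A simplified line-table row (hypothetical; only the fields used here)."""
    address: int
    prologue_end: bool = False
    epilogue_begin: bool = False


def implicit_prologue(rows: List[Row], entry: int) -> Optional[Tuple[int, int]]:
    """Return the half-open range [entry, end) implied by prologue_end.

    'end' is the address of the first row at or after 'entry' that is
    flagged prologue_end. Address order stands in for execution order,
    which is an assumption of this sketch. Returns None if no row at or
    after 'entry' carries the flag.
    """
    for row in sorted(rows, key=lambda r: r.address):
        if row.address >= entry and row.prologue_end:
            return (entry, row.address)
    return None


rows = [
    Row(0x1000),
    Row(0x1004),
    Row(0x1008, prologue_end=True),
    Row(0x1020, epilogue_begin=True),
]
prologue = implicit_prologue(rows, 0x1000)
```

If the entry-point row itself is flagged prologue_end, the sketch returns an empty range, matching the remark that these ranges might be empty. No comparable rule is given for where an epilogue ends, which is the gap the next paragraph describes.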

A subprogram might have multiple prologues if it has multiple entry points; more often, it might have multiple epilogues if it has multiple exit or return points. In particular, when there are multiple epilogues it is not necessarily clear when an epilogue ends and the next basic block (which might not be part of any epilogue) begins. (Even in the case of a single epilogue, a cold but functional basic block might be placed after the epilogue.)

Due to optimization, prologue or epilogue instructions might be mixed with other instructions, so in practice prologue and epilogue ranges might not be contiguous. DWARF does not have a way to describe these non-contiguous prologue and epilogue ranges. Compilers typically have various heuristics to pick stopping points for optimized prologue and epilogue ranges.

Single Location Descriptions

A single location description (which can be either a simple or composite location description) has the lifetime of its closest containing scope. The case we care about here is when that scope is a subprogram, and therefore the lifetime spans the entire subprogram. Pedantically, that lifetime includes prologue and epilogue ranges.

It is common practice for unoptimized code to allocate local variables to a stack frame, and use that stack location in the single location description. Because the stack frame is not necessarily in a valid state during prologue or epilogue code, in practice, debuggers typically assume that a single location description is not valid during a prologue or epilogue, although the DWARF spec does not explicitly say so (AFAIK).

Default Location Descriptions

A location list can have a "default location description" that is effectively a fallback single location description, to be used when no bounded location description in the same list applies. Prologue and epilogue considerations are the same as for single location descriptions.

Overview

A stopping point might occur during a prologue or epilogue range, which means single location descriptions for subprogram-scope objects might not be valid.

It would be good if the DWARF spec actually said single location descriptions were not necessarily valid in those ranges. This is simply codifying existing practice.
It would be good if debuggers could reliably identify prologue and epilogue ranges.

The proposal adds text that excludes prologues and epilogues from the implicit range of a subprogram-scope object, and adds a register to the line-table state machine to identify prologues and epilogues.

Like prologue_end and epilogue_begin, the new prologue_epilogue register is automatically reset after every row of the line table. At an entry point, it must be set explicitly to indicate the beginning of a prologue (if one exists). In an epilogue, it is automatically set by DW_LNS_set_epilogue_begin. This means that in a function with one contiguous prologue and one contiguous epilogue, where the prologue and the epilogue are each described by a single row, the line-number program needs only one additional opcode (at the entry point) to support prologue_epilogue.
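The register behavior described above can be sketched with a toy state machine in Python. This is a minimal illustration under stated assumptions, not a line-table decoder: the symbolic opcode names and the run function are hypothetical, only the registers relevant to this proposal are modeled, and "copy" stands in for any operation that emits a row.

```python
def run(program):
    """Execute a list of symbolic opcodes and return the emitted rows.

    Each opcode is a tuple: (name,) or (name, operand). The names are
    stand-ins chosen for this sketch, not real DWARF encodings.
    """
    state = {
        "address": 0,
        "prologue_end": False,
        "epilogue_begin": False,
        "prologue_epilogue": False,
    }
    rows = []
    for op, *args in program:
        if op == "advance_pc":
            state["address"] += args[0]
        elif op == "set_prologue_end":
            state["prologue_end"] = True
        elif op == "set_epilogue_begin":
            # Proposed: also marks the row as part of an epilogue range.
            state["epilogue_begin"] = True
            state["prologue_epilogue"] = True
        elif op == "set_prologue_epilogue":
            # Proposed new extended opcode; takes no operands.
            state["prologue_epilogue"] = True
        elif op == "copy":
            rows.append(dict(state))
            # All three flags reset after every emitted row.
            state["prologue_end"] = False
            state["epilogue_begin"] = False
            state["prologue_epilogue"] = False
        else:
            raise ValueError(f"unknown opcode: {op}")
    return rows


program = [
    ("set_prologue_epilogue",), ("copy",),                    # prologue row at 0x0
    ("advance_pc", 8), ("set_prologue_end",), ("copy",),      # first body row at 0x8
    ("advance_pc", 24), ("set_epilogue_begin",), ("copy",),   # epilogue row at 0x20
]
rows = run(program)
new_opcode_count = sum(1 for op, *_ in program if op == "set_prologue_epilogue")
```

In this example the new opcode appears once, at the entry point; the epilogue row picks up the flag from the existing set_epilogue_begin operation.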

Note: I have not tried to determine whether this minimizes size in practice. It seems plausible that each prologue or epilogue would typically occupy only one row of the line table, so resetting the flag after emitting each row should minimize the size cost.

Proposed Changes

In Section 2.6 "Location Descriptions" modify the last sentence of item 1 to read as follows (adding the prologue/epilogue exclusion).

They are sufficient for describing the location of any object as long as its lifetime is either static or the same as the lexical block that owns it, excluding any prologue or epilogue ranges, and it does not move during its lifetime.

In Section 2.6.2 "Location Lists" (p.43) add a non-normative paragraph to the bullet for "Bounded location description" after the first normative paragraph.

The location description is valid even if the address range includes addresses within a prologue or epilogue range.

In Section 2.6.2 "Location Lists" (p.43) add a sentence at the end of the bullet for "Default location description."

As with single location descriptions, the lifetime of a default location excludes any prologue or epilogue ranges.
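Taken together, the three location-description changes above suggest roughly the following consumer logic. This Python sketch is illustrative only: the function names and the tuple-based data layout are hypothetical, and it assumes the line-table rows of one sequence are sorted by address with the final row acting as the end-of-sequence marker.

```python
import bisect


def in_prologue_or_epilogue(rows, pc):
    """rows: list of (address, prologue_epilogue) sorted by address.

    The last row is treated as an end-of-sequence marker, so it covers
    no instructions. Returns False if pc lies outside the sequence.
    """
    addresses = [address for address, _ in rows]
    i = bisect.bisect_right(addresses, pc) - 1
    if i < 0 or i >= len(rows) - 1:
        return False
    return rows[i][1]


def pick_location(pc, rows, bounded, default=None):
    """Choose a location description for pc.

    bounded: list of (low, high, description) with half-open ranges.
    A bounded description applies even inside a prologue or epilogue;
    the default description does not.
    """
    for low, high, description in bounded:
        if low <= pc < high:
            return description
    if default is not None and not in_prologue_or_epilogue(rows, pc):
        return default
    return None


rows = [(0x00, True), (0x08, False), (0x20, True), (0x24, False)]
bounded = [(0x04, 0x10, "in register")]
```

A single location description would be handled like the default case with an empty bounded list: usable in the body, not within rows flagged prologue_epilogue.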

In Section 6.2.2 "State Machine Registers" add the prologue_epilogue register to Table 6.3.

Register Name	Meaning
`prologue_epilogue`	A boolean indicating that the current row describes instructions within a prologue or epilogue range.

(Keep the prologue_end and epilogue_begin registers.)

In Section 6.2.3 "Line Number Program Instructions" add an entry to Table 6.4 "Line number program initial state."

State Register	Initial State
`prologue_epilogue`	"false"

In Section 6.2.5.1 "Special Opcodes" update the list of effects to add:

Set the prologue_epilogue register to "false."

(and change "seven" to "eight" on the next line.)

In Section 6.2.5.2 "Standard Opcodes", modify opcode descriptions as follows (exact text changes not specified for simplicity):

DW_LNS_copy: add prologue_epilogue to the list of registers set to "false."
DW_LNS_set_epilogue_begin: add that it sets the prologue_epilogue register to "true."

In Section 6.2.5.3 "Extended Opcodes" add a new opcode at the end.

4. DW_LNE_set_prologue_epilogue

The DW_LNE_set_prologue_epilogue opcode takes no operands. It sets the prologue_epilogue register to "true."

In Section 7.22 "Line Number Information" add a new entry for DW_LNE_set_prologue_epilogue in Table 7.26 (probably 0x05).
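For reference, the byte encoding of the new opcode would follow the existing extended-opcode layout: a zero byte, a ULEB128 length covering the opcode and its operands, then the opcode itself. The Python sketch below assumes the value 0x05 suggested above, which is tentative; the helper names are hypothetical.

```python
def uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("ULEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


# Assumption: the proposal says "probably 0x05"; the value is not final.
DW_LNE_SET_PROLOGUE_EPILOGUE = 0x05


def encode_extended(opcode: int, operands: bytes = b"") -> bytes:
    """Encode an extended line-number opcode: 0x00, ULEB128 length, opcode, operands."""
    body = bytes([opcode]) + operands
    return b"\x00" + uleb128(len(body)) + body


encoded = encode_extended(DW_LNE_SET_PROLOGUE_EPILOGUE)
```

Because the opcode takes no operands, each use would cost three bytes under this assumed encoding.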

Dependencies

Not really a dependency, but an implication:

Assemblers will need to add syntax to the .loc directive to support setting the prologue_epilogue flag.

References

Issue 240108.1: Add prologue_begin and epilogue_end state machine registers to allow identifying multiple prologue and epilogue regions

2024-03-20: prologue_epilogue register is no longer "sticky"; clarified that prologue_epilogue register applies to a range; parenthetical remarks upgraded to non-parenthetical.

2024-04-17: Revised to exclude prologue and epilogue ranges from default locations.