Issue 240108.1: Add prologue_begin and epilogue_end state machine registers to allow identifying multiple prologue and epilogue regions

Author: Zoran Zaric
Champion:
Date submitted: 2024-01-08
Date revised:
Date closed:
Type: Enhancement
Status: Open
DWARF Version: 6

Background

Some compilers (LLVM, for example) can generate multiple prologue and epilogue regions for the same function, whereas the DWARF 5 line number state machine registers can indicate only the end of a prologue (prologue_end) or the beginning of an epilogue (epilogue_begin).

One example where a debugger can use this information is sliding breakpoints, when a breakpoint is set on a function name. The debugger can determine whether the target architecture has transiently incorrect CFI and, if so, slide the breakpoint to the instruction at the end of that prologue region.
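The sliding behaviour can be sketched as follows, using only the existing prologue_end flag. This is a minimal illustration, not a description of any particular debugger: the function name slide_breakpoint, the (address, prologue_end) row layout, and the low_pc/high_pc parameters are assumptions made for the example.

```python
from typing import Iterable, Tuple


def slide_breakpoint(rows: Iterable[Tuple[int, bool]], low_pc: int, high_pc: int) -> int:
    """Return the address a function-name breakpoint could slide to.

    rows is a hypothetical list of (address, prologue_end) pairs taken
    from the line table; [low_pc, high_pc) is the function's address range.
    """
    for address, prologue_end in sorted(rows):
        # The first prologue_end row inside the function is the slide target.
        if low_pc <= address < high_pc and prologue_end:
            return address
    # No prologue_end information: fall back to the function entry.
    return low_pc
```

With rows at 0x100, 0x104 and 0x10c, where only the last has prologue_end set, a breakpoint requested at the entry 0x100 would be placed at 0x10c.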

Debuggers can also use the epilogue_begin indicator to suspend execution for user input just prior to the exit of a function.

Another potential use case is when the user steps through a function at the instruction level. A debugger can use the prologue/epilogue information to notify the user that execution is in a region of code where the backtrace command might show incorrect information.

In GDB’s case, the information is also used for scoped (lexical or otherwise) watchpoints. For a watchpoint on a non-global variable (valid only within a given scope), GDB can check whether the currently executing instruction belongs to one of those regions. If execution is stopped in such a region, the debugger can continue execution until it reaches a more stable point. Only when execution is stopped at a stable point can the debugger check whether it is still within the scope of the watchpoint and, if not, disable the watchpoint.

Transiently incorrect CFI can cause two sorts of problems. One is incorrect CFA information, which means that unwinding the call stack may yield unreliable results. The other is CFI register rules that are no longer valid, which can cause problems ranging from the return address register not containing the actual return address to local variables of the callee function having stale location information.

When a compiler generates multiple prologue and epilogue regions for the same function, the current line number state machine registers are not sufficient to describe whether execution is stopped in a transiently incorrect CFI region.

The proposed solution is to add two more state machine registers: prologue_begin and epilogue_end. These registers would represent the beginning of a prologue region and the end of an epilogue region, respectively.
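As an illustration of how a consumer might use the four flags together, the sketch below pairs each begin row with the next matching end row and then tests whether a given address falls inside any such region. The Row type and the function names are hypothetical. The sketch also assumes that all rows come from one line number sequence, that a prologue region is the half-open range from the prologue_begin row up to (not including) the prologue_end row, and that an epilogue region runs from the epilogue_begin row through the epilogue_end row inclusive.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Row:
    """Hypothetical line table row, reduced to the fields used here."""
    address: int
    prologue_begin: bool = False
    prologue_end: bool = False
    epilogue_begin: bool = False
    epilogue_end: bool = False


def find_regions(rows: List[Row]) -> Tuple[List[Tuple[int, int]], List[Tuple[int, int]]]:
    """Pair each *_begin row with the next matching *_end row."""
    prologues, epilogues = [], []
    pro_start = epi_start = None
    for row in sorted(rows, key=lambda r: r.address):
        if row.prologue_begin:
            pro_start = row.address
        if row.prologue_end and pro_start is not None:
            prologues.append((pro_start, row.address))
            pro_start = None
        if row.epilogue_begin:
            epi_start = row.address
        if row.epilogue_end and epi_start is not None:
            epilogues.append((epi_start, row.address))
            epi_start = None
    return prologues, epilogues


def in_region(pc: int, prologues, epilogues) -> bool:
    """True if pc lies in any prologue [lo, hi) or epilogue [lo, hi] region."""
    in_prologue = any(lo <= pc < hi for lo, hi in prologues)
    in_epilogue = any(lo <= pc <= hi for lo, hi in epilogues)
    return in_prologue or in_epilogue
```

A debugger stopped at an address for which in_region returns True could, for example, warn that a backtrace may be unreliable or continue to a more stable point.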

An alternative approach is to avoid the line table and define these regions as function DIE attributes in the .debug_info section. The downside of that approach is that debuggers already process the existing prologue_end and epilogue_begin information as part of the line table, and most line table related functions do so without consulting the .debug_info section.

Proposed Changes

In Section 6.2.2 State Machine Registers, add prologue_begin and epilogue_end state machine register entries to Table 6.3:

Register Name Meaning
prologue_begin A boolean indicating that the current address is one (of possibly many addresses) of a prologue region. The register also indicates an entry point to the prologue region.
epilogue_end A boolean indicating that the current address is the last address in the contiguous epilogue region.

In the same table replace the description for epilogue_begin entry:

Register Name Meaning
epilogue_begin A boolean indicating that the current address is one (of possibly many) where execution should be suspended for a breakpoint just prior to the exit of a function.

with:

Register Name Meaning
epilogue_begin A boolean indicating that the current address is one (of possibly many addresses) where execution should be suspended for a breakpoint just prior to the exit of a function. The register also indicates an entry point to the epilogue region.

In Section 6.2.3 Line Number Program Instructions, add prologue_begin and epilogue_end initial state entries to Table 6.4:

Register Name Initial State
prologue_begin “false”
epilogue_end “false”

In Section 6.2.5.1 Special Opcodes, on page 160, replace lines 17 and 18:

5. Set the prologue_end register to “false.”

6. Set the epilogue_begin register to “false.”

with:

5. Set the prologue_begin register to “false.”

6. Set the prologue_end register to “false.”

7. Set the epilogue_begin register to “false.”

8. Set the epilogue_end register to “false.”
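For illustration only, the revised special opcode behaviour might look like the sketch below. The State type and the function name are hypothetical, and the header values used as defaults (opcode_base, line_base, line_range, min_inst_length, max_ops_per_inst) are illustrative assumptions; a real consumer reads them from the line number program header. An opcode_base of 15 is assumed here only because two standard opcodes are being added.

```python
from dataclasses import dataclass, replace


@dataclass
class State:
    """Hypothetical subset of the line number state machine registers."""
    address: int = 0
    op_index: int = 0
    line: int = 1
    basic_block: bool = False
    prologue_begin: bool = False
    prologue_end: bool = False
    epilogue_begin: bool = False
    epilogue_end: bool = False
    discriminator: int = 0


def execute_special_opcode(state, opcode, matrix, *, opcode_base=15,
                           line_base=-5, line_range=14,
                           min_inst_length=1, max_ops_per_inst=1):
    adjusted = opcode - opcode_base
    operation_advance = adjusted // line_range
    # Advance the line register, then the address and op_index.
    state.line += line_base + (adjusted % line_range)
    total = state.op_index + operation_advance
    state.address += min_inst_length * (total // max_ops_per_inst)
    state.op_index = total % max_ops_per_inst
    # Append a row holding a snapshot of the current registers.
    matrix.append(replace(state))
    # Reset the per-row flags, including the two proposed registers.
    state.basic_block = False
    state.prologue_begin = False
    state.prologue_end = False
    state.epilogue_begin = False
    state.epilogue_end = False
    # The existing discriminator reset follows the flag resets.
    state.discriminator = 0
```

The appended row keeps whatever flag values were set before the special opcode ran, while the live registers return to “false” for the next row.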

In Section 6.2.5.2 Standard Opcodes, on page 162, replace the description of opcode DW_LNS_copy:

1. DW_LNS_copy

The DW_LNS_copy opcode takes no operands. It appends a row to the matrix using the current values of the state machine registers. Then it sets the discriminator register to 0, and sets the basic_block, prologue_end and epilogue_begin registers to “false.”

with:

1. DW_LNS_copy

The DW_LNS_copy opcode takes no operands. It appends a row to the matrix using the current values of the state machine registers. Then it sets the discriminator register to 0, and sets the basic_block, prologue_begin, prologue_end, epilogue_begin, and epilogue_end registers to “false.”

In the same section, on page 163, add a new opcode DW_LNS_set_prologue_begin immediately before the definition of the DW_LNS_set_prologue_end opcode, with the following text:

10. DW_LNS_set_prologue_begin

The DW_LNS_set_prologue_begin opcode takes no operands. It sets the prologue_begin register to “true.”

[non-normative] For functions containing multiple prologue regions, where the underlying architecture doesn’t guarantee a stable CFI, it is useful for debuggers to know where each of the regions begins.

In the same section, add a new opcode DW_LNS_set_epilogue_end immediately after the definition of the DW_LNS_set_epilogue_begin opcode, with the following text:

13. DW_LNS_set_epilogue_end

The DW_LNS_set_epilogue_end opcode takes no operands. It sets the epilogue_end register to “true.”

[non-normative] For functions containing multiple epilogue regions, where the underlying architecture doesn’t guarantee a stable CFI, it is useful for debuggers to know where each of the regions ends.
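For illustration only, a consumer might handle the operand-free standard opcodes discussed in this section roughly as sketched below. The Registers type and the execute function are hypothetical. The values 0x0d and 0x0e are the ones proposed for Table 7.25 in this issue; the other opcode values are those already assigned in DWARF 5.

```python
from dataclasses import dataclass, replace

DW_LNS_copy = 0x01
DW_LNS_set_prologue_end = 0x0A
DW_LNS_set_epilogue_begin = 0x0B
DW_LNS_set_prologue_begin = 0x0D  # proposed in this issue
DW_LNS_set_epilogue_end = 0x0E    # proposed in this issue


@dataclass
class Registers:
    """Hypothetical subset of the line number state machine registers."""
    address: int = 0
    line: int = 1
    basic_block: bool = False
    prologue_begin: bool = False
    prologue_end: bool = False
    epilogue_begin: bool = False
    epilogue_end: bool = False
    discriminator: int = 0


# Each of these opcodes takes no operands and sets one register to "true".
FLAG_OPCODES = {
    DW_LNS_set_prologue_begin: "prologue_begin",
    DW_LNS_set_prologue_end: "prologue_end",
    DW_LNS_set_epilogue_begin: "epilogue_begin",
    DW_LNS_set_epilogue_end: "epilogue_end",
}

RESET_ON_COPY = ("basic_block", "prologue_begin", "prologue_end",
                 "epilogue_begin", "epilogue_end")


def execute(regs, opcode, matrix):
    if opcode == DW_LNS_copy:
        # Append a row, then clear the discriminator and per-row flags.
        matrix.append(replace(regs))
        regs.discriminator = 0
        for name in RESET_ON_COPY:
            setattr(regs, name, False)
    elif opcode in FLAG_OPCODES:
        setattr(regs, FLAG_OPCODES[opcode], True)
    else:
        raise NotImplementedError(f"opcode {opcode:#x} is outside this sketch")
```

Executing DW_LNS_set_epilogue_begin, DW_LNS_set_epilogue_end and then DW_LNS_copy in this sketch produces a single row with both epilogue flags set, which is how a one-instruction epilogue region could be described.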

In Section 7.2.2 Line Number Information, on page 236, add DW_LNS_set_prologue_begin and DW_LNS_set_epilogue_end entries to Table 7.25:

Opcode Name Value
DW_LNS_set_prologue_begin 0x0d
DW_LNS_set_epilogue_end 0x0e