Issue 251013.1: Standardize LEB terminology
| Author: | Ron Brender |
|---|---|
| Champion: | |
| Date submitted: | 2025-10-13 |
| Date revised: | |
| Date closed: | 2025-10-27 |
| Type: | Editorial |
| Status: | Accepted with change |
| DWARF version: | 6 |
BACKGROUND
Consider the following data regarding the use of "[un]signed LEB128" vs "[S|U]LEB128" and friends in DWARF V5.
| Text | # of Occurrences |
|---|---|
| signed LEB128 | 101 |
| unsigned LEB128 | 84 |
| SLEB128 | 11 |
| ULEB128 | 56 |
| SLEB | 1 |
| ULEB | 9 |
| LEB | 5 |
These data slightly overcount the occurrences because they are drawn from the LaTeX sources for the document, which include some occurrences in command definitions and other non-visible text. But I am sure they are representative nonetheless.
As can be seen, "signed LEB128" and "unsigned LEB128" are much more common than "SLEB128" and "ULEB128", but both are common. The shortest forms, "SLEB" and "ULEB", occur rarely but do occur. (Perhaps clerical errors born of wishful thinking.) This leads to the following proposal.
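The issue does not say how these counts were produced. As a minimal sketch, assuming the LaTeX sources are available as plain text, a tally of this kind could be gathered in Python roughly as follows. The function name `count_forms` and the matching rules (whole-word matches only, so that "signed LEB128" inside "unsigned LEB128" and "SLEB" inside "SLEB128" are not counted a second time) are assumptions made for illustration and may differ from how the figures in the table were obtained.

```python
import re
from collections import Counter

# The seven forms listed in the table, longer forms first.
FORMS = [
    "unsigned LEB128",
    "signed LEB128",
    "ULEB128",
    "SLEB128",
    "ULEB",
    "SLEB",
    "LEB",
]


def _form_regex(form: str) -> str:
    # Allow any run of whitespace (including a line break) between words.
    return r"\s+".join(re.escape(word) for word in form.split())


# \b on both sides restricts matches to whole words, so "LEB" does not
# match inside "LEB128" and "signed" does not match inside "unsigned".
_PATTERN = re.compile(r"\b(?:" + "|".join(_form_regex(f) for f in FORMS) + r")\b")


def count_forms(text: str) -> Counter:
    """Count whole-word occurrences of each form in `text` (hypothetical helper)."""
    counts = Counter({form: 0 for form in FORMS})
    for match in _PATTERN.finditer(text):
        # Normalize internal whitespace so "signed\nLEB128" maps to its form.
        counts[" ".join(match.group(0).split())] += 1
    return counts
```

Run over the text of each source file and summed, this gives one count per form. Like the figures above, it would still include occurrences in command definitions and other non-visible text.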
PROPOSAL
Standardize on the forms "SLEB" and "ULEB" throughout the document. Other forms should occur only in the context of defining this notation.
Because there is only one size of encoding, allowing up to 128 bits, inclusion of "128" in these names is redundant and unnecessary. Further, the more compact form, e.g., SLEB, is preferred to the longer form, in this case signed LEB, simply because it is more compact.
It is impractical to show all of the individual changes that this proposal would induce. Following are a few selections for illustration.
- In 2.5.1.2 Register Values, the text
DW_OP_fbreg
The DW_OP_fbreg operation provides a signed LEB128 offset from the address specified by the location description in the DW_AT_frame_base attribute of the current function.
becomes
DW_OP_fbreg
The DW_OP_fbreg operation provides a SLEB offset from the address specified by the location description in the DW_AT_frame_base attribute of the current function.
- In Section 7.6 Variable Length Data, the definition of ULEB and SLEB becomes:
Integers may be encoded using “Little-Endian Base 128” (LEB128) numbers. LEB128 is a scheme for encoding integers densely that exploits the assumption that most integers are small in magnitude. This encoding is equally suitable whether the target machine architecture represents data in big-endian or little-endian byte order. It is “little-endian” only in the sense that it avoids using space to represent the “big” end of an unsigned integer, when the big end is all zeroes or sign extension bits.
Unsigned LEB128 (ULEB) numbers are encoded as follows: start at the low order end of an unsigned integer and chop it into 7-bit chunks...
Table 7.7 gives some examples of ULEB numbers...
The encoding for signed, two’s complement LEB128 (SLEB) numbers is similar, except that the criterion for discarding high order bytes is not whether they are zero, but whether they consist entirely of sign extension bits... Note that there is nothing within the LEB128 representation that indicates whether an encoded number is signed or unsigned... Table 7.7 gives some examples of ULEB numbers and Table 7.8 gives some examples of SLEB numbers.
- In Section 7.7.1, this fragment from Table 7.9 DWARF operation encodings
| Operation | Code | # Opnds | Notes |
|---|---|---|---|
| DW_OP_regx | 0x90 | 1 | ULEB128 register |
| DW_OP_fbreg | 0x91 | 1 | SLEB128 offset |
| DW_OP_bregx | 0x92 | 2 | ULEB128 register, SLEB128 offset |

becomes

| Operation | Code | # Opnds | Notes |
|---|---|---|---|
| DW_OP_regx | 0x90 | 1 | ULEB register |
| DW_OP_fbreg | 0x91 | 1 | SLEB offset |
| DW_OP_bregx | 0x92 | 2 | ULEB register, SLEB offset |
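For readers unfamiliar with the encodings being renamed, the ULEB and SLEB schemes described in the Section 7.6 excerpt above can be sketched in Python as follows. This is an illustration only, not text proposed for the standard. The function names (`encode_uleb`, `encode_sleb`, `decode_uleb`, `decode_sleb`) are hypothetical, and the sketch relies on Python's unbounded integers and arithmetic right shift.

```python
def encode_uleb(value: int) -> bytes:
    """Encode a non-negative integer as ULEB (unsigned LEB128)."""
    if value < 0:
        raise ValueError("ULEB encodes non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F          # low-order 7-bit chunk
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)         # high bit clear: last byte
            return bytes(out)


def encode_sleb(value: int) -> bytes:
    """Encode a signed integer as SLEB (signed LEB128)."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7                  # arithmetic shift keeps the sign
        sign_bit = byte & 0x40
        # Stop once everything left is pure sign extension of this byte.
        if (value == 0 and not sign_bit) or (value == -1 and sign_bit):
            out.append(byte)
            return bytes(out)
        out.append(byte | 0x80)


def decode_uleb(data: bytes) -> int:
    """Decode one ULEB number from the start of `data`."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return result
    raise ValueError("truncated ULEB number")


def decode_sleb(data: bytes) -> int:
    """Decode one SLEB number from the start of `data`."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            if byte & 0x40:          # sign bit of the last byte is set
                result -= 1 << shift  # sign-extend
            return result
    raise ValueError("truncated SLEB number")
```

For example, `encode_uleb(128)` and `encode_sleb(128)` both yield the bytes `80 01`, while `encode_sleb(-128)` yields `80 7f`. As the excerpt notes, the byte sequence alone does not say which interpretation applies: `decode_uleb(b"\x7e")` is 126 but `decode_sleb(b"\x7e")` is -2.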
2025-10-27: Accepted, with editorial change. Remove reference to Table 7.7 from the paragraph about signed LEB128.