Issue 250506.1: Improve Support for Finding vtables
Author: | Cary Coutant |
---|---|
Champion: | Cary Coutant |
Date submitted: | 2025-05-06 |
Date revised: | |
Date closed: | |
Type: | Enhancement |
Status: | Open |
DWARF version: | 6 |
Depends on Issue 230524.1.
This is the first part of a three-part proposal.
This first part proposes a standard mechanism for locating
the virtual function table (vtable) given an object of a
polymorphic class.
The second part, 250506.2, proposes a standard mechanism
for identifying the most-derived class of an object,
given its vtable location, in order to support downcasting
of pointers while debugging.
The third part, 250506.3, proposes a fix to the
DW_AT_vtable_elem_location attribute,
which appears to be incorrectly implemented in compilers today.
Background
From Kyle Huey [Huey2025]:
Consider the following C++ program:
    #include <stdio.h>

    class Base {
    public:
      virtual const char* method1() = 0;
      void method2() {
        printf("%s\n", method1());
      }
    };

    class DerivedOne : public Base {
      virtual const char* method1() override {
        return "DerivedOne";
      }
    };

    template<typename T>
    class DerivedTwo : public Base {
    public:
      DerivedTwo(T t) : t(t) {}
    private:
      virtual const char* method1() override {
        return t;
      }
      T t;
    };

    template<typename T>
    class DerivedThree : public Base {
    public:
      DerivedThree(T t) : t(t) {}
    private:
      virtual const char* method1() override {
        return t();
      }
      T t;
    };

    int main() {
      DerivedOne d1;
      DerivedTwo d2("DerivedTwo");
      DerivedThree d3([]() { return "DerivedThree"; });
      d1.method2();
      d2.method2();
      d3.method2();
      return 0;
    }
If a debugger stops at method1, the DW_TAG_formal_parameter will tell the debugger the type of this is Base. Downcasting to the derived type is very useful for the programmer though, so both gdb and lldb contain a feature to downcast based on the vtable pointer (the "print object" and the "target.prefer-dynamic" settings in the respective debuggers).

The first part of this is straightforward. The DWARF for Base will contain a member for the vtable pointer, and that plus knowledge of how the ABI lays out vtables allows the debugger to effectively do a dynamic_cast<void*> to obtain a pointer to the most derived object. From there the vtable address is compared against the ELF symbol table to find the mangled name of the vtable symbol.

Then things begin to get hairy. The debugger demangles the mangled name that exists in the ELF symbol table, chops off the "vtable for" prefix on the demangled name, and searches for the type by name in the DWARF. If it finds the type, it adjusts the type of the value and prints it accordingly. But this text based matching doesn't always work. There are no mangled names for types so the debugger's demangling has to match the compiler's output character for character.

In the example program I've provided, when using the respective compilers, gdb can successfully downcast DerivedOne and DerivedThree but not DerivedTwo. gdb fails because gcc emits the DW_TAG_class_type with a DW_AT_name "DerivedTwo<main()::<lambda()> >" but libiberty demangles the vtable symbol to "vtable for DerivedTwo<main::{lambda()#1}>" and those do not match. lldb can only successfully downcast DerivedOne. lldb appears to not handle classes with template parameters correctly at all. And even if all of that were fixed, libiberty and llvm disagree about how to demangle the symbol for DerivedTwo's vtable, so the two ecosystems would not be interoperable.

Perhaps these are merely quality of implementation issues and belong on the respective bug trackers, however, better representations are possible. Rustc, for example, does not rely on the ELF symbol table and demangled string matching. It emits a global variable in the DWARF whose location is the address of the vtable. That variable has a DW_AT_type pointing to a DW_TAG_class_type that describes the layout of the vtable, and that type has a DW_AT_containing_type that points to the type making use of that vtable.
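
The fragility of the text-based matching described above can be illustrated with a small sketch. The helper names class_name_from_vtable_symbol and names_match are hypothetical and are not taken from gdb or lldb; the sketch only assumes that the lookup strips the "vtable for " prefix and then requires an exact string match against the DW_AT_name.

```cpp
#include <optional>
#include <string_view>

// Hypothetical helper: strip the "vtable for " prefix that a demangler puts
// in front of the class name. Returns nullopt if the prefix is absent.
std::optional<std::string_view>
class_name_from_vtable_symbol(std::string_view demangled) {
    constexpr std::string_view prefix = "vtable for ";
    if (demangled.substr(0, prefix.size()) != prefix) {
        return std::nullopt;
    }
    return demangled.substr(prefix.size());
}

// Hypothetical helper: the text-based lookup succeeds only if the demangled
// class name equals the DW_AT_name character for character.
bool names_match(std::string_view demangled_symbol, std::string_view dw_at_name) {
    const auto name = class_name_from_vtable_symbol(demangled_symbol);
    return name.has_value() && *name == dw_at_name;
}
```

With the two strings quoted above, names_match("vtable for DerivedTwo<main::{lambda()#1}>", "DerivedTwo<main()::<lambda()> >") returns false, while a non-template name such as "vtable for DerivedOne" against "DerivedOne" matches.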
History & References
The artificial member for the vtable pointer appears to be a DWARF extension requested as far back as 2003 and implemented in 2009, in GCC PR 11208.
But I can't find any relevant discussion on the DWARF mailing lists, until a question [Louzon2022] arose about that very member in 2022.
Given the apparent need for this information in the DWARF info, we should have addressed it in DWARF by now. I suspect the DWARF committee's position was (or would have been) that the ABI tells you how to find the vtable so it doesn't need to be explicitly recorded in the DWARF info. But if both GCC and LLVM have decided it's useful enough (and there's discussion about that point in the original PR that PR 11208 spun off from), then we should discuss it. Otherwise, we risk having different toolchains adopt different solutions. (GCC and LLVM appear to have avoided that through careful consideration of what the other project was doing.) The argument in PR 11208 is that it's legal in DWARF to do this, so no new DWARF feature was requested.
The request in PR 11208 was for three things:
1) I'd like to be able to locate the vtable pointer in the class structure so that the debugger knows that the hole in the apparent layout is not padding.
2) I'd like to know the type of the target of the vtable pointer, so that if the user asks to see it they see something sane.
3) I'd like to be able to find a specific virtual functions entry in the vtable, however I believe that this information will be best expressed as a property of the function, not directly of the class or vtable. DWARF3 has the DW_AT_vtable_elem_location attribute for precisely this information. gcc should generate that too.

Quoting the DWARF spec again :- An entry for a virtual function also has a DW_AT_vtable_elem_location attribute whose value contains a location description yielding the address of the slot for the function within the virtual function table for the enclosing class. The address of an object of the enclosing type is pushed onto the expression stack before the location description is evaluated.
Request #1 was satisfied in GCC by creating an artificial member whose
DW_AT_data_member_location is the offset of the vtable pointer.
LLVM used a similar approach, but Concurrent [Allen2025]
created a new DW_AT_vtable_location attribute to compute the location
of the vtable, given the address of an object of the class.
Request #3 was resolved by implementing the
DW_AT_vtable_elem_location attribute (but see "Problems" below).
Request #2 was not resolved in GCC, but has been addressed more
recently by clang PR 130255 and Rust Issue 125126. Both create
an artificial global variable whose location is the vtable
and which ties back to the class definition. The mechanisms
used by the two compilers differ slightly.
Concurrent [Allen2025] added a new
DW_AT_type_vtable_location attribute to provide the
address of the vtable for a given class type.
Problem
The first request in PR 7081
(and later split off into PR 11208)
was for a standardized way in the DWARF specification
to find the vtable. The current approach of using
an artificial member with a certain DW_AT_name is not
standardized across compilers.
Proposal
To find the vtable for an object of a class or structure type,
we add a new DW_AT_vtable_location attribute to the structure or class
type DIE.
In Section 2.2, "Attribute Types," add a row to Table 2.2:
Attribute | Usage |
---|---|
DW_AT_vtable_location | Location of the virtual function table |
In Section 5.7.1, "Structure, Union and Class Type Entries", add the following paragraph:
A structure, union, or class type may have a DW_AT_vtable_location attribute, whose value contains a location expression that evaluates to the location of the virtual function table (vtable) for an object of that class. The location of an object of that type is implicitly pushed onto the DWARF stack prior to evaluating the location expression.
In the Itanium C++ ABI, the vtable pointer is at offset 0, and
the DW_AT_vtable_location expression could be a single operator:
DW_OP_deref.
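
To make the evaluation rule concrete, here is a toy sketch of how a consumer might evaluate such an expression. It is not a real DWARF evaluator: the Memory map, the Op struct, and the eval_vtable_location function are hypothetical names invented for illustration, and only two existing DWARF operations (DW_OP_deref and DW_OP_plus_uconst) are modeled.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Toy model of target memory: address -> pointer-sized value stored there.
using Memory = std::map<std::uint64_t, std::uint64_t>;

// The two operations this sketch understands, standing in for
// DW_OP_deref and DW_OP_plus_uconst.
struct Op {
    enum Kind { Deref, PlusUconst } kind;
    std::uint64_t operand = 0;  // only used by PlusUconst
};

// Evaluate a DW_AT_vtable_location-style expression. Following the proposed
// text, the object's location is pushed before the first operation runs.
// Throws std::out_of_range if a Deref reads an address missing from `mem`.
std::uint64_t eval_vtable_location(const std::vector<Op>& expr,
                                   std::uint64_t object_addr,
                                   const Memory& mem) {
    std::vector<std::uint64_t> stack{object_addr};
    for (const Op& op : expr) {
        assert(!stack.empty());
        switch (op.kind) {
        case Op::Deref:
            stack.back() = mem.at(stack.back());  // replace address with its contents
            break;
        case Op::PlusUconst:
            stack.back() += op.operand;           // add a constant offset
            break;
        }
    }
    return stack.back();
}
```

For the Itanium case, the expression is the single Deref: with an object at 0x1000 whose first word holds 0x5010, the result is 0x5010. A hypothetical ABI that kept the vtable pointer at offset 8 could be described by PlusUconst 8 followed by Deref, without any change to the consumer.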
In Section 7.5.4, "Attribute Encodings," add a row to Table 7.5:
Attribute name | Value | Classes |
---|---|---|
DW_AT_vtable_location | TBD | exprloc |
In Appendix A, "Attribute by Tag," add DW_AT_vtable_location
to the following tags: DW_TAG_structure_type, DW_TAG_class_type.
To make use of the vtable location in an expression,
we add a new DW_OP_push_vtable_location operation.
In Section 3.6, "Context Query Operations," add the following operation:
5. DW_OP_push_vtable_location
The DW_OP_push_vtable_location operation pushes the location of the virtual function table (vtable) for the current object (see Section 3.1) onto the stack. This is obtained by evaluating the DW_AT_vtable_location expression of the type of the current object. If there is no DW_AT_vtable_location attribute for the type, the vtable location depends on the ABI.
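
The fallback behavior in this description could be sketched as follows. All names here (VtableLocationExpr, TypeDie, CurrentObject, op_push_vtable_location) are hypothetical, and the already-decoded expression is modeled as a callable purely to keep the example short; a real consumer would evaluate the exprloc itself.

```cpp
#include <cstdint>
#include <functional>
#include <optional>
#include <vector>

// Hypothetical stand-in for a decoded DW_AT_vtable_location expression:
// maps an object address to the vtable location.
using VtableLocationExpr = std::function<std::uint64_t(std::uint64_t)>;

// Hypothetical, simplified view of a structure or class type DIE.
struct TypeDie {
    std::optional<VtableLocationExpr> vtable_location;  // empty if no attribute
};

// Hypothetical model of the "current object" context.
struct CurrentObject {
    const TypeDie* type;
    std::uint64_t address;
};

// Sketch of DW_OP_push_vtable_location: use the type's own expression when
// the attribute is present, otherwise fall back to an ABI-specific rule
// supplied by the consumer.
void op_push_vtable_location(std::vector<std::uint64_t>& stack,
                             const CurrentObject& obj,
                             const VtableLocationExpr& abi_default) {
    const auto& expr = obj.type->vtable_location;
    stack.push_back(expr ? (*expr)(obj.address) : abi_default(obj.address));
}
```

Passing the ABI rule in as a parameter is only one possible design; it keeps the sketch free of any assumption about which ABI the consumer is targeting.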
In Section 7.7.1, "DWARF Expressions," add a row to Table 7.9:
Operation | Code | # Operands | Notes |
---|---|---|---|
DW_OP_push_vtable_location | TBD | 0 | |