Issue 250506.1: Improve Support for Finding vtables
| Author: | Cary Coutant |
|---|---|
| Champion: | Cary Coutant |
| Date submitted: | 2025-05-06 |
| Date revised: | 2025-11-10 |
| Date closed: | 2025-11-10 |
| Type: | Enhancement |
| Status: | Accepted |
| DWARF version: | 6 |
This is the first part of a three-part proposal.
This first part proposes a standard mechanism for locating
the virtual function table (vtable) given an object of a
polymorphic class.
The second part, 250506.2, proposes a standard mechanism
for identifying the most-derived class of an object,
given its vtable location, in order to support downcasting
of pointers while debugging.
The third part, 250506.3, proposes a fix to the
DW_AT_vtable_elem_location attribute,
which appears to be incorrectly implemented in compilers today.
Background
From Kyle Huey [Huey2025]:
Consider the following C++ program:
    #include <stdio.h>

    class Base {
    public:
      virtual const char* method1() = 0;
      void method2() {
        printf("%s\n", method1());
      }
    };

    class DerivedOne : public Base {
      virtual const char* method1() override {
        return "DerivedOne";
      }
    };

    template<typename T>
    class DerivedTwo : public Base {
    public:
      DerivedTwo(T t) : t(t) {}
    private:
      virtual const char* method1() override {
        return t;
      }
      T t;
    };

    template<typename T>
    class DerivedThree : public Base {
    public:
      DerivedThree(T t) : t(t) {}
    private:
      virtual const char* method1() override {
        return t();
      }
      T t;
    };

    int main() {
      DerivedOne d1;
      DerivedTwo d2("DerivedTwo");
      DerivedThree d3([]() { return "DerivedThree"; });
      d1.method2();
      d2.method2();
      d3.method2();
      return 0;
    }

If a debugger stops at method1, the DW_TAG_formal_parameter will tell the debugger the type of this is Base. Downcasting to the derived type is very useful for the programmer though, so both gdb and lldb contain a feature to downcast based on the vtable pointer (the "print object" and the "target.prefer-dynamic" settings in the respective debuggers).

The first part of this is straightforward. The DWARF for Base will contain a member for the vtable pointer, and that plus knowledge of how the ABI lays out vtables allows the debugger to effectively do a dynamic_cast<void*> to obtain a pointer to the most derived object. From there the vtable address is compared against the ELF symbol table to find the mangled name of the vtable symbol.

Then things begin to get hairy. The debugger demangles the mangled name that exists in the ELF symbol table, chops off the "vtable for" prefix on the demangled name, and searches for the type by name in the DWARF. If it finds the type, it adjusts the type of the value and prints it accordingly. But this text based matching doesn't always work. There are no mangled names for types so the debugger's demangling has to match the compiler's output character for character.
In the example program I've provided, when using the respective compilers, gdb can successfully downcast DerivedOne and DerivedTwo but not DerivedThree. gdb fails because gcc emits the DW_TAG_class_type with a DW_AT_name "DerivedThree<main()::<lambda()> >" but libiberty demangles the vtable symbol to "vtable for DerivedThree<main::{lambda()#1}>" and those do not match. lldb can only successfully downcast DerivedOne. lldb appears to not handle classes with template parameters correctly at all. And even if all of that were fixed, libiberty and llvm disagree about how to demangle the symbol for DerivedThree's vtable, so the two ecosystems would not be interoperable.

Perhaps these are merely quality of implementation issues and belong on the respective bug trackers, however, better representations are possible. Rustc, for example, does not rely on the ELF symbol table and demangled string matching. It emits a global variable in the DWARF whose location is the address of the vtable. That variable has a DW_AT_type pointing to a DW_TAG_class_type that describes the layout of the vtable, and that type has a DW_AT_containing_type that points to the type making use of that vtable.
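The name-based lookup described above can be sketched roughly as follows. This is only an illustration of why exact string matching is fragile, not how gdb or lldb is actually implemented. The helper names `class_name_from_vtable_symbol` and `find_type_by_name` and the class name `Wrapper` are hypothetical; only the "vtable for " prefix and the two lambda spellings come from the text.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper: strip the "vtable for " prefix from a demangled
// vtable symbol name. Returns nothing if the symbol is not a vtable.
std::optional<std::string> class_name_from_vtable_symbol(const std::string& demangled) {
    static const std::string prefix = "vtable for ";
    if (demangled.compare(0, prefix.size(), prefix) != 0) {
        return std::nullopt;
    }
    return demangled.substr(prefix.size());
}

// Hypothetical lookup: the candidate name must equal some DW_AT_name
// string character for character.
bool find_type_by_name(const std::vector<std::string>& dwarf_names,
                       const std::string& name) {
    for (const std::string& candidate : dwarf_names) {
        if (candidate == name) {
            return true;
        }
    }
    return false;
}
```

With a DW_AT_name of "Wrapper<main()::<lambda()> >" and a demangled symbol of "vtable for Wrapper<main::{lambda()#1}>", the stripped name is "Wrapper<main::{lambda()#1}>" and the lookup fails, even though both strings describe the same type.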
History & References
The artificial member for the vtable pointer appears to be a DWARF extension requested as far back as 2003 and implemented in 2009, in GCC PR 11208.
But I can't find any relevant discussion on the DWARF mailing lists, until a question [Louzon2022] arose about that very member in 2022.
Given the apparent need for this information in the DWARF info, we should have addressed it in DWARF by now. I suspect the DWARF committee's position was (or would have been) that the ABI tells you how to find the vtable so it doesn't need to be explicitly recorded in the DWARF info. But if both GCC and LLVM have decided it's useful enough (and there's discussion about that point in the original PR that PR 11208 spun off from), then we should discuss it. Otherwise, we risk having different toolchains adopt different solutions. (GCC and LLVM appear to have avoided that through careful consideration of what the other project was doing.) The argument in PR 11208 is that it's legal in DWARF to do this, so no new DWARF feature was requested.
The request in PR 11208 was for three things:
1) I'd like to be able to locate the vtable pointer in the class structure so that the debugger knows that the hole in the apparent layout is not padding.
2) I'd like to know the type of the target of the vtable pointer, so that if the user asks to see it they see something sane.
3) I'd like to be able to find a specific virtual functions entry in the vtable, however I believe that this information will be best expressed as a property of the function, not directly of the class or vtable. DWARF3 has the DW_AT_vtable_elem_location attribute for precisely this information. gcc should generate that too.

Quoting the DWARF spec again :- An entry for a virtual function also has a DW_AT_vtable_elem_location attribute whose value contains a location description yielding the address of the slot for the function within the virtual function table for the enclosing class. The address of an object of the enclosing type is pushed onto the expression stack before the location description is evaluated.
Request #1 was satisfied in GCC by creating an artificial member whose
DW_AT_data_member_location is the offset of the vtable pointer.
LLVM used a similar approach, but Concurrent [Allen2025]
created a new DW_AT_vtable_location attribute to compute the location
of the vtable, given the address of an object of the class.
Request #3 was resolved by implementing the
DW_AT_vtable_elem_location attribute (but see "Problems" in 250506.3).
Request #2 was not resolved in GCC, but has been more
recently addressed by clang PR 130255
and Rust Issue 125126, by creating
an artificial global variable whose location is the vtable
and which is tied back to the class definition. The mechanisms
used by the two compilers differ slightly.
Concurrent [Allen2025] added a new
DW_AT_type_vtable_location attribute to provide the
address of the vtable for a given class type.
Problem
The first request in PR 7081
(later split off into PR 11208)
was for a standardized way in the DWARF specification
to find the vtable. The current approach of using
an artificial member with a certain DW_AT_name is not
standardized across compilers.
Proposal
To find the vtable for an object of a class or structure type,
we add a new DW_AT_vtable_location attribute to the structure or class
type DIE.
In Section 2.2, "Attribute Types," add a row to Table 2.2:
| Attribute | Usage |
|---|---|
| DW_AT_vtable_location | Location of the virtual table |
In Section 5.7.1, "Structure, Union and Class Type Entries", add the following paragraphs:
A structure, union, or class type may have a DW_AT_vtable_location attribute, whose value contains a location expression that evaluates to the location of the virtual table (vtable) for an object of that class. The location of an object of that type is implicitly pushed onto the DWARF stack prior to evaluating the location expression.

If a class type has more than one virtual table, the DW_AT_vtable_location attribute provides the location of the virtual table to which the DW_AT_vtable_elem_index attribute of its member function entries refers (see Section 5.7.8, "Member Function Entries").

If a class has no DW_AT_vtable_location attribute, it inherits the virtual table location from the first base class whose DW_AT_data_member_location attribute is 0, and which has a virtual table (which also could be inherited from its base class). The correct vtable location for a class can be found with a pre-order traversal of the inheritance hierarchy with that condition.

If no DW_AT_vtable_location attribute can be found in the inheritance hierarchy, the location of the virtual table (if one is required for the type) is determined by the ABI.

In many implementations, the vtable pointer is at offset 0, and the DW_AT_vtable_location expression can be a single operator: DW_OP_deref.
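As a rough illustration of the lookup rule above, a consumer might search the type and its offset-0 base classes in pre-order, then evaluate the expression with the object address already on the stack. The sketch below is a toy model under stated assumptions, not a real DWARF reader: `TypeDie`, `BaseRef`, `VtableExpr`, `Memory`, `find_vtable_location` and `evaluate_vtable_location` are all hypothetical names, and only the single-operator DW_OP_deref form is modeled.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <vector>

struct TypeDie;

// Hypothetical model of a DW_TAG_inheritance child of a type DIE.
struct BaseRef {
    const TypeDie* type;
    std::uint64_t data_member_location;  // DW_AT_data_member_location
};

// Assumption: only the one-operator expression DW_OP_deref is modeled.
enum class VtableExpr { Deref };

// Hypothetical model of a class or structure type DIE.
struct TypeDie {
    std::optional<VtableExpr> vtable_location;  // DW_AT_vtable_location, if present
    std::vector<BaseRef> bases;                 // in declaration order
};

// Pre-order search: the type itself, then each base class at offset 0.
// A base at offset 0 shares the object's address, so no adjustment is needed.
std::optional<VtableExpr> find_vtable_location(const TypeDie& die) {
    if (die.vtable_location) {
        return die.vtable_location;
    }
    for (const BaseRef& base : die.bases) {
        if (base.data_member_location != 0) {
            continue;
        }
        if (auto found = find_vtable_location(*base.type)) {
            return found;
        }
    }
    return std::nullopt;  // caller falls back to the ABI
}

// Toy target memory: address -> pointer-sized word stored there.
using Memory = std::map<std::uint64_t, std::uint64_t>;

// The object address is implicitly pushed, then the expression runs.
std::optional<std::uint64_t> evaluate_vtable_location(VtableExpr expr,
                                                      std::uint64_t object_address,
                                                      const Memory& memory) {
    std::vector<std::uint64_t> stack{object_address};
    switch (expr) {
    case VtableExpr::Deref: {
        auto it = memory.find(stack.back());
        if (it == memory.end()) {
            return std::nullopt;  // unreadable address in the toy memory
        }
        stack.back() = it->second;  // replace address with the word it points to
        break;
    }
    }
    return stack.back();
}
```

For a derived type with no attribute of its own, whose polymorphic base sits at offset 0 and carries the attribute, `find_vtable_location` returns the base's expression, and evaluating it against the object address yields the vtable pointer stored at the start of the object.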
In Section 7.5.4, "Attribute Encodings," add a row to Table 7.5:
| Attribute name | Value | Classes |
|---|---|---|
| DW_AT_vtable_location | TBD | exprloc |
In Appendix A, "Attribute by Tag," add DW_AT_vtable_location
to the following tags: DW_TAG_structure_type, DW_TAG_class_type.
2025-09-26: Revised.
Added note covering multiple virtual tables;
Added rule for derived classes and missing attribute;
Removed DW_OP_push_vtable_location.
2025-10-07: Revised wording about multiple vtables; added non-normative text about typical location expression.
2025-10-23: Revised wording about primary base class.
2025-11-06: Revised.
Changed "first non-virtual base class" to
"base class with DW_AT_data_member_location of 0".
2025-11-10: Accepted with new wording.