Issue 020612.1: Thread-local Storage

Author: Jim Blandy
Champion: Jim Blandy
Date submitted: 2002-06-12
Date revised:
Date closed:
Type: Extension
Status: Accepted with modifications
DWARF Version: 3

I'd like to propose an addition to DWARF.  Red Hat is putting together
an initial implementation of this as a vendor extension; we'll report
back to this list once it's done, and make this into an official
proposal.

(The function of DW_OP_push_tls_address is similar to that of the
run-time function __tls_get_addr, so it was tempting to name the
operation DW_OP_get_tls_addr.  However, DW_OP_push_tls_address is more
consistent with the names of the other operations, especially
DW_OP_push_object_address.)


To section 2.4.1.3, "Stack Operations", add the following paragraphs
after DW_OP_push_object_address:

12. DW_OP_push_tls_address

The DW_OP_push_tls_address operation pushes the base address of the
current thread's thread-local storage block.  If the expression occurs
in the DWARF information for a dynamically loaded library, then
DW_OP_push_tls_address pushes the base address of that library's block
for the current thread.  If the library's storage for the current
thread has not yet been allocated, a DWARF consumer may arrange for it
to be allocated now, or report an error to the user.

<rationale italics>
Some implementations of C and C++ support a “__thread” storage
class, for variables that occupy distinct memory in distinct threads.
For example, the definition:

       __thread int foo;

declares an integer variable named “foo” which has a separate value
and address in each thread, much as a variable declared “auto” has a
separate value and address in each invocation of the function
containing its declaration.  Creating a new thread creates a new
instance of “foo”, and when the thread exits, the storage for
“foo” is freed.
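
The per-thread behaviour described above can be observed directly.  The
following is a minimal sketch, assuming a compiler that accepts
“__thread” and a POSIX threads library; the helper names
record_foo_address and foo_is_per_thread are invented for this
illustration and are not part of the proposal.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

__thread int foo;  /* each thread gets its own instance of foo */

/* Thread body: modify this thread's foo and report where it lives.
   The address is passed back as an integer so that it can still be
   compared after the thread (and its copy of foo) has gone away. */
static void *record_foo_address(void *arg)
{
    uintptr_t *out = arg;

    foo = 42;                  /* changes only this thread's copy */
    *out = (uintptr_t)&foo;
    return NULL;
}

/* Returns 1 if a second thread saw foo at a different address than
   the calling thread, and left the caller's value untouched. */
int foo_is_per_thread(void)
{
    pthread_t worker;
    uintptr_t worker_addr = 0;

    foo = 7;
    if (pthread_create(&worker, NULL, record_foo_address, &worker_addr) != 0)
        return 0;
    if (pthread_join(worker, NULL) != 0)
        return 0;

    assert(worker_addr != 0);  /* the worker always stores an address */
    return worker_addr != (uintptr_t)&foo && foo == 7;
}
```

On most systems this needs to be built with the compiler's thread
option (for example -pthread).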

Typically, a program includes an “initialization image” --- a block
of memory containing the initial values for any thread-local variables
it defines.  When the program creates a new thread, the run-time
system allocates a fresh block of memory for those thread-local
variables, and copies the initialization image into it to give the
variables their initialized values.
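
The copy step can be sketched in a few lines.  This is an illustration
only: struct tls_image and tls_block_create are hypothetical names, and
a real run-time system would also handle alignment, zero-filled
(uninitialized) variables, and registration of the block with the
thread.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical description of a program's initialization image. */
struct tls_image {
    const void *init_data;  /* initial values of the thread-local variables */
    size_t      size;       /* size of the image in bytes */
};

/* Allocate a fresh block for a new thread and copy the initialization
   image into it.  Returns NULL if the allocation fails.  The caller
   frees the block when the thread exits. */
void *tls_block_create(const struct tls_image *image)
{
    void *block;

    assert(image != NULL && image->init_data != NULL && image->size > 0);

    block = malloc(image->size);
    if (block != NULL)
        memcpy(block, image->init_data, image->size);
    return block;
}
```

Each call yields an independent block, so a store through one thread's
block affects neither another thread's block nor the image itself.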

A dynamically loaded library may also define thread-local variables.
Some implementations delay allocating memory for such variables until
the thread actually refers to them for the first time.  This avoids
the overhead of allocating and initializing the library's thread-local
storage for all the threads present in a program when the library is
loaded, even though only a few threads might actually use the library.

However, when an implementation allocates thread-local storage on
demand, it becomes hard to describe the location of a thread-local
variable using ordinary DWARF expressions: referencing the storage may
entail allocating memory, copying an initialization image into place,
registering it with the thread, and so on.  A dedicated operation like
DW_OP_push_tls_address delegates the request to the debugger, which is
presumably already familiar with the program's ABI and thread system,
and can handle the request appropriately.
</rationale italics>

---------------------
added 05/11/2005:

PROPOSED TEXT:

[To section 2.4.1.3, "Stack Operations", add the following paragraphs
after DW_OP_push_object_address]

12. DW_OP_push_tls_address

The DW_OP_push_tls_address operation pops a value from the stack, and
treats it as an offset into a thread-local storage block.  If the
DWARF expression containing the DW_OP_push_tls_address operation
belongs to the main executable's DWARF info, the main executable's
thread-local storage block is used; if the expression belongs to a
shared library's DWARF info, then that shared library's thread-local
storage block is used.  The operation then pushes the address of the
byte at the given offset in the current thread's instance of the
thread-local storage block.

If the chosen thread-local storage block has not yet been allocated, a
DWARF consumer may arrange for it to be allocated immediately, or
report an error to the user.
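
The two choices open to a consumer when the block is not yet allocated
can be sketched as follows.  This is a simplified model, not a
description of any existing debugger: the types and the function
tls_resolve are hypothetical, and calloc stands in for whatever
mechanism a real consumer would use to have the block allocated in the
program being debugged.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

enum tls_status { TLS_OK, TLS_NOT_ALLOCATED, TLS_NO_MEMORY };
enum tls_policy { TLS_REPORT_ERROR, TLS_ALLOCATE_NOW };

/* One thread's instance of one module's thread-local storage block. */
struct tls_instance {
    unsigned char *base;  /* NULL until the block has been allocated */
    size_t         size;  /* size of the block in bytes */
};

/* Compute the address of the byte at `offset` in the given instance.
   If the block is not allocated, either report that fact or allocate
   a zero-filled block immediately, depending on `policy`. */
enum tls_status tls_resolve(struct tls_instance *inst, size_t offset,
                            enum tls_policy policy, uintptr_t *address_out)
{
    assert(inst != NULL && address_out != NULL);
    assert(offset < inst->size);

    if (inst->base == NULL) {
        if (policy == TLS_REPORT_ERROR)
            return TLS_NOT_ALLOCATED;
        inst->base = calloc(1, inst->size);  /* stand-in for real allocation */
        if (inst->base == NULL)
            return TLS_NO_MEMORY;
    }
    *address_out = (uintptr_t)(inst->base + offset);
    return TLS_OK;
}
```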

<rationale italics>
Some implementations of C and C++ support a “__thread” storage
class.  Variables having this storage class occupy distinct memory
locations in distinct threads.  For example, the C declaration:

       __thread int foo;

defines an integer variable named “foo” which has a separate value
and address in each thread, much as a variable declared “auto” has a
separate value and address in each invocation of the function
containing its declaration.  Creating a new thread creates a new
instance of “foo”, and when the thread exits, the storage for
“foo” is freed.

Typically, a program includes an “initialization image” --- a block
of memory containing the initial values for any thread-local variables
it defines.  When the program creates a new thread, the run-time
system allocates a fresh block of memory for those thread-local
variables, and copies the initialization image into it to give the
variables their initialized values.

A dynamically loaded library may also define thread-local variables.
Some implementations delay allocating memory for such variables for
each thread until the thread actually refers to them for the first
time.  This avoids the overhead of allocating and initializing the
library's thread-local storage for all the threads present in a
program when the library is loaded, in cases where only a few threads
actually use the library.

However, when an implementation allocates thread-local storage on
demand, it becomes hard to describe the location of a thread-local
variable using ordinary DWARF expressions: referencing the storage may
entail allocating memory, copying an initialization image into place,
registering the memory with the thread, and so on.  A dedicated
operation like DW_OP_push_tls_address delegates the request to the
debugger, which is presumably already familiar with the program's ABI
and thread system, and can handle the request appropriately.
</rationale italics>

[To Figure 22 add the following row after DW_OP_call_ref and before
DW_OP_lo_user]

Operation              Code  No. of operands  Notes
DW_OP_push_tls_address 0x9b  0                [Empty]

=============================================================
5/17/2005:  Accepted with modifications:

[To section 2.4.1.3, "Stack Operations", add the following paragraphs
after DW_OP_push_object_address]

12. DW_OP_form_tls_address

The DW_OP_form_tls_address operation pops a value from the stack,
translates it into an address in the current thread's thread-local
storage block, and pushes the address.  If the DWARF expression
containing the DW_OP_form_tls_address operation belongs to the main
executable's DWARF info, the operation uses the main executable's
thread-local storage block; if the expression belongs to a shared
library's DWARF info, then it uses that shared library's thread-local
storage block.
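
A consumer's handling of the operation might look like the sketch
below.  It evaluates only DW_OP_const1u and DW_OP_form_tls_address and
leaves the translation step to a callback, since that step depends on
the ABI and thread system.  The names eval_location, tls_translate_fn
and tls_translate_flat are hypothetical; tls_translate_flat simply
treats its context argument as the base of the current thread's block
for the module that owns the expression.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DW_OP_const1u          0x08  /* push a 1-byte unsigned constant */
#define DW_OP_form_tls_address 0x9b  /* code assigned by this proposal */

/* Hypothetical callback: translate an offset into an address in the
   current thread's block for the module owning the expression. */
typedef uintptr_t (*tls_translate_fn)(void *module_ctx, uintptr_t offset);

/* Example callback: module_ctx is the base address of the block. */
uintptr_t tls_translate_flat(void *module_ctx, uintptr_t offset)
{
    return (uintptr_t)((unsigned char *)module_ctx + offset);
}

/* Evaluate a location expression restricted to the two operations
   above.  Returns 0 and stores the top of the stack in *result on
   success, or -1 on a malformed or unsupported expression. */
int eval_location(const uint8_t *expr, size_t len,
                  tls_translate_fn translate, void *module_ctx,
                  uintptr_t *result)
{
    uintptr_t stack[16];
    size_t depth = 0, pc = 0;

    assert(expr != NULL && result != NULL);

    while (pc < len) {
        uint8_t op = expr[pc++];

        switch (op) {
        case DW_OP_const1u:
            if (pc >= len || depth == 16)
                return -1;
            stack[depth++] = expr[pc++];
            break;
        case DW_OP_form_tls_address:
            if (depth == 0 || translate == NULL)
                return -1;
            /* Pop the offset and push the translated address. */
            stack[depth - 1] = translate(module_ctx, stack[depth - 1]);
            break;
        default:
            return -1;  /* operation outside this sketch */
        }
    }
    if (depth == 0)
        return -1;
    *result = stack[depth - 1];
    return 0;
}
```

Selecting the main executable's block or a shared library's block then
amounts to passing a different module_ctx, according to which DWARF
info the expression came from.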

*Some implementations of C and C++ support a “__thread” storage
class.  Variables with this storage class have distinct values and
addresses in distinct threads, much as “auto” variables have
distinct values and addresses in each function invocation.

Typically, there is a single block of storage containing all __thread
variables declared in the main executable, and a separate block for
the variables declared in each dynamically loaded library.  Computing
the address of the appropriate block can be complex (in some cases,
the compiler emits a function call to do it), and difficult to
describe using ordinary DWARF location expressions.
DW_OP_form_tls_address leaves the computation to the consumer.*

[To Figure 22 add the following row after DW_OP_call_ref and before
DW_OP_lo_user]

Operation              Code  No. of operands  Notes
DW_OP_form_tls_address 0x9b  0                [Empty]