Issue 090109.1: Support C++0x new string literals

Author: Kendrick Wong
Champion: Kendrick Wong
Date submitted: 2009-01-09
Date revised:
Date closed:
Type: Enhancement
Status: Accepted
DWARF Version: 4

Background
----------

For a detailed description of the feature, please refer to:
http://en.wikipedia.org/wiki/C%2B%2B0x#New_string_literals
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2442.htm 


Overview
--------

C++0x will support three Unicode encodings: UTF-8, UTF-16 and UTF-32.
*char* is large enough to hold a UTF-8 code unit and can be used to
hold UTF-8 encoded characters.

Two new types are also introduced:

    * *char16_t* is large enough to hold a UTF-16 code unit and can be
      used to hold UTF-16 encoded characters.
    * *char32_t* is large enough to hold a UTF-32 code unit and can be
      used to hold UTF-32 encoded characters.
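The relationship between the three character types and their literal
prefixes can be sketched in C++ as follows. This is only an illustration:
the helper name code_units is hypothetical, and the checks rely on the
language guaranteeing minimum widths (at least 16 and 32 bits) rather than
exact sizes.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// The language guarantees minimum widths for the new types, not exact ones.
static_assert(sizeof(char16_t) * CHAR_BIT >= 16,
              "char16_t must hold a UTF-16 code unit");
static_assert(sizeof(char32_t) * CHAR_BIT >= 32,
              "char32_t must hold a UTF-32 code unit");

// Hypothetical helper: number of code units in a string literal,
// excluding the terminating null.
template <typename CharT, std::size_t N>
constexpr std::size_t code_units(const CharT (&)[N]) { return N - 1; }

// 'h' fits in one code unit in every encoding; a character outside the
// Basic Multilingual Plane needs a surrogate pair in UTF-16.
inline void check_code_units() {
    assert(code_units(u"h") == 1);
    assert(code_units(U"h") == 1);
    assert(code_units(u"\U0001F600") == 2);
    assert(code_units(U"\U0001F600") == 1);
}
```

The same character can therefore occupy a different number of elements
depending on the type used, which is why a debugger needs to know the
encoding as well as the size of the type.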

Purpose

    * New base type debug information entries are required to represent
      char16_t and char32_t.
    * The combination of the existing DW_AT_encoding and DW_AT_byte_size
      attributes can be used to describe character types of various sizes.
    * New debug information is needed to describe the encoding used for
      the character type. This can be achieved either by creating new
      values for DW_AT_encoding or by creating new attributes.
    * This proposal takes the first approach and describes a new
      DW_AT_encoding value for the UTF-8/16/32 encodings.


Proposed Changes to the DWARF Specification
-------------------------------------------

Purpose

    * New base type debug information entries are required to represent 
      char16_t and char32_t.
    * New debug information is needed to describe the encoding used for
      the character type.
    * This proposal describes a new base type encoding, DW_ATE_utf, for
      the UTF-8/16/32 encodings.

7.8 Base Type Encodings

DW_ATE_utf  0x10


5.1: Base Type Entries

*For example, the C++ type char16_t is represented by a base type entry
with a name attribute whose value is "char16_t", an encoding attribute
whose value is DW_ATE_utf and a byte size attribute whose value is 2.*


Appendix
D.9 UTF character type examples
 
char16_t chr_a = u'h';
char32_t chr_b = U'h';

1$:  DW_TAG_base_type
         DW_AT_name("char16_t")
         DW_AT_encoding(DW_ATE_utf)
         DW_AT_byte_size(2)
2$:  DW_TAG_base_type
         DW_AT_name("char32_t")
         DW_AT_encoding(DW_ATE_utf)
         DW_AT_byte_size(4)
3$:  DW_TAG_variable
         DW_AT_name("chr_a")
         DW_AT_type(reference to 1$)
4$:  DW_TAG_variable
         DW_AT_name("chr_b")
         DW_AT_type(reference to 2$)
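
A consumer could tell the two base types above apart by combining the
encoding with the byte size. The following C++ sketch shows one way to do
that. The BaseType struct, the UtfForm enum and the classify function are
hypothetical names used only for illustration, and treating a byte size of
1 as UTF-8 is an assumption that this proposal does not state.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Value proposed in this issue for section 7.8.
constexpr std::uint8_t DW_ATE_utf = 0x10;

// Hypothetical, simplified view of a DW_TAG_base_type entry.
struct BaseType {
    std::string name;         // DW_AT_name
    std::uint8_t encoding;    // DW_AT_encoding
    std::uint64_t byte_size;  // DW_AT_byte_size
};

enum class UtfForm { NotUtf, Utf8, Utf16, Utf32, Unknown };

// Map a base type entry to the UTF form a debugger might use for display.
inline UtfForm classify(const BaseType& t) {
    if (t.encoding != DW_ATE_utf) return UtfForm::NotUtf;
    switch (t.byte_size) {
        case 1: return UtfForm::Utf8;   // assumption, not part of this proposal
        case 2: return UtfForm::Utf16;  // char16_t in the example above
        case 4: return UtfForm::Utf32;  // char32_t in the example above
        default: return UtfForm::Unknown;
    }
}
```

With this sketch, the entry at 1$ classifies as UtfForm::Utf16 and the
entry at 2$ as UtfForm::Utf32.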

-- 
Revision history:
March 17, 2009 -- Add DW_ATE_utf to describe UTF encoding
April 20, 2009 -- Re-format DWARF description in Example section to 
                  match current documentation.