DWARF Standard



090109.1 Kendrick Wong Support C++0x new string literals Enhancement Accepted Kendrick Wong


Background
----------

For a detailed description of the feature, please refer to:
http://en.wikipedia.org/wiki/C%2B%2B0x#New_string_literals
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2442.htm 


Overview
--------

C++0x will support three Unicode encodings: UTF-8, UTF-16 and UTF-32.
char is large enough to hold a UTF-8 code unit and can be used to
store UTF-8 encoded text.

Two new types are also introduced:

    * char16_t is large enough to hold a UTF-16 code unit and can be
      used to store UTF-16 encoded text.
    * char32_t is large enough to hold a UTF-32 code unit and can be
      used to store UTF-32 encoded text.
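
As a brief illustration of the two types, the following C++11 sketch shows
the u and U literal prefixes and the size relations the language guarantees.
The helper code_units is a hypothetical name introduced only for this
example; it is not part of the proposal.

```cpp
#include <cstddef>
#include <cstdint>

// char16_t and char32_t are distinct types with the same size as
// uint_least16_t and uint_least32_t respectively.
static_assert(sizeof(char16_t) == sizeof(std::uint_least16_t), "char16_t size");
static_assert(sizeof(char32_t) == sizeof(std::uint_least32_t), "char32_t size");

// One character literal of each new type.
constexpr char16_t chr_a = u'h';   // UTF-16 code unit
constexpr char32_t chr_b = U'h';   // UTF-32 code unit

// String literals with the new prefixes; each array includes a terminator.
constexpr char16_t str16[] = u"hi";
constexpr char32_t str32[] = U"hi";

// Hypothetical helper: number of code units in a literal array,
// excluding the terminating null.
template <typename CharT, std::size_t N>
constexpr std::size_t code_units(const CharT (&)[N]) { return N - 1; }
```

On common platforms sizeof(char16_t) is 2 and sizeof(char32_t) is 4, which
is what the byte size attributes in this proposal's examples assume.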

Purpose
    * New base type debug information entries are required to represent
      char16_t and char32_t.
    * The combination of the existing DW_AT_encoding and DW_AT_byte_size
      attributes can be used to describe character types of various sizes.
    * New debug information is needed to describe the encoding used for 
      the character type. This can be achieved either by creating new 
      values for DW_AT_encoding or by creating new attributes.
    * This proposal takes the first approach: it defines a new
      DW_AT_encoding value for the UTF-8/16/32 encodings.


Proposed Changes to the DWARF Specification
-------------------------------------------

Purpose

    * New base type debug information entries are required to represent 
      char16_t and char32_t.
    * New debug information is needed to describe the encoding used for
      the character type.
    * This proposal describes a new base type encoding, DW_ATE_utf, for
      the UTF-8/16/32 encodings.

7.8 Base Type Encodings

DW_ATE_utf  0x10


5.1 Base Type Entries

For example, the C++ type char16_t is represented by a base type entry
with a name attribute whose value is "char16_t", an encoding attribute
whose value is DW_ATE_utf and a byte size attribute whose value is 2.
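
A consumer could combine the encoding and byte size attributes to infer
which Unicode encoding form a base type uses. The following is a minimal
sketch under stated assumptions: the BaseType struct and the utf_form and
check_examples functions are hypothetical names for illustration and do not
come from any DWARF library; only the DW_ATE_utf value is taken from this
proposal.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// DW_ATE_utf value proposed in section 7.8 of this issue.
constexpr std::uint8_t DW_ATE_utf = 0x10;

// Hypothetical, simplified view of a base type entry's attributes.
struct BaseType {
    std::string name;         // DW_AT_name
    std::uint8_t encoding;    // DW_AT_encoding
    std::uint64_t byte_size;  // DW_AT_byte_size
};

// Infer the Unicode encoding form from the encoding and byte size.
// Returns an empty string if the entry is not a recognised UTF type.
inline std::string utf_form(const BaseType& t) {
    if (t.encoding != DW_ATE_utf) return "";
    switch (t.byte_size) {
        case 1: return "UTF-8";
        case 2: return "UTF-16";
        case 4: return "UTF-32";
        default: return "";
    }
}

// Usage sketch: the char16_t entry described above, plus a char32_t entry.
inline void check_examples() {
    assert(utf_form(BaseType{"char16_t", DW_ATE_utf, 2}) == "UTF-16");
    assert(utf_form(BaseType{"char32_t", DW_ATE_utf, 4}) == "UTF-32");
}
```

Keying on the byte size rather than the type name keeps the sketch
independent of the source language's spelling of the type.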


Appendix
D.9 UTF character type examples
 
char16_t chr_a = u'h';
char32_t chr_b = U'h';

1$:  DW_TAG_base_type
         DW_AT_name("char16_t")
         DW_AT_encoding(DW_ATE_utf)
         DW_AT_byte_size(2)
2$:  DW_TAG_base_type
         DW_AT_name("char32_t")
         DW_AT_encoding(DW_ATE_utf)
         DW_AT_byte_size(4)
3$:  DW_TAG_variable
         DW_AT_name("chr_a")
         DW_AT_type(reference to 1$)
4$:  DW_TAG_variable
         DW_AT_name("chr_b")
         DW_AT_type(reference to 2$)
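
Given entries like these, a debugger might display chr_a and chr_b by
converting the stored value to UTF-8 for output. The sketch below is one
possible approach, not something this proposal mandates; to_utf8 and
check_display are hypothetical names introduced for the example.

```cpp
#include <cassert>
#include <string>

// Encode one Unicode code point as UTF-8. Returns an empty string for
// values that are not valid scalar values (surrogates or > 0x10FFFF).
inline std::string to_utf8(char32_t cp) {
    std::string out;
    if (cp >= 0xD800 && cp <= 0xDFFF) return out;
    if (cp > 0x10FFFF) return out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// A lone char16_t holds one UTF-16 code unit; widening it is enough for
// display unless it is a surrogate half, which to_utf8 rejects.
inline std::string to_utf8(char16_t unit) {
    return to_utf8(static_cast<char32_t>(unit));
}

// Usage sketch: the two variables from the example above.
inline void check_display() {
    char16_t chr_a = u'h';
    char32_t chr_b = U'h';
    assert(to_utf8(chr_a) == "h");
    assert(to_utf8(chr_b) == "h");
}
```

A full implementation would also need to combine surrogate pairs when
displaying char16_t strings; that is outside the scope of this sketch.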

-- 
Revision history:
March 17, 2009 -- Add DW_ATE_utf to describe UTF encoding
April 20, 2009 -- Re-format DWARF description in Example section to 
                  match current documentation.
 

