DWARF Standard



130211.1 | Kendrick Wong | Change default encoding of string literals to UTF8 | Enhancement | Rejected | Kendrick Wong


Affected sections in DWARF spec: all sections where 'names' are used.

Overview:

Prior to DWARF 5, the default encoding of all string literals within DWARF sections is ASCII
(ISO8859-1).  A compilation unit DIE may have an optional attribute, DW_AT_utf8, which indicates
that the encoding of all string literals is UTF8.
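
To illustrate the current behaviour, here is a minimal sketch of how a consumer might choose a decoding for one string. The function name `decode_dwarf_string` and the boolean `cu_declares_utf8` are hypothetical and are not part of any DWARF library. The sketch assumes the flag this page calls DW_AT_utf8 (published in the standard as DW_AT_use_UTF8) has already been looked up on the compilation unit DIE, and that ISO8859-1 is the fallback described above.

```python
def decode_dwarf_string(raw: bytes, cu_declares_utf8: bool) -> str:
    """Decode one DWARF string (terminating NUL already stripped).

    cu_declares_utf8 is assumed to reflect whether the enclosing
    compilation unit DIE carries the UTF8 flag attribute.
    """
    if cu_declares_utf8:
        # Replace malformed sequences instead of raising, so a wrong
        # flag from the producer does not abort the consumer.
        return raw.decode("utf-8", errors="replace")
    # Assumed default: ISO8859-1 maps every byte to exactly one
    # code point, so this decode cannot fail.
    return raw.decode("latin-1")


utf8_bytes = "café".encode("utf-8")       # b'caf\xc3\xa9'
print(decode_dwarf_string(utf8_bytes, True))
print(decode_dwarf_string(utf8_bytes, False))
```

With the flag set, the bytes decode to the intended name. Without it, the same bytes are read one byte per character and the accented letter turns into two unrelated characters.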

This proposal is to change the default encoding of all string literals to UTF8, thereby deprecating
DW_AT_utf8 in DWARF 5.

Proposal:

In DWARF 5, the default encoding of all string literals within DWARF is UTF8.  This affects all string 
literals within DWARF sections.  For example:
 - string literals in .debug_info
 - string literals in .debug_types
 - file, directory names in .debug_line
 - string literals in .debug_pubnames
 - string literals in .debug_pubtypes
 - string literals in .debug_str
 - etc.

While all the base ASCII characters share the same codepoints in UTF8, this is nevertheless an incompatible
change, because existing producers may encode string literals using other multibyte encodings, and these
literals will not be interpreted correctly under UTF8.
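
The incompatibility can be shown with a short sketch. The helper `is_valid_utf8` and the sample names are hypothetical, and Shift_JIS is only one example of an encoding an existing producer might use.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw is a well-formed UTF8 byte sequence."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Pure ASCII names are byte-for-byte identical under UTF8.
ascii_name = "main".encode("ascii")

# Names written by a producer using a non-UTF8 encoding.
latin1_name = "café".encode("latin-1")    # b'caf\xe9'
sjis_name = "変数".encode("shift_jis")     # two double-byte characters

for label, raw in [("ascii", ascii_name),
                   ("latin-1", latin1_name),
                   ("shift_jis", sjis_name)]:
    print(label, raw, is_valid_utf8(raw))
```

Only the ASCII name survives a switch to a UTF8 default unchanged. The other two are not well-formed UTF8, so a consumer that assumed UTF8 would either reject them or display replacement characters.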

Once the default encoding is changed, the attribute DW_AT_utf8 is no longer needed, and can be deprecated.

---
Rejected 3/19/2013 -- Committee decided to retain existing default encoding.  
Producers are encouraged to use UTF-8 and to specify DW_AT_utf8 in the CU.

