[Mono-bugs] [Bug 322164] ANSI strings are UTF-8 but should be in ANSI code page on Windows

bugzilla_noreply at novell.com bugzilla_noreply at novell.com
Fri May 16 07:48:00 EDT 2008


https://bugzilla.novell.com/show_bug.cgi?id=322164

User kornelpal at gmail.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=322164#c1


Kornél Pál <kornelpal at gmail.com> changed:

           What    |Removed                                         |Added
----------------------------------------------------------------------------
             Status|RESOLVED                                        |REOPENED
         OS/Version|All                                             |Windows
         Resolution|INVALID                                         |
            Summary|ANSI strings are UTF-8 but should be in native  |ANSI strings are UTF-8 but should be in ANSI
                   |encoding                                        |code page on Windows




--- Comment #1 from Kornél Pál <kornelpal at gmail.com>  2008-05-16 05:48:00 MST ---
I agree with mapping the ANSI encoding to UTF-8 on operating systems other than
Windows.

On Windows, however, we should use the ANSI code page of the operating system.
The ANSI code page is an operating-system-wide setting kept for compatibility
with legacy applications: Windows 9x lacked Unicode support, and a lot of NT
applications (even newly developed ones) use the ANSI code page rather than
Unicode.

The ANSI code page is returned by GetACP
(http://msdn.microsoft.com/en-us/library/ms776259.aspx), but CP_ACP can be
passed directly to functions that take a code page, such as
WideCharToMultiByte.
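
A minimal Python sketch of the proposed mapping (ANSI code page on Windows,
UTF-8 elsewhere). The helper names ansi_encoding, to_ansi and from_ansi are
hypothetical and are not part of Mono. Python's "mbcs" codec exists only on
Windows and converts through the system ANSI code page (CP_ACP), which is why
it stands in here for WideCharToMultiByte with CP_ACP.

```python
import sys


def ansi_encoding() -> str:
    """Return the codec used for "ANSI" strings (hypothetical helper)."""
    # "mbcs" is Windows-only and maps to the system ANSI code page (CP_ACP).
    # On every other platform the proposal is to treat ANSI as UTF-8.
    return "mbcs" if sys.platform == "win32" else "utf-8"


def to_ansi(text: str) -> bytes:
    """Convert a Unicode string to an "ANSI" byte string."""
    # Characters the code page cannot represent are replaced, not raised.
    return text.encode(ansi_encoding(), errors="replace")


def from_ansi(data: bytes) -> str:
    """Convert an "ANSI" byte string back to a Unicode string."""
    return data.decode(ansi_encoding(), errors="replace")
```

With this shape, callers never name a code page themselves; only
ansi_encoding decides per platform.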

Using UTF-8 on Windows instead of the ANSI code page results in wrong character
conversions, because on Windows a string usually means a string encoded in the
ANSI code page.
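
To illustrate the wrong conversion, here is a small Python sketch. It assumes
the ANSI code page is Windows-1252 (cp1252), which is only one possible
setting; the actual code page depends on the system configuration.

```python
# Assumption: the system ANSI code page is Windows-1252.
ANSI_CODE_PAGE = "cp1252"

name = "Kornél"

# What the runtime currently hands to native code: UTF-8 bytes.
utf8_bytes = name.encode("utf-8")

# What an ANSI-based native function makes of those bytes.
misread = utf8_bytes.decode(ANSI_CODE_PAGE)
print(misread)  # KornÃ©l

# What the native function expects: bytes in the ANSI code page.
ansi_bytes = name.encode(ANSI_CODE_PAGE)
print(ansi_bytes.decode(ANSI_CODE_PAGE))  # Kornél
```

The two-byte UTF-8 sequence for "é" is read as two separate ANSI characters,
so every non-ASCII character is corrupted.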

Also note that the runtime itself is affected. C runtime functions expect the
ANSI code page, while glib functions expect UTF-8. The way to go is to use
UTF-8 for char* and to call the Unicode (UTF-16) versions of Windows API
functions. The only thing that should be changed is to avoid C runtime
functions that call the Windows API: they are given a UTF-8 string, but the C
runtime converts it to UTF-16 using the ANSI code page, which can even result
in security problems.
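
The difference between the two conversion paths can be sketched in Python. The
function names are hypothetical, and cp1252 is assumed as the ANSI code page
for illustration only. to_wide models converting the UTF-8 char* explicitly
before calling a Unicode (W) API; crt_style_wide models a C runtime function
that treats the same bytes as ANSI.

```python
def to_wide(path_utf8: bytes) -> bytes:
    """Decode the bytes as UTF-8, then encode as UTF-16LE (intended path)."""
    return path_utf8.decode("utf-8").encode("utf-16-le")


def crt_style_wide(path_utf8: bytes, ansi_code_page: str = "cp1252") -> bytes:
    """Decode the same bytes as ANSI, then encode as UTF-16LE (wrong path)."""
    # errors="replace" keeps the sketch from raising on undefined bytes.
    text = path_utf8.decode(ansi_code_page, errors="replace")
    return text.encode("utf-16-le")


ascii_path = "C:\\temp\\file.txt".encode("utf-8")
accented_path = "C:\\temp\\café.txt".encode("utf-8")

# Pure ASCII names survive either path, which hides the bug in testing.
print(to_wide(ascii_path) == crt_style_wide(ascii_path))        # True

# Non-ASCII names come out as a different UTF-16 file name.
print(to_wide(accented_path) == crt_style_wide(accented_path))  # False
```

Because the name the runtime checked and the name the operating system
receives can differ, a check made on the UTF-8 string does not necessarily
apply to the file that is actually opened.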

