[Mono-dev] [PATCH] Add GetString to UnicodeEncoding 2.0 and modify some Encoding wrappers

Atsushi Eno atsushi at ximian.com
Mon Apr 10 08:48:27 EDT 2006


Hi,

Kornél Pál wrote:
> Hi,
> 
> Now I understand why UnicodeEncoding.GetBytes(string) is overridden in
> MS.NET 1.x but not in MS.NET 2.0.

> Encoding of MS.NET uses char[] to convert strings in all versions, and
> GetBytes(string) likewise calls an overload that takes char[].
> (This differs from Mono, which uses char* in 2.0.) I think MS
> realized that they should make GetBytes(string) a higher-level
> wrapper just like the other ones and call GetBytes(string, int, int,
> byte[], int), as the overridden method in UnicodeEncoding does.
> 
> But then they realized that this would break compatibility with MS.NET
> 1.x, so they dropped the modification to Encoding.GetBytes(string)
> but forgot to put back the override in UnicodeEncoding, so the 2.0
> UnicodeEncoding.GetBytes(string) is actually less efficient than in 1.0.
> 
> I updated the patch to call the right method in
> UnicodeEncoding.GetBytes(string).
> 
> Also note that Mono's Encoding uses the new unsafe methods in the
> GetBytes overload that takes a string, whereas MS.NET does not make
> this optimization and uses char[] instead. That is more efficient when
> the new unsafe methods are not overridden, as they convert pointers
> back to arrays by default. In addition, calling the same methods
> improves compatibility.

Umm, I don't think that should be our way to go. Creating a char[] in
GetBytes(string) is *obviously* inefficient. Since I actually added
pointer-based overrides in the I18N classes, there is no
"pointers-go-back-to-arrays" problem. This kind of "compatibility"
change rather harms Mono performance.
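
To make the cost concrete, here is a minimal sketch of the two routes
being compared. It is not the actual Mono or MS.NET source; the class
and method names (GetBytesSketch, ViaCharArray, ViaPointer) are
hypothetical, and it only uses the public Encoding overloads available
since 2.0. It needs to be compiled with /unsafe.

```csharp
using System;
using System.Text;

static class GetBytesSketch
{
    // Array-based route: ToCharArray() allocates and copies the whole
    // string before any encoding happens.
    public static byte[] ViaCharArray(Encoding enc, string s)
    {
        char[] chars = s.ToCharArray();
        return enc.GetBytes(chars, 0, chars.Length);
    }

    // Pointer-based route: pin the string and encode directly from its
    // character data, with no intermediate char[].
    public static unsafe byte[] ViaPointer(Encoding enc, string s)
    {
        fixed (char* src = s)
        {
            byte[] bytes = new byte[enc.GetByteCount(src, s.Length)];
            if (bytes.Length == 0)
                return bytes; // fixed on an empty array would yield a null pointer

            fixed (byte* dst = bytes)
            {
                enc.GetBytes(src, s.Length, dst, bytes.Length);
            }
            return bytes;
        }
    }
}
```

Both routes should produce the same bytes. The pointer route only
avoids the copy when the concrete encoding overrides the pointer-based
overloads; otherwise the base class converts the pointers back to
arrays, which is the case described in the quoted text.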

Without an exact, practical, clear and present danger of an
incompatibility problem, this change makes no sense to me. I'd love to
keep Mono from sucking in the name of compatibility, which is a broken
promise in .NET land anyway.

(You should also notice that the Encoding implementation in 2.0 is
totally different from 1.x. In .NET 2.0 the encodings are managed. In
1.x they are just Win32 API wrappers. They are incompatible anyway, for
example in having different fallback replacement characters.)

> (Note that all of this information was obtained by overriding Encoding
> and printing a notification to the console when a method is called.)
> 
> The updated patch is attached.

I wouldn't apply this new patch. If other Mono hackers apply it, I won't
stop them (but then it is very likely that I will stop helping with
Encoding improvements).

Atsushi Eno


