[Mono-dev] [PATCH] Add GetString to UnicodeEncoding 2.0 andmodifysome Encoding wrappers

Kornél Pál kornelpal at gmail.com
Mon Apr 10 09:53:02 EDT 2006


Now I haven't got enough time for a detailed reply (I'll later do that) but 
I have the following general opinion:

For a public abstract base class like Encoding I think it's important to use 
the same wrappers as MS.NET does because people probably assume this. And 
their code will behave differently on Mono otherwise if the only override 
some methods.

I18N encodings however are totally internal so nobody can override them (if 
obtained using GetEncoding) so they can do whatever we want.

I saw that you use pointer based conversion in MonoEncoding that is good and 
fast and should not be reverted and is not affected by the bad performance 
of Encoding. I think Encoding should be MS.NET compatible but MonoEncoding 
can override any methods of Encoding that are inefficient.

So I think we should "retard" Encoding to provide compatibility and update 
(or leave if it's already good enough) MonoEncoding for better performance.

Note that Encoding in MS.NET 2.0 is just as inefficient (if not more 
inefficient) as in 1.x but they created their own EncodingNLS (I guess this 
is that class) to address this problem. We could do the same using 
MonoEncoding.

Type type = Encoding.GetEncoding(1252).GetType();
while (type != null)
{
 Console.WriteLine(type.FullName);
 type = type.BaseType;
}

System.Text.SBCSCodePageEncoding
System.Text.BaseCodePageEncoding
System.Text.EncodingNLS
System.Text.Encoding
System.Object

Kornél

----- Original Message ----- 
From: "Atsushi Eno" <atsushi at ximian.com>
Cc: <mono-devel-list at lists.ximian.com>
Sent: Monday, April 10, 2006 2:48 PM
Subject: Re: [Mono-dev] [PATCH] Add GetString to UnicodeEncoding 2.0 
andmodifysome Encoding wrappers


> Hi,
>
> Kornél Pál wrote:
>> Hi,
>>
>> Now I understood why is UnicodeEncodig.GetBytes(string) overridden in 
>> MS.NET 1.x but not in MS.NET 2.0.
>
>> Encoding of MS.NET uses char[] to convert strings in all versions and the 
>> call an overload that takes char[] in GetBytes(string) as well. (This is 
>> a difference compared to Mono as it uses char* in 2.0.) And I think MS 
>> realized that the should make GetBytes(string) a higher level wrapper 
>> just like the other ones and call GetBytes(string, int, int, byte[], int) 
>> like the overridden method in UnicodeEncoding does.
>>
>> But then they realized that this would break compatibility with MS.NET 
>> 1.x so they dropped the modification done to Encodig.GetBytes(string) but 
>> forgot to put back the override in UnicodeEncoding so 2.0 
>> UnicodeEncodig.GetBytes(string) is actually less efficient than in 1.0.
>>
>> I updated the patch to call the right method in 
>> UnicodeEncodig.GetBytes(string).
>>
>> Also note that Encoding of Mono is using the new unsafe methods for 
>> GetBytes that takes string but MS.NET is not doing this optimization and 
>> is using char[] instead that is more efficient when the new unsafe 
>> methods are not overridden as they convert pointers back to arrays by 
>> default. In addition calling the same methods improves compatibility.
>
> Umm, I don't think that should be our way to go. Creating char[] in
> GetBytes(string) is *obviously* inefficient. Since I actually added
> pointer-based overrides in I18N classes, there is no 
> "pointers-goes-back-to-arrays" problem. This kind of "compatibility"
> change rather harms Mono performance.
>
> Without exact, practical, clear and present danger of incompatiblity
> problem, this change makes no sense to me. I'd love to make Mono not
> suck in the name of compatibility which is anyways broken promise in
> .NET land.
>
> (You should also notice that Encoding implementation in 2.0 is totally
> different than 1.x. In .NET 2.0 they are managed. In 1.x they are
> just WIN32API wrappers. They are anyways incompatible, for example
> having different fallback replacement characters.)
>
>> (Note that all of these information were obtained by overriding Encoding 
>> and printing notification to the console when a method is called.)
>>
>> The updated patch is attached.
>
> I wouldn't apply this new patch. If other mono hackers do that I won't
> stop (but then it is very likely to happen that I stop helping
> Encoding improvements anymore).
>
> Atsushi Eno
> _______________________________________________
> Mono-devel-list mailing list
> Mono-devel-list at lists.ximian.com
> http://lists.ximian.com/mailman/listinfo/mono-devel-list 




More information about the Mono-devel-list mailing list