[Mono-dev] [PATCH] Add GetString to UnicodeEncoding 2.0 and modify some Encoding wrappers

Atsushi Eno atsushi at ximian.com
Tue Apr 11 12:56:11 EDT 2006


I'm not interested in how your patch accomplishes MS.NET compatibility.
My question is simple: is your patch *good* for Mono?

using System;
using System.Diagnostics;
using System.IO;
using System.Text;

public class Test
{
        // args [0]: input file, args [1]: optional loop count (default 100)
        public static void Main (string [] args)
        {
                int loop = args.Length > 1 ? int.Parse (args [1]) : 100;
                string s = File.OpenText (args [0]).ReadToEnd ();
                Encoding e = Encoding.Unicode;
                Stopwatch sw = Stopwatch.StartNew ();
                for (int i = 0; i < loop; i++)
                        e.GetBytes (s);
                sw.Stop ();
                Console.WriteLine (sw.ElapsedMilliseconds);
        }
}

Before your patch:
mono ./unicode.exe ../../svn/mono/web/web/masterinfos/System.Web.xml

After the patch:
$ rundev2 mono ./unicode.exe 

Atsushi Eno

Kornél Pál wrote:
> Hi,
> I had some time and looked at all the encoding classes in I18N and in 
> System.Text.
> byte* and char* are only used in UnicodeEncoding and in GetByteCount and 
> GetBytes in I18N.
> This means that the #if NET_2_0 code that you don't want to 
> remove will cause a performance loss on profile 2.0 in System.Text, while 
> it will not improve performance in profile 1.1, as no such optimization is 
> done there.
> The solution is to use arrays in Encoding, which benefits simple, 
> old-fashioned encoding classes, and to override these methods to use 
> pointers in classes that implement their core functionality using unsafe 
> code.
> Encodings in System.Text (except UnicodeEncoding) use arrays, and I think 
> custom encodings created by users are array-based as well, so using 
> arrays in Encoding results in better performance. If custom encodings 
> use unsafe code, they will have to override other methods anyway, 
> because of MS.NET, to get the performance improvement.
> By overriding GetByteCount (string) and GetBytes (string) in 
> MonoEncoding, the performance improvement from unsafe code will be 
> preserved and, in addition, will be available in all profiles.
> MonoEncoding was already good, so I just added these two methods and 
> added the following code to the GetBytes methods:
> int byteCount = bytes.Length - byteIndex;
> if (bytes.Length == 0)
>         bytes = new byte [1];
> Some check is required because &bytes[0] will fail for zero-size arrays. 
> A "bytes.Length == byteIndex" check could avoid this (it was present in 
> only one of the methods), but it would prevent ArgumentException from 
> being thrown for output buffers that are too small. Creating a small 
> array adds little overhead, and an exception will probably be thrown 
> anyway because charCount is non-zero.
> I have attached an improved patch. Please review it.
> Kornél
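
For readers following the zero-size array discussion in the quoted message, here is a minimal sketch of that guard. It is not the actual MonoEncoding source: the class name ZeroLengthGuardSketch and the helper GetBytesCore are hypothetical, and the core simply writes UTF-16 little-endian bytes as a stand-in for a real pointer-based encoder. It has to be compiled with unsafe code enabled (for example, mcs -unsafe).

```csharp
using System;

// Hypothetical illustration only; not taken from MonoEncoding.
public static class ZeroLengthGuardSketch
{
        // Stand-in for a pointer-based encoder core (UTF-16 little-endian).
        // Throws ArgumentException when the output buffer is too small.
        static unsafe int GetBytesCore (char* chars, int charCount,
                                        byte* bytes, int byteCount)
        {
                if (charCount * 2 > byteCount)
                        throw new ArgumentException ("Insufficient space", "bytes");
                for (int i = 0; i < charCount; i++) {
                        bytes [2 * i] = (byte) chars [i];
                        bytes [2 * i + 1] = (byte) (chars [i] >> 8);
                }
                return charCount * 2;
        }

        // Array-based wrapper that applies the guard from the quoted message.
        public static unsafe int GetBytes (char [] chars, int charIndex, int charCount,
                                           byte [] bytes, int byteIndex)
        {
                if (chars == null)
                        throw new ArgumentNullException ("chars");
                if (bytes == null)
                        throw new ArgumentNullException ("bytes");
                if (charIndex < 0 || charIndex > chars.Length)
                        throw new ArgumentOutOfRangeException ("charIndex");
                if (charCount < 0 || charCount > chars.Length - charIndex)
                        throw new ArgumentOutOfRangeException ("charCount");
                if (byteIndex < 0 || byteIndex > bytes.Length)
                        throw new ArgumentOutOfRangeException ("byteIndex");

                // Compute the usable size first, so it stays 0 for an empty buffer.
                int byteCount = bytes.Length - byteIndex;
                // &bytes [0] fails on a zero-size array, so substitute a dummy one.
                if (bytes.Length == 0)
                        bytes = new byte [1];

                fixed (char* cptr = chars)
                fixed (byte* bptr = &bytes [0])
                        return GetBytesCore (cptr + charIndex, charCount,
                                             bptr + byteIndex, byteCount);
        }
}
```

Because byteCount is computed before the dummy array is substituted, the core still sees a zero-length output buffer and throws ArgumentException whenever charCount is non-zero. An early return on bytes.Length == byteIndex would suppress that exception.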

