[Mono-devel-list] StringBuilder patch

Torstensson, Patrik patrik.torstensson at intel.com
Mon Jan 12 15:47:01 EST 2004


This is true if the size of the buffers are the same, but that's not the
normal case. Remember that StringBuilder allocates a new string if the
current data in the string builder is 50% of capacity, that means if you
get one large file in your example string builder will work (almost) as
the old string builder.

But I agree that we can keep the current code and move it to another
namespace, are you up to it?

Cheers,
 Patri

-----Original Message-----
From: mono-devel-list-admin at lists.ximian.com
[mailto:mono-devel-list-admin at lists.ximian.com] On Behalf Of Ben Maurer
Sent: den 12 januari 2004 18:42
To: Torstensson, Patrik; lupus at ximian.com
Cc: mono-devel-list at lists.ximian.com
Subject: Re: [Mono-devel-list] StringBuilder patch

(I am at school, so i cant send from SF, can someone please get this on
the list).

One other thing we should do is to preserve the old source. The old
stringbuilder is actually VERY useful, and highly performant as a
`string buffer'. For example, if you allocate one static stringbuilder,
and use it multiple times, the cost of the extra slack on the end of
strings is more than what is lost by having the extra char [] buffer.

For example, in MCS you have to do something like:

static StringBuilder stringBuffer;

string ReadString (char ch) {
    do {
         if (ch == '\\')
              stringBuffer.Append (ReadEscapedChar ());
         stringBuffer.Append (ch);
    } while ((ch = ReadChar ()) != '"');
    string ret = stringBuffer.ToString ();
    stringBuffer.Length = 0;
    return ret;
}

In this case, it is MUCH more efficient to do it the old way (assuming
you have a pretty big number of strings.

So, it would be nice if we could publish the old source as a utility
class. We should rename it something like Mono.StringBuffer.

>>> Paolo Molaro <lupus at ximian.com> 01/12/04 11:31 AM >>>
On 12/12/03 Torstensson, Patrik wrote:
> diff -u -r1.108 object.h
> --- object.h	7 Dec 2003 13:12:09 -0000	1.108
> +++ object.h	12 Dec 2003 21:21:01 -0000
> @@ -112,9 +112,8 @@
>  
>  typedef struct {
>  	MonoObject object;
> -	gint32 capacity;
>  	gint32 length;
> -	MonoArray *chars;
> +	MonoString *str;
>  } MonoStringBuilder;
>  
>  typedef struct {
> Index: marshal.c
> ===================================================================
> RCS file: /cvs/public/mono/mono/metadata/marshal.c,v
> retrieving revision 1.130
> diff -u -r1.130 marshal.c
> --- marshal.c	9 Dec 2003 14:30:48 -0000	1.130
> +++ marshal.c	12 Dec 2003 21:21:05 -0000
> @@ -167,12 +167,12 @@
>  	l = strlen (text);
>  
>  	ut = g_utf8_to_utf16 (text, l, NULL, &items_written, &error);
> -
> -	if (items_written > sb->capacity)
> -		items_written = sb->capacity;
> +	
> +	if (items_written > sb->str->length)
> +		items_written = sb->str->length;

Add a comment there, something like:
	/* sb->str->length is the StringBuilder capacity */
or, maybe better, add a macro to the header file and use that:
#define mono_stringbuilder_capacity(sb) ((sb)->str->length)

> Index: System/String.cs
> ===================================================================
> RCS file: /cvs/public/mcs/class/corlib/System/String.cs,v
> retrieving revision 1.98
> diff -u -r1.98 String.cs
> --- System/String.cs	6 Dec 2003 16:54:59 -0000	1.98
> +++ System/String.cs	12 Dec 2003 23:43:26 -0000
[...]
> @@ -1197,7 +1197,26 @@
>  			ptr = p;
>  			return n;
>  		}
> -		
> +
> +		internal unsafe void InternalSetChar(int idx, char val) 
> +		{
> +			if ((UInt32) idx >= (UInt32) Length)

Use:
			if ((uint) idx >= (uint) length)

> diff -u -r1.26 StringBuilder.cs
> --- System.Text/StringBuilder.cs	17 Nov 2003 22:42:53
-0000	1.26
> +++ System.Text/StringBuilder.cs	12 Dec 2003 23:43:26 -0000
[...]
> -		public override string ToString() {
> -			if (thestring != null)
> -				return thestring;
> -			return (thestring = ToString(0, sLength));
> +		public override string ToString () 
> +		{
> +			if (_length == 0)
> +				return String.Empty;
> +
> +			if (null != _cached_str)
> +				return _cached_str;
> +
> +			// If we only have a half-full buffer we return
a new string.
> +			if (_length < (_str.Length / 2)) 

Use a shift directly here, instead of the division, so we don't have to
bother optimizing it.

> +			{
> +				_cached_str = _str.Substring(0,
_length);
> +				return _cached_str;
> +			}
> +
> +			_cached_str = _str;
> +			_str.InternalSetLength(_length);
> +
> +			return _str;
>  		}

Strings have the property that they are 0-terminated (so they can be
passed to P/Invoked methods without a copy). Maybe the best place to
ensure that is String.InternalSetLength(). The old impl got this for
free, since we always alloc strings with one char more than their
length, but in the stringbuilder there could be a non-zero char
left there from previous operations. 

Thanks. The patch can be committed (sorry it took a long time to review
it).

lupus

-- 
-----------------------------------------------------------------
lupus at debian.org                                     debian/rules
lupus at ximian.com                             Monkeys do it better
_______________________________________________
Mono-devel-list mailing list
Mono-devel-list at lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list

_______________________________________________
Mono-devel-list mailing list
Mono-devel-list at lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list




More information about the Mono-devel-list mailing list