[Mono-dev] How To Convince Mono To Allocate Big Arrays

Rodrigo Kumpera kumpera at gmail.com
Wed May 7 22:50:57 EDT 2008


Hi Luis,

To have your patch integrated, some changes are needed. First, we want to
default to 32bits sized arrays on 64bits machines, so your changes must be
conditionally compiled. To help with that some changed to your patch are
due. Next are some comments about it:

Instead of replacing guint32 for gsize, it's better to create a new type,
let's say array_size_t. This would reduce conditional compilation to fewer
places.

 /* helper macros to check for overflow when calculating the size of arrays
*/
-#define MYGUINT32_MAX 4294967295U
+#if (GLIB_SIZEOF_SIZE_T < 4 )
+#define MYGUINT32_MAX 0xFFFFFFFFUL
+#define MYGUINT_MAX MYGUINT32_MAX

This #if seens bogus, don't you mean "if ((GLIB_SIZEOF_SIZE_T == 4 )" as
mono never supported 16bits machines.
The macros can be unified by using MYGUINT_MAX and the 'array_size_t' type I
talked before. The definition of MYGUINT_MAX
should be put together in the same place we define 'array_size_t'. And we
could go with a meaningful name, don't you think?


-    if (CHECK_MUL_OVERFLOW_UN (n, elem_size))
-        mono_gc_out_of_memory (MYGUINT32_MAX);
+    if (CHECK_MUL_OVERFLOW_UN (n, elem_size)) {
+        g_print("CHECK_MUL_OVERFLOW_UN(%zd,%zd) failed\n",n, elem_size);
+        mono_gc_out_of_memory (MYGUINT_MAX);
+        }

If you find that keeping such debug code is really important, you should
follow the same pattern of the rest of the project. Take a look at how
DEBUG_IMT is used on object.c.


@@ -3548,34 +3559,30 @@
     /* A single dimensional array with a 0 lower bound is the same as an
szarray */
     if (array_class->rank == 1 && ((array_class->byval_arg.type ==
MONO_TYPE_SZARRAY) || (lower_bounds && lower_bounds [0] == 0))) {
         len = lengths [0];
-        if ((int) len < 0)
-            arith_overflow ();


Why are you removing overflow checks here?


@@ -562,6 +607,26 @@
     if (this->bounds == NULL)
         return this->max_length;

+    length = this->bounds [dimension].length;
+    if (length > G_MAXINT32)
+            mono_raise_exception (mono_get_exception_overflow ());
+
+    return length;
+}

Why throwing an exception here? I'm not sure this is the way to go,
unfortunately this is an area underspecified on ecma. Not that truncating is
a good option either.

Changes to amd64 code I'll leave to Zoltan.


Cheers,
Rodrigo


2008/5/5 Luis F. Ortiz <LuisOrtiz at verizon.net>:

> Back in September ("Big Arrays, Many Changes --- Request for Advice") I
> asked folks for advice on how to go about adding the capability to Mono to
> allocate arrays with more than Int32.MaxValue elements.  After playing
> around with it for a few months, I'm at the point where I have an
> implementation that mostly works, with a couple of warts which could
> probably be quickly fixed by someone who knows more than I do about Mono
> internals. I spoke with Miguel about these patches, and he encouraged me to
> post them to mono-dev as soon as I got them to pass the "make check" test
> suite.  So here I am a week later.
> I want to start by going over the changes themselves, what alternatives
> there might be to what I had done and what flaws I know to exist in the
> implementation.
>
> First off, though Microsoft chose for some reason NOT to implement large
> array allocation, the necessary APIs are in the .NET specification.  For
> example, in the System.Array class, we find:
>
> public long GetLongLength<http://msdn.microsoft.com/en-us/library/system.array.getlonglength.aspx>(int
> dimension);
> public long LongLength<http://msdn.microsoft.com/en-us/library/system.array.longlength.aspx> {
> get; }
> public static Array CreateInstance<http://msdn.microsoft.com/en-us/library/1z8w3at5.aspx>(Type
> elementType, params long[] lengths);
> public Object GetValue<http://msdn.microsoft.com/en-us/library/2zexc3z9.aspx>(long
> index);
> public void SetValue<http://msdn.microsoft.com/en-us/library/czx562xz.aspx>(Object
> value, long index);
> ... (other overloads omitted, but there)
> and we find that the Newarr<http://msdn.microsoft.com/en-us/library/system.reflection.emit.opcodes.newarr%28VS.71%29.aspx>  opcode
> takes a natural int or an Int32 as the length, so the bytecode level is
> ready too.
>
> Mono as of 1.2.6 already implemented most (all?) of the necessary
> interfaces in mcs/class/corlib/System/Array.cs, but they all cast down their
> long arguments down to integers as the underlying implementation was not
> there.
>
> So the first set of changes were to:
> * **mono/metadata/object.c*
> * **mono/metadata/object.h*
> * **mono/metadata/icall-def.h*
> * **mono/metadata/icall.c*
> * **mono/metadata/socket-io.c*
>
> In object.h I made three changes:
>
> 1) Changed MonoArrayBounds to use gsize instead of guint32 as the type for
> length and lower_bound,
> 2) Changed MonoArray to use gsize instead of guint32 as the type of
> max_length,
> 3) Changed the prototypes for mono_array_new(), mono_array_new_full(), and
> mono_array_new_specific() to
> take gsize's instead of guint32's for their size and bounds parameters.
> I.e.:
> MonoArray*
> -mono_array_new (MonoDomain *domain, MonoClass *eclass, guint32 n);
> +mono_array_new (MonoDomain *domain, MonoClass *eclass, gsize n);
>
>   MonoArray*
>  mono_array_new_full        (MonoDomain *domain, MonoClass *array_class,
> -                            guint32 *lengths, guint32 *lower_bounds);
> +                            gsize *lengths, gsize *lower_bounds);
>
>   MonoArray *
> -mono_array_new_specific            (MonoVTable *vtable, guint32 n);
> +mono_array_new_specific            (MonoVTable *vtable, gsize n);
>
> This is the first place an alternative shows up.  ¿Which type is better:
> gsize or gssize?  The unsigned type gsize better matches the type memory
> allocation functions use (size_t or some variant) and the existing guint32,
> but the signed type better matches the interface presented at the top level
> (i.e. CreateInstance).  I chose the unsigned alternative, but an argument
> could be made for the signed type.  Another alternative would be to create
> 64 bit versions of the mono_array_new(), mono_array_new_full(), and
> mono_array_new_specific() functions, but that seemed to be too much work.
>
> The changes in *object.c* add the implementations of the
> modified mono_array_new(), mono_array_new_full(), and
> mono_array_new_specific() functions.  There was some confusing #defines
> around MYGUINT32_MAX that I did not like, but rather than replace that
> cruft, I extended it.
>
> The changes in *icall-def.h* add two new method calls to the array
> object, CreateInstanceImpl64() and GetLongLength().  It might be possible to
> avoid the CreateInstanceImpl64() definition and make it an overload
> of CreateInstanceImpl() with long parameters, if I was sure on how to do
> that.
>
> The changes to *icall.c* tweak the implementation
> of ves_icall_System_Array_CreateInstanceImpl() to change the type of the
> sizes array and add the implementations
> of ves_icall_System_Array_CreateInstanceImpl64()
> and ves_icall_System_Array_GetLongLength().
>
> The change to *socket-io.c* was to tweak its usage of mono_array_new_full
> to use gssize instead of int for for the array of lengths.
>
> So all these changes get us to the point where the basic foundation
> is laid down, but there is still the JIT to contend with.  It requires a few
> more files to be changed:
> * **mono/mini/mini.c*
> * **mono/mini/jit-icalls.c*
> * **mono/mini/exceptions.cs*
>
> The changes in *mini.c* change the signature of mono_array_new
> and mono_array_new_specific to take "int" instead of "int32".
>
> The changes in *jit-calls.c* do the boring change of guint32's into
> gsize's.
>
> The changes in *exceptions.cs* split the test case for test_0_array_size
> into a 32 and 64 bit variant, because an allocation of Int32.MaxValue can
> succeed after these changes are applied.
>
> There was only one touch-up needed in the the C# compiler:  the GetLength
> property code special-inlined code-generation needed to be tweaked since it
> is now possible to get an array length that will not fit into an I4.
>  Changing *mcs/mcs/ecore.cs* and *mcs/mcs/expression.cs* to use *
> OpCodes.Conv_Ovf_I4* after *OpCodes.Ldlen* instead of *OpCodes.Conv_I4*  fixed
> that.
>
> Oh, yeah, and ALL the long method calls in *
> mcs/class/corlib/System/Array.cs* needed to be converted over to
> use CreateInstanceImpl64() and GetLongLength().  The SetValue() and
> GetValue() implementations still need work, but since there are no unit
> tests for those methods, I put them off.
>
> That gets us to the point where we can allocate a large array, but it does
> not let us index a large array.   I changed the following files to start the
> process of converting the indexing operations to do bounds checking against
> the now 32/64 bit length of arrays and to index using a 64/32 bit index:
> * **mono/mini/inssel-amd64.brg*
> * **mono/mini/mini-amd64.c*
> * **mono/mini/mini-ops.h*
> * **mono/mini/cpu-amd64.md*
>
> In *inssel-amd64.brg*, I changed the
> macro MONO_EMIT_NEW_AMD64_ICOMPARE_MEMBASE_REG to use a new
> opcode OP_AMD64_COMPARE_MEMBASE_REG_I8 instead
> of OP_AMD64_ICOMPARE_MEMBASE_REG and hacked CEE_LDELEMA to no longer use
>  the faster OP_X86_LEA because I was not sure how to generalize it.
> Perhaps MONO_EMIT_NEW_AMD64_ICOMPARE_MEMBASE_REG should be retired and
> explicit I4 and I8 versions substituted where appropriate.
>
> In *mini-amd64.c* I added the OP_AMD64_COMPARE_MEMBASE_REG_I8 opcode as
> being the same as OP_AMD64_ICOMPARE_MEMBASE_REG, except with an operand size
> of 8 bytes instead of 4.
>
> In *mini-ops.h* and *cpu-amd64.md* I added an entries
> for OP_AMD64_COMPARE_MEMBASE_REG_I8 and amd64_compare_membase_reg_i8.
>
> These changes seem to get 64 bit indexing working, and passed all the
> regression tests in 1.2.6, but in 1.9.1 a regression test
> called test_0_regress_75832() breaks.  It could be that the changes I made
> in mono/mini are incorrect.  I am sure the changes are incomplete, and I
> have not considered what to do to other architectures.
>
> Advice or assistance is greatly appreciated.
>
> Luis F. Ortiz
> Official Mono Modifier
> Interactive Supercomputing,  Inc.
>
> PS:  Here are the changes proper:
>
>
>
>
>
> _______________________________________________
> Mono-devel-list mailing list
> Mono-devel-list at lists.ximian.com
> http://lists.ximian.com/mailman/listinfo/mono-devel-list
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.ximian.com/pipermail/mono-devel-list/attachments/20080507/5eb7952e/attachment-0001.html 


More information about the Mono-devel-list mailing list