[Mono-dev] Some strange behavior of optimization options for JIT
Sergey Tikhonov
tsv at solvo.ru
Mon Dec 25 11:11:56 EST 2006
Sergey Tikhonov wrote:
>Hello,
>
>I am experiencing some strange problems when running the mini tests with
>the --optimize=all option. Here is an example:
>method to IR System.Collections.ArrayList:Add (object)
>converting (in B2: stack: 0) IL_0000: ldarg.0
>converting (in B2: stack: 1) IL_0001: ldfld 0x04000412
>converting (in B2: stack: 1) IL_0006: ldlen
>converting (in B2: stack: 1) IL_0007: conv.i4
>converting (in B2: stack: 1) IL_0008: ldarg.0
>converting (in B2: stack: 2) IL_0009: ldfld 0x04000411
>converting (in B2: stack: 2) IL_000e: bgt IL_0021
>converting (in B5: stack: 0) IL_0013: ldarg.0
>converting (in B5: stack: 1) IL_0014: ldarg.0
>converting (in B5: stack: 2) IL_0015: ldfld 0x04000411
>converting (in B5: stack: 2) IL_001a: ldc.i4.1
>converting (in B5: stack: 3) IL_001b: add
>converting (in B5: stack: 2) IL_001c: call 0x06001445
>mono_arch_get_inst_for_method: EnsureCapacity
>ALPHA: Will call Managed method with 1(1) params. RetType:
>MONO_TYPE_VOID(0x1)
>ALPHA: Param[0] - simple
>ALPHA: Param[1] - simple
>converting (in B4: stack: 0) IL_0021: ldarg.0
>converting (in B4: stack: 1) IL_0022: ldfld 0x04000412
>
>Here is the IR before optimizations:
>
>SSAPRE STARTS PROCESSING METHOD System.Collections.ArrayList:Add
>(object)
>BEFORE SSAPRE START
>CODE BLOCK 3 (nesting 0):
> (stind.i4 local[2] iconst[0])
> (bgt[B6B5] (compare (long_conv_to_i4 (ldlen (ldind.ref (long_add
>(ldind.i arg[0]) iconst[24])))) (ldind.i4 (long_add (ldind.i arg[0])
>iconst[16]))))
>CODE BLOCK 5 (nesting 0):
> (outarg_reg (add (ldind.i4 (long_add (ldind.i arg[0]) iconst[16]))
>iconst[1]))
> (outarg_reg (ldind.i arg[0]))
> voidcall[EnsureCapacity]
> br[B4]
>CODE BLOCK 4 (nesting 0):
>
>Here is the IR after optimizations:
>SSAPRE ENDS PROCESSING METHOD System.Collections.ArrayList:Add (object)
>remove_block_if_useless System.Collections.ArrayList:Add (object),
>removed BB6
>br removal triggered 5 -> 4
>BEFORE DECOMPSE START
>CODE BLOCK 3 (nesting 0):
> (stind.i4 local[2] iconst[0])
> (stind.i local[9] (long_add (ldind.i arg[0]) iconst[16]))
> (stind.i local[8] (long_add (ldind.i arg[0]) iconst[24]))
> (bgt[B4B5] (compare (long_conv_to_i4 (ldlen (ldind.ref (ldind.i
>local[8])))) (ldind.i4 (ldind.i local[9]))))
>CODE BLOCK 5 (nesting 0):
> (outarg_reg (add (ldind.i4 (ldind.i local[9])) iconst[1]))
> (outarg_reg (ldind.i arg[0]))
> voidcall[EnsureCapacity]
> nop
>CODE BLOCK 4 (nesting 0):
>
>As we can see, two locals were introduced to hold the "sum" results. In
>"mono_arch_allocate_vars", "mono_allocate_stack_slots_full" is called
>to allocate the locals and calculate their offsets. Somehow it doesn't
>give them distinct offsets:
>ALPHA: Locals start offset is 16(10)
>ALPHA: Locals size is 24(18)
>ALPHA: allocated local 2 to regoffset[0x1c(alpha_r15)]
>ALPHA: allocated local 3 to regoffset[0x18(alpha_r15)]
>ALPHA: allocated local 4 to regoffset[0x10(alpha_r15)]
>ALPHA: allocated local 8 to regoffset[0x20(alpha_r15)]
>ALPHA: allocated local 9 to regoffset[0x20(alpha_r15)]
>ALPHA: allocated local 11 to regoffset[0x20(alpha_r15)]
>ALPHA: reg_save_area_offset at 40(28)
>ALPHA: args_save_area_offset at 40(28)
>ALPHA: Stack size is 56(38)
>DUMP BLOCK 0:
>DUMP BLOCK 3:
> (stind.i4 regoffset[0x1c(alpha_r15)] iconst[0])
> (stind.i regoffset[0x20(alpha_r15)] (long_add (ldind.i
>regoffset[0x28(alpha_r15)]) iconst[16]))
> (stind.i regoffset[0x20(alpha_r15)] (long_add (ldind.i
>regoffset[0x28(alpha_r15)]) iconst[24]))
> (bgt[B4B5] (compare (long_conv_to_i4 (ldlen (ldind.ref (ldind.i
>regoffset[0x20(alpha_r15)])))) (ldind.i4 (ldind.i
>regoffset[0x20(alpha_r15)]))))
>DUMP BLOCK 5:
>
>Local vars 8, 9 and 11 are in the list of local vars, but their assigned
>offsets are the same, and the size allocated for locals is less than
>needed. :(
>
>I checked the other arch sources and don't see any magic I have to do.
>Any hints? I guess it should allocate them.
>
>
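The aliasing reported in the quoted log can be checked mechanically. Below is a minimal sketch, not part of Mono: the Slot type and both helper functions are hypothetical, and the slot sizes are assumptions inferred from the store opcodes and the neighbouring offsets (4 bytes for the i4 locals, 8 bytes for the pointer-sized ones on Alpha), since the log does not print sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One "ALPHA: allocated local N to regoffset[...]" line from the log. */
typedef struct {
    int local;        /* local variable index as printed in the log */
    unsigned offset;  /* frame offset relative to alpha_r15 */
    unsigned size;    /* ASSUMED slot size in bytes (not in the log) */
} Slot;

/* Offsets copied from the log; sizes are assumptions. */
static const Slot alpha_slots[] = {
    { 2, 0x1c, 4 }, { 3, 0x18, 4 }, { 4, 0x10, 8 },
    { 8, 0x20, 8 }, { 9, 0x20, 8 }, { 11, 0x20, 8 },
};

/* Two slots collide when their byte ranges intersect. */
static int slots_overlap(const Slot *a, const Slot *b)
{
    return a->offset < b->offset + b->size &&
           b->offset < a->offset + a->size;
}

/* Count colliding pairs; print each pair when verbose is non-zero. */
static int count_collisions(const Slot *s, size_t n, int verbose)
{
    int hits = 0;
    assert(s != NULL);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            if (slots_overlap(&s[i], &s[j])) {
                hits++;
                if (verbose)
                    printf("local %d and local %d share offset 0x%x\n",
                           s[i].local, s[j].local, s[j].offset);
            }
        }
    }
    return hits;
}
```

Under these assumed sizes, calling count_collisions(alpha_slots, 6, 1) reports three colliding pairs, all among locals 8, 9 and 11 at 0x20, while locals 2, 3 and 4 do not overlap anything.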
After some investigation I've found that it is the SSAPRE optimization
that fails. Surprisingly, it does work on AMD64,
although there should not be any arch dependency. The new temporary
registers "created" by the SSAPRE optimizer don't
get correct live info in the liveness pass in liveness.c. I can't
understand why. Locals 8 and 9 should live until the compare instruction.
Here is the liveness dump:
START 11 ffff0000 00000000
START 9 ffff0000 00000000
EXPIR 11 ffff0000 00000000 C0 R-1
START 8 ffff0000 00000000
EXPIR 9 ffff0000 00000000 C0 R-1
START 2 00010000 00030009
START 3 00030005 00030008
START 4 00030006 00030008
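For reference, a minimal sketch of how this dump can be read. It assumes that the two hex words on each line are the first-use and last-use positions of the variable, and that each position packs the basic block index in the high 16 bits and the instruction index in the low 16 bits. Both points are assumptions about the dump format, and the LiveRange type and helpers below are hypothetical names, not Mono's own.

```c
#include <stdint.h>
#include <stdio.h>

/* ASSUMED meaning of the two hex columns in the dump. */
typedef struct {
    uint32_t first_use;  /* position of the first use */
    uint32_t last_use;   /* position of the last use */
} LiveRange;

/* ASSUMED packing: block index in the high half, instruction in the low. */
static unsigned pos_block(uint32_t p) { return p >> 16; }
static unsigned pos_inst(uint32_t p)  { return p & 0xffffu; }

/* A range whose first use lies after its last use covers nothing. */
static int range_is_empty(LiveRange r)
{
    return r.first_use > r.last_use;
}

/* Two variables need separate storage only if both ranges are
 * non-empty and intersect. */
static int ranges_interfere(LiveRange a, LiveRange b)
{
    if (range_is_empty(a) || range_is_empty(b))
        return 0;
    return a.first_use <= b.last_use && b.first_use <= a.last_use;
}

/* Print one variable's range in a readable form. */
static void describe(int var, LiveRange r)
{
    if (range_is_empty(r))
        printf("var %d: empty range (no use recorded)\n", var);
    else
        printf("var %d: B%u:%u .. B%u:%u\n", var,
               pos_block(r.first_use), pos_inst(r.first_use),
               pos_block(r.last_use), pos_inst(r.last_use));
}
```

Read this way, locals 8, 9 and 11 (ffff0000 00000000) have empty ranges, so nothing ever interferes with them, whereas locals 3 and 4 have overlapping ranges inside block 3. If the stack slot allocator relies on these ranges, empty ranges would explain why the three new locals ended up at the same offset.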
Any hints on where to look?
Thank you,
--
Sergey Tikhonov
Head, R&D department
Solvo Ltd.
Saint-Petersburg, Russia
http://www.solvo.ru
tsv at solvo.ru