[Mono-dev] Mono.Simd AltiVec port

Thu Feb 11 18:44:00 EST 2010

The way to handle those situations is to have an arch decomposition pass that
converts MULPS into an XZERO + MULADD.
For bonus points, you can add to the arch peephole code to fuse MULPS +
ADDPS.
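
To make the peephole idea concrete, here is a minimal sketch of the fusion on
a toy instruction array. It is not Mono's IR or peephole API: PeepIns, the
PEEP_* opcodes and peep_fuse_muladd are hypothetical names made up for this
illustration, and the sketch assumes the product register is local to the
block being scanned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy opcodes for illustration only; not Mono's opcode table. */
enum { PEEP_NOP, PEEP_MULPS, PEEP_ADDPS, PEEP_MULADD };

typedef struct {
    int op;
    int dreg, sreg1, sreg2, sreg3;   /* virtual register numbers, -1 = unused */
} PeepIns;

/* True if vreg is read by any instruction in ins[from..count). */
static bool peep_is_read_later(const PeepIns *ins, size_t from, size_t count, int vreg)
{
    for (size_t i = from; i < count; i++) {
        if (ins[i].op == PEEP_NOP)
            continue;
        if (ins[i].sreg1 == vreg || ins[i].sreg2 == vreg || ins[i].sreg3 == vreg)
            return true;
    }
    return false;
}

/* Fuse   MULPS t <= a, b ; ADDPS d <= t, c   into   MULADD d <= a, b, c.
 * The MULPS slot is turned into a NOP. Returns the number of pairs fused.
 * Assumption: t is not live outside this instruction array. */
static int peep_fuse_muladd(PeepIns *ins, size_t count)
{
    int fused = 0;
    for (size_t i = 0; i + 1 < count; i++) {
        PeepIns *mul = &ins[i];
        PeepIns *add = &ins[i + 1];
        if (mul->op != PEEP_MULPS || add->op != PEEP_ADDPS)
            continue;

        int t = mul->dreg;
        int other;
        if (add->sreg1 == t && add->sreg2 != t)
            other = add->sreg2;
        else if (add->sreg2 == t && add->sreg1 != t)
            other = add->sreg1;
        else
            continue;                /* product not used exactly once by the add */

        /* Conservative check: the product must not be needed after the add,
         * unless the add itself overwrites t. */
        if (add->dreg != t && peep_is_read_later(ins, i + 2, count, t))
            continue;

        *add = (PeepIns){ PEEP_MULADD, add->dreg, mul->sreg1, mul->sreg2, other };
        *mul = (PeepIns){ PEEP_NOP, -1, -1, -1, -1 };
        fused++;
    }
    return fused;
}
```

Note that a fused multiply-add may round differently from a separate multiply
followed by an add, so whether this fusion is acceptable depends on the
floating-point semantics you need to preserve.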

For an example of that, take a look at mini-x86.c /
mono_arch_decompose_opts.
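
As a rough illustration of the decomposition (a toy model, not the code in
mini-x86.c), the sketch below rewrites MULPS into a zeroing op plus MULADD
using a freshly allocated virtual register for the zero. Because the zero
lives in its own register, the D == S1 and D == S2 cases are harmless. ToyIns,
ToyBlock, toy_alloc_xreg and the TOY_* opcodes are hypothetical names, and the
sketch assumes the pass runs on virtual registers before register allocation.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_INS 16

/* Toy opcodes; the names echo the thread but this is not Mono's IR. */
enum { TOY_XZERO, TOY_MULPS, TOY_MULADD };

typedef struct {
    int op;
    int dreg, sreg1, sreg2, sreg3;  /* virtual register numbers, -1 = unused */
} ToyIns;

typedef struct {
    ToyIns ins[TOY_MAX_INS];
    size_t count;
    int next_vreg;                  /* next unused virtual register */
} ToyBlock;

/* Hypothetical stand-in for a JIT helper that hands out a new vector vreg. */
static int toy_alloc_xreg(ToyBlock *bb)
{
    return bb->next_vreg++;
}

/* Rewrite every   MULPS d <= s1, s2   into
 *     XZERO  t
 *     MULADD d <= s1, s2, t
 * where t is a fresh vreg, so the zeroing never clobbers s1 or s2. */
static void toy_decompose_mulps(ToyBlock *bb)
{
    ToyIns out[TOY_MAX_INS];
    size_t n = 0;

    for (size_t i = 0; i < bb->count; i++) {
        ToyIns cur = bb->ins[i];
        if (cur.op == TOY_MULPS) {
            assert(n + 2 <= TOY_MAX_INS);
            int t = toy_alloc_xreg(bb);
            out[n++] = (ToyIns){ TOY_XZERO, t, -1, -1, -1 };
            out[n++] = (ToyIns){ TOY_MULADD, cur.dreg, cur.sreg1, cur.sreg2, t };
        } else {
            assert(n + 1 <= TOY_MAX_INS);
            out[n++] = cur;
        }
    }
    for (size_t i = 0; i < n; i++)
        bb->ins[i] = out[i];
    bb->count = n;
}

/* Tiny interpreter over 4-lane float registers, used to check the rewrite. */
static void toy_run(const ToyBlock *bb, float regs[][4])
{
    for (size_t i = 0; i < bb->count; i++) {
        const ToyIns *p = &bb->ins[i];
        for (int lane = 0; lane < 4; lane++) {
            switch (p->op) {
            case TOY_XZERO:
                regs[p->dreg][lane] = 0.0f;
                break;
            case TOY_MULPS:
                regs[p->dreg][lane] = regs[p->sreg1][lane] * regs[p->sreg2][lane];
                break;
            case TOY_MULADD:
                regs[p->dreg][lane] = regs[p->sreg1][lane] * regs[p->sreg2][lane]
                                      + regs[p->sreg3][lane];
                break;
            }
        }
    }
}
```

Feeding it a block containing MULPS v0 <= v0, v1 yields XZERO v2 followed by
MULADD v0 <= v0, v1, v2, and running both forms through toy_run gives the same
lanes in v0. The register allocator is then free to pick any physical register
for v2.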

Rodrigo

On Tue, Feb 9, 2010 at 11:57 AM, Sergei Dyshel <qyron.private at gmail.com> wrote:

> Hi,
> Now I'm stuck with another problem on PPC. For multiplication of floats,
> AltiVec has only a fused multiply-add instruction, which computes a*b+c.
> So in order to implement OP_MULPS I need to ensure c==0. The only solution
> that comes to mind is:
> XZERO D
> MULADD D <= S1, S2, D
>
> Where MULADD is the instruction and D, S1, S2 are ins->dreg, sreg1, sreg2.
> But this solution won't work in cases where S1=D or S2=D, since D would
> be zeroed before use. So two possibilities remain:
> 1) Make sure that D <> S1 and D <> S2, and then the previously mentioned
> solution will work.
> 2) Allocate an additional (vector) register for MULPS and somehow store it
> inside the MonoInst structure.
>
> What is the traditional way to do such things? I really need to solve this
> problem, any help will be greatly appreciated!
>
> Thanks,
> Sergei
>
>
> On Thu, Feb 4, 2010 at 02:59, Rodrigo Kumpera <kumpera at gmail.com> wrote:
>
>> Hi Sergei,
>>
>> On Tue, Feb 2, 2010 at 6:59 AM, Sergei Dyshel <qyron.private at gmail.com> wrote:
>>
>>> Hello all,
>>>
>>> I'm currently working on a PowerPC port of Mono which utilizes AltiVec
>>> SIMD instructions. During development I've encountered an alignment
>>> problem:
>>>
>>> As far as I understand from running Mono's JIT, stack-allocated
>>> Mono.Simd.Vector* types are always aligned on a 16-byte boundary, but
>>> global ones (such as static class members) aren't. This is not a problem
>>> for SSE, which has unaligned loads/stores, but AltiVec doesn't have them.
>>> Instead of implementing misaligned loads/stores for AltiVec, I think it's
>>> better to force alignment of global variables, as is done for the stack.
>>>
>>
>> No, the JIT doesn't align all Vector types to 16 bytes. There are places,
>> like the spill code, that
>> still don't do it correctly. Not a lot of work to get there, but still
>> not done.
>>
>>
>> If by global variables you mean statics, then making them properly aligned
>> is possible with some trickery.
>> The only alignment issue we can't currently fix is heap objects, due
>> to how our GC works.
>> Our new GC might eventually gain the ability to properly align such
>> objects, but this is something
>> for the far future.
>>
>>
>>
>>> Can somebody help me with that (e.g. point at relevant places in
>>> 'mini-ppc.c')?
>>>
>>
>> To fix the alignment of stack variables you need to mess with a bunch of
>> places:
>>
>> -The spill code from mini-codegen.c
>> -The var allocation code in mono_allocate_stack_slots (mini.c)
>>
>> To fix the static storage alignment you need to change the code that
>> allocates the statics area
>> to use the proper alignment.
>>
>> This is the same problem as with objects, since it uses a GC routine to
>> allocate the memory blob.
>> Fixing this requires going deep into the GC, which is not something
>> simple.
>>
>>
>>
>