[Mono-dev] Current status of global regalloc (regalloc2.c)

Zoltan Varga vargaz at gmail.com
Tue Sep 14 14:15:18 EDT 2010


Hi,

  Getting it to work correctly would take a lot of work, in addition to
porting it to ppc.

                         Zoltan

On Tue, Sep 14, 2010 at 8:05 PM, Sergei Dyshel <qyron.private at gmail.com> wrote:

> Hi Zoltan,
> I've tried LLVM in the past and even remember making it work on PowerPC (to
> some extent). Of course it gives much better performance, but it's too
> heavy-weight a back-end for my project and, besides, there are other reasons
> why I can't use LLVM.
>
> So I have to stay with 'mini' :-). I've made some tweaks to it, but its
> division of vregs into local and global still sometimes drastically decreases
> code performance by generating a lot of redundant moves.
>
> So I started to think that maybe the global allocator is the only 'cure'. How
> much work, do you think, would it take to make it work on PowerPC for the
> simple scenarios I described?
> --
> Regards,
> Sergei Dyshel
>
>
>
> On Tue, Sep 14, 2010 at 19:46, Zoltan Varga <vargaz at gmail.com> wrote:
>
>> Hi,
>>
>>   If you want better performance, I suggest looking at the llvm
>> backend:
>>
>> http://www.mono-project.com/Mono_LLVM
>>
>>                    Zoltan
>>
>> On Tue, Sep 14, 2010 at 3:58 PM, Sergei Dyshel <qyron.private at gmail.com> wrote:
>>
>>> Hi,
>>>
>>> What is the current status of this feature? The regalloc2.c file hasn't
>>> been substantially updated during the last couple of years. Some quotes
>>> from the file's comments section:
>>>
>>> Focus was on correctness and easy debuggability so *performance is bad*
>>>
>>>
>>> Bad relative to what? I've tried both schemes, current vs. globalra, on
>>> some computationally intensive kernels, and the globally allocated version
>>> always seems to run faster, even reaching 2x-3x speedups for floating-point
>>> kernels. Is this the expected behavior?
>>>
>>> Only works on amd64
>>>
>>>
>>> Is this still true? I'm actually interested in the 32-bit x86 and PowerPC
>>> back-ends. How much work is required to make globalra work for them too?
>>> There are relatively few places where globalra is used in 'mini-amd64.c', so
>>> it doesn't seem hard to port these changes to 'mini-x86.c'.
>>>
>>> In my project I only need to execute very simple, small computational
>>> kernels, with no arguments and no calls to other functions (hence no need to
>>> distinguish between callee/caller saved registers); only global static
>>> arrays are used.
>>>
>>> Any answers/comments are greatly appreciated!
>>> --
>>> Regards,
>>> Sergei Dyshel
>>>
>>
>