[Mono-dev] sse_mathfun convert

Sat Oct 17 10:15:06 EDT 2009

Your code does a number of things in a suboptimal way.
First, because of the calling convention Mono uses, a method call in the
middle of SIMD code causes all SIMD variables to be spilled to the stack.
This also makes passing a Vector4f 2x slower than passing a double, since
16 bytes have to be pushed onto the stack.

A few more comments inline.

On Thu, Oct 1, 2009 at 12:23 PM, jetthink <jetthink at gmail.com> wrote:

>
> Hi,
>  I have converted exp_ps(from http://gruntthepeon.free.fr/ssemath/) to
> Mono.
> using System;
> using Mono.Simd;
>
> public static class Myext{
>    public static unsafe Vector4i LogicalLeftShift(this Vector4i v1, int
> amount)
>    {
>        Vector4i res = new Vector4i();
>        int* a = (int*)&v1;
>        int* b =(int*)&res;
>        for (int i = 0; i < 4; ++i)
>            *b++ = (int)((uint)(*a++) << amount);
>        return res;
>    }
>

There are two issues here. First, by taking the address of a variable you
ruin any chance the JIT has to keep it in a register - this is a current
limitation of Mono's JIT.

Second, why can't you just use a regular vector shift? "v1 << amount" would
have the same result.
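
As a rough sketch of what that replacement could look like (the class and
method names below are made up for illustration, and the wrapper method
exists only so the snippet can be run on its own; in a hot loop the
operator would be written inline to avoid the call overhead described
above):

```csharp
using System;
using Mono.Simd;

public static class ShiftSketch
{
    // Hypothetical helper: shifts all four lanes with the built-in
    // Vector4i << operator instead of a pointer loop, so no address
    // of a local is taken.
    public static Vector4i ShiftLeft(Vector4i v, int amount)
    {
        return v << amount;
    }

    public static void Main()
    {
        Vector4i v = new Vector4i(1, 2, 3, 4);
        Vector4i r = ShiftLeft(v, 3);
        // Each lane is multiplied by 2^3.
        Console.WriteLine("{0} {1} {2} {3}", r.X, r.Y, r.Z, r.W); // prints "8 16 24 32"
    }
}
```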

>    public static unsafe Vector4ui LogicalLeftShift(this Vector4ui v1, int
> amount)
>    {
>        Vector4ui res = new Vector4ui();
>        uint* a = (uint*)&v1;
>        uint* b =(uint*)&res;
>        for (int i = 0; i < 4; ++i)
>            *b++ = ((uint)(*a++) << amount);
>        return res;
>    }
>

Just use the left shift operation here "v1 << amount".

>    public static unsafe Vector4f Cast2Vector4f(this Vector4i v1)
>    {
>        Vector4f res = new Vector4f();
>        int* a = (int*)&v1;
>        float* b = (float*)&res;
>        for (int i = 0; i < 4; ++i)
>            *b++ = ((float)(*a++));
>        return res;
>    }
>

Here you're hitting a limitation of Mono.Simd: it lacks an intrinsic to
perform this conversion using an SSE instruction. This has to be addressed.

>    static Vector4f v4sf_0p5 = new Vector4f(0.5f);
>    static Vector4ui v4sui_0x7f = new Vector4ui(0x7f);
>    static Vector4i v4si_0x7f = new Vector4i(0x7f);
>    static Vector4f v4sf_one = Vector4f.One;
>
>    static Vector4f v4sf_exp_hi = new Vector4f(88.3762626647949f);
>    static Vector4f v4sf_exp_lo = new Vector4f(-88.3762626647949f);
>    static Vector4f v4sf_cephes_LOG2EF = new Vector4f(1.44269504088896341f);
>    static Vector4f v4sf_cephes_exp_C1 = new Vector4f(0.693359375f);
>    static Vector4f v4sf_cephes_exp_C2 = new Vector4f(-2.12194440e-4f);
>
>    static Vector4f v4sf_cephes_exp_p0 = new Vector4f(1.9875691500E-4f);
>    static Vector4f v4sf_cephes_exp_p1 = new Vector4f(1.3981999507E-3f);
>    static Vector4f v4sf_cephes_exp_p2 = new Vector4f(8.3334519073E-3f);
>    static Vector4f v4sf_cephes_exp_p3 = new Vector4f(4.1665795894E-2f);
>    static Vector4f v4sf_cephes_exp_p4 = new Vector4f(1.6666665459E-1f);
>    static Vector4f v4sf_cephes_exp_p5 = new Vector4f(5.0000001201E-1f);
>    public static Vector4f ExpSSE(Vector4f x)
>    {
>        //Vector4f tmp =  Vector4f.Zero;
>        Vector4f fx = Vector4f.Zero;
>
>        Vector4i emm0;
>
>        x = VectorOperations.Min(x, v4sf_exp_hi);
>        x = VectorOperations.Max(x, v4sf_exp_lo);

You can use those methods as extension methods: "x.Min(v4sf_exp_hi)".
It might be faster to explicitly create the constant instead of loading it
from a static variable. This is especially true for single-argument
constructors.
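
A minimal sketch of both suggestions applied to the clamping step, using
the limits from your code (the class and method names are hypothetical,
and whether the inline constants are actually faster would need to be
measured):

```csharp
using System;
using Mono.Simd;

public static class ClampSketch
{
    // Hypothetical helper covering only the clamping step of ExpSSE.
    public static Vector4f ClampExpInput(Vector4f x)
    {
        // Min/Max called as extension methods, with the constants
        // constructed in place rather than read from static fields.
        x = x.Min(new Vector4f(88.3762626647949f));
        x = x.Max(new Vector4f(-88.3762626647949f));
        return x;
    }

    public static void Main()
    {
        Vector4f r = ClampExpInput(new Vector4f(100f, -100f, 0.5f, 1f));
        // The first two lanes are clamped to the limits; the last two
        // are already in range and pass through unchanged.
        Console.WriteLine("{0} {1} {2} {3}", r.X, r.Y, r.Z, r.W);
    }
}
```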

Thanks for doing this experiment. There is certainly some work left to do
on Mono.Simd, and this shows it.

Rodrigo