[Mono-dev] Mono SIMD Function Declarations and Timing

Thu Nov 20 21:13:01 EST 2008

Hello,

I wrote a small model editor called Golem3D.  Coded in C#, it uses OpenTK
for rendering using OpenGL.  I helped with some of the math library for
OpenTK and performed timing tests on the different function declarations.

Example:
  5.80    public static Vector4D operator+ (Vector4D v1, Vector4D v2);
  ----    static Vector4D Add(Vector4D v1, Vector4D v2);
  5.50    static Vector4D Add(ref Vector4D v1, ref Vector4D v2);
  ----    static void Add(Vector4D v1, Vector4D v2, out Vector4D result);
  1.17    static void Add(ref Vector4D v1, ref Vector4D v2, out Vector4D result);
  1.25    void Add(Vector4D v);
  1.00    void Add(ref Vector4D v);
  ----    void Add(Vector4D v, out Vector4D result);
  ----    void Add(ref Vector4D v, out Vector4D result);

It seems that most of the overhead is in the copying of the structure. 
Directly modifying a vector and passing in arguments with ref was the
fastest implementation.  (Tested on Windows with .NET)
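
As a rough sketch of three of the declarations compared above (the struct
layout and method bodies are assumptions for illustration, not the actual
OpenTK or Mono.Simd code):

```csharp
using System;

// Hypothetical stand-in for the Vector4D in the timing table.
public struct Vector4D
{
    public double X, Y, Z, W;

    public Vector4D(double x, double y, double z, double w)
    {
        X = x; Y = y; Z = z; W = w;
    }

    // Operator form: both operands are passed by value and the result is
    // returned by value, so whole structs are nominally copied in and out.
    public static Vector4D operator +(Vector4D v1, Vector4D v2)
    {
        return new Vector4D(v1.X + v2.X, v1.Y + v2.Y, v1.Z + v2.Z, v1.W + v2.W);
    }

    // ref/ref/out form: only references cross the call boundary.
    public static void Add(ref Vector4D v1, ref Vector4D v2, out Vector4D result)
    {
        result = new Vector4D(v1.X + v2.X, v1.Y + v2.Y, v1.Z + v2.Z, v1.W + v2.W);
    }

    // In-place form: modifies this vector directly and reads v by reference.
    public void Add(ref Vector4D v)
    {
        X += v.X; Y += v.Y; Z += v.Z; W += v.W;
    }
}

public static class DeclarationDemo
{
    public static void Main()
    {
        Vector4D a = new Vector4D(1, 2, 3, 4);
        Vector4D b = new Vector4D(10, 20, 30, 40);

        Vector4D c = a + b;                 // operator form
        Vector4D d;
        Vector4D.Add(ref a, ref b, out d);  // ref/ref/out form
        a.Add(ref b);                       // in-place form; a now holds the sum

        Console.WriteLine("{0} {1} {2}", c.X, d.X, a.X);
    }
}
```

All three forms compute the same sum; they differ only in how the operands
and the result cross the call boundary.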

While passing by ref is faster, sometimes you want to be able to call
"a.Add(new Vector(1,1,1,1))", which does not work with ref because a ref
argument must be a variable, not a temporary.  So I am of the opinion that
both declarations, with ref parameters and without, are needed.
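
A minimal sketch of offering both declarations side by side, assuming a
hypothetical Vector4D struct (C# allows overloads that differ only by ref
versus by value):

```csharp
using System;

// Hypothetical stand-in for the Vector4D discussed above.
public struct Vector4D
{
    public double X, Y, Z, W;

    public Vector4D(double x, double y, double z, double w)
    {
        X = x; Y = y; Z = z; W = w;
    }

    // Fast path: the caller must supply a variable.
    public void Add(ref Vector4D v)
    {
        X += v.X; Y += v.Y; Z += v.Z; W += v.W;
    }

    // Convenience path: accepts temporaries. The parameter v is a local
    // copy, so it can be forwarded to the ref overload.
    public void Add(Vector4D v)
    {
        Add(ref v);
    }
}

public static class OverloadDemo
{
    public static void Main()
    {
        Vector4D a = new Vector4D(0, 0, 0, 0);
        Vector4D step = new Vector4D(1, 1, 1, 1);

        a.Add(ref step);                  // resolves to the ref overload
        a.Add(new Vector4D(1, 1, 1, 1));  // resolves to the by-value overload
        // a.Add(ref new Vector4D(1, 1, 1, 1));  // does not compile: ref needs a variable

        Console.WriteLine(a.X);
    }
}
```

The by-value overload costs one extra copy, but keeps the call site short
where speed is not critical.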

Sometimes you want to say "a = b + c".  In the tests, this was the slowest
of all the declarations.  I do think it is beneficial to have when used in
non-repetitive tasks.

How is this affected by SIMD?  Judging by the generated assembly in the PDC
slides, SIMD gets rid of much of the overhead from copying the structure.
So in the case of Mono, this may not be an issue.  If you run the same code
under .NET (which currently does not have SIMD support), it would matter.

I would like to see timing tests of the different declarations with and
without SIMD.  Also I'm only showing the "Add" function here, and other
functions may accelerate further with SIMD.
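
A timing comparison of this kind could be sketched as below.  This is not
the harness used for the numbers above; the struct, the iteration count and
the method names are all assumptions, and the same program would have to be
run once with and once without SIMD support to answer the question.

```csharp
using System;
using System.Diagnostics;

// Hypothetical stand-in for the vector type under test.
public struct Vector4D
{
    public double X, Y, Z, W;

    public Vector4D(double x, double y, double z, double w)
    {
        X = x; Y = y; Z = z; W = w;
    }

    public static Vector4D operator +(Vector4D v1, Vector4D v2)
    {
        return new Vector4D(v1.X + v2.X, v1.Y + v2.Y, v1.Z + v2.Z, v1.W + v2.W);
    }

    public void Add(ref Vector4D v)
    {
        X += v.X; Y += v.Y; Z += v.Z; W += v.W;
    }
}

public static class AddTiming
{
    // Accumulates with the operator form ("a = b + c").
    public static Vector4D RunOperator(int iterations)
    {
        Vector4D sum = new Vector4D(0, 0, 0, 0);
        Vector4D step = new Vector4D(1, 1, 1, 1);
        for (int i = 0; i < iterations; i++) sum = sum + step;
        return sum;
    }

    // Accumulates with the in-place ref form.
    public static Vector4D RunRefAdd(int iterations)
    {
        Vector4D sum = new Vector4D(0, 0, 0, 0);
        Vector4D step = new Vector4D(1, 1, 1, 1);
        for (int i = 0; i < iterations; i++) sum.Add(ref step);
        return sum;
    }

    public static void Main()
    {
        const int iterations = 10000000;

        // Warm up so that JIT compilation is not part of the measurement.
        RunOperator(1000);
        RunRefAdd(1000);

        Stopwatch sw = Stopwatch.StartNew();
        Vector4D r1 = RunOperator(iterations);
        sw.Stop();
        long operatorMs = sw.ElapsedMilliseconds;

        sw = Stopwatch.StartNew();
        Vector4D r2 = RunRefAdd(iterations);
        sw.Stop();
        long refMs = sw.ElapsedMilliseconds;

        // Printing the results keeps the loops from being optimized away.
        Console.WriteLine("operator+: {0} ms (sum.X = {1})", operatorMs, r1.X);
        Console.WriteLine("Add(ref):  {0} ms (sum.X = {1})", refMs, r2.X);
    }
}
```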

Other Thoughts...

It would be nice to have a Vector3f.  With OpenGL you send a buffer of
vectors, and with Vector4f you have a third more data to send to the
graphics card.  I am not sure if this is doable, or if SIMD must work on a
full 128 bits.
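
To put numbers on the buffer size difference, here is a small sketch using
two hypothetical structs (Packed3f and Packed4f are illustrative names, not
Mono.Simd types):

```csharp
using System;
using System.Runtime.InteropServices;

// Three floats: 3 * 4 = 12 bytes per vertex.
[StructLayout(LayoutKind.Sequential)]
public struct Packed3f
{
    public float X, Y, Z;
}

// Four floats: 4 * 4 = 16 bytes per vertex, i.e. one full 128-bit lane.
[StructLayout(LayoutKind.Sequential)]
public struct Packed4f
{
    public float X, Y, Z, W;
}

public static class BufferSizeDemo
{
    // Bytes needed for a tightly packed buffer of vertexCount vertices.
    public static int BufferBytes(Type vertexType, int vertexCount)
    {
        return Marshal.SizeOf(vertexType) * vertexCount;
    }

    public static void Main()
    {
        const int vertexCount = 1000;
        Console.WriteLine(BufferBytes(typeof(Packed3f), vertexCount));
        Console.WriteLine(BufferBytes(typeof(Packed4f), vertexCount));
    }
}
```

For 1000 vertices that is 12000 bytes against 16000 bytes.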

I think I saw that SIMD only supports 128 bits, so a vector of 3 doubles
would not work.  Is there a way to make a vector of 3 or 4 doubles that
uses two SIMD instructions, i.e. a Vector4D that is actually 2 Vector2Ds?
(I know there are some issues, but thoughts?)
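
The layout I have in mind could be sketched like this.  Pair2D and
SplitVector4D are hypothetical names; Pair2D stands in for whatever
128-bit, 2-double type the SIMD library provides, and whether the JIT
would turn each half into a single SIMD instruction is exactly the open
question.

```csharp
using System;

// Hypothetical 2-double type: 2 x 64 bits fits one 128-bit register.
public struct Pair2D
{
    public double A, B;

    public Pair2D(double a, double b)
    {
        A = a; B = b;
    }

    public static Pair2D operator +(Pair2D p, Pair2D q)
    {
        return new Pair2D(p.A + q.A, p.B + q.B);
    }
}

// A 4-double vector stored as two 2-double halves.
public struct SplitVector4D
{
    public Pair2D Lo;  // holds X and Y
    public Pair2D Hi;  // holds Z and W

    public SplitVector4D(double x, double y, double z, double w)
    {
        Lo = new Pair2D(x, y);
        Hi = new Pair2D(z, w);
    }

    public SplitVector4D(Pair2D lo, Pair2D hi)
    {
        Lo = lo;
        Hi = hi;
    }

    // One add per half: two 128-bit operations instead of one 256-bit one.
    public static void Add(ref SplitVector4D v1, ref SplitVector4D v2,
                           out SplitVector4D result)
    {
        result = new SplitVector4D(v1.Lo + v2.Lo, v1.Hi + v2.Hi);
    }
}

public static class SplitDemo
{
    public static void Main()
    {
        SplitVector4D a = new SplitVector4D(1, 2, 3, 4);
        SplitVector4D b = new SplitVector4D(10, 20, 30, 40);
        SplitVector4D sum;
        SplitVector4D.Add(ref a, ref b, out sum);
        Console.WriteLine("{0} {1} {2} {3}", sum.Lo.A, sum.Lo.B, sum.Hi.A, sum.Hi.B);
    }
}
```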

While it is nice to have properties for the components of the vector, it
would also be nice to have the underlying fields exposed.  A property cannot
be passed by ref, and calling into the OpenGL functions is cleaner with
GL.Normal3(ref vector.X).
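
A small sketch of the difference, using a hypothetical FakeGL.Normal3 as a
stand-in for the real binding (the struct and class names here are
assumptions for illustration only):

```csharp
using System;

// Hypothetical normal vector with public fields.
public struct Normal
{
    public float X, Y, Z;  // fields are variables, so "ref n.X" is legal

    public Normal(float x, float y, float z)
    {
        X = x; Y = y; Z = z;
    }

    // A property wrapping the same data, for comparison.
    public float XProperty
    {
        get { return X; }
        set { X = value; }
    }
}

// Stand-in for a binding that takes a reference to the first component.
// A native call would read the following two floats from the same address;
// this fake only records the first one.
public static class FakeGL
{
    public static float LastFirstComponent;

    public static void Normal3(ref float first)
    {
        LastFirstComponent = first;
    }
}

public static class RefFieldDemo
{
    public static void Main()
    {
        Normal n = new Normal(0.5f, 1.0f, 0.0f);

        FakeGL.Normal3(ref n.X);             // compiles: X is a field
        // FakeGL.Normal3(ref n.XProperty);  // error CS0206: a property
        //                                   // cannot be passed as ref

        Console.WriteLine(FakeGL.LastFirstComponent);
    }
}
```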

Matrices and Quaternions?  If matrices are implemented, I'd like to see them
compatible with OpenGL's matrix format.

Sorry if I covered anything that has been brought up before.  I searched all
the SIMD threads and did not see any of this covered.

I plan to get the SIMD accelerated library and test some things out when I
find the time.  If there is anything I can do to help, from testing to
reviewing interfaces to documentation, let me know.

James