[Mono-dev] Mono.Simd - slower than the normal implementation
Alan McGovern
alan.mcgovern at gmail.com
Fri Nov 14 21:13:48 EST 2008
I found a bit of code in the SHA1 implementation which I thought was
ideal for SIMD optimisations. However, unless I resort to unsafe code,
it's actually substantially slower! I've attached three
implementations of the method here: the original, the safe SIMD and
the unsafe SIMD. The runtimes are as follows:
Original: 600ms
Unsafe Simd: 450ms
Safe Simd: 1700ms
Also, the method is always called with a uint[] of length 80.
Is this just the wrong place to be using SIMD? It seemed ideal because
I need 75% fewer XORs. If anyone has any ideas on whether SIMD could
actually be useful for this case or not, let me know.
Thanks,
Alan.
The original code is:
private static void FillBuff(uint[] buff)
{
    uint val;
    for (int i = 16; i < 80; i += 8)
    {
        val = buff[i - 3] ^ buff[i - 8] ^ buff[i - 14] ^ buff[i - 16];
        buff[i] = (val << 1) | (val >> 31);
        val = buff[i - 2] ^ buff[i - 7] ^ buff[i - 13] ^ buff[i - 15];
        buff[i + 1] = (val << 1) | (val >> 31);
        val = buff[i - 1] ^ buff[i - 6] ^ buff[i - 12] ^ buff[i - 14];
        buff[i + 2] = (val << 1) | (val >> 31);
        val = buff[i + 0] ^ buff[i - 5] ^ buff[i - 11] ^ buff[i - 13];
        buff[i + 3] = (val << 1) | (val >> 31);
        val = buff[i + 1] ^ buff[i - 4] ^ buff[i - 10] ^ buff[i - 12];
        buff[i + 4] = (val << 1) | (val >> 31);
        val = buff[i + 2] ^ buff[i - 3] ^ buff[i - 9] ^ buff[i - 11];
        buff[i + 5] = (val << 1) | (val >> 31);
        val = buff[i + 3] ^ buff[i - 2] ^ buff[i - 8] ^ buff[i - 10];
        buff[i + 6] = (val << 1) | (val >> 31);
        val = buff[i + 4] ^ buff[i - 1] ^ buff[i - 7] ^ buff[i - 9];
        buff[i + 7] = (val << 1) | (val >> 31);
    }
}
The unsafe SIMD code is:
public unsafe static void FillBuff(uint[] buffb)
{
    fixed (uint* buff = buffb) {
        Vector4ui e;
        for (int t = 16; t < buffb.Length; t += 4)
        {
            e = *((Vector4ui*)&(buff [t-16])) ^
                *((Vector4ui*)&(buff [t-14])) ^
                *((Vector4ui*)&(buff [t- 8])) ^
                *((Vector4ui*)&(buff [t- 3]));
            // The last load put the stale buff[t] into lane W; cancel it.
            e.W ^= buff[t];
            buff[t] = (e.X << 1) | (e.X >> 31);
            buff[t + 1] = (e.Y << 1) | (e.Y >> 31);
            buff[t + 2] = (e.Z << 1) | (e.Z >> 31);
            // '^' binds tighter than '|' in C#, so each rotate is
            // parenthesised before the two are XORed together.
            buff[t + 3] = ((e.W << 1) | (e.W >> 31)) ^
                          ((e.X << 2) | (e.X >> 30));
        }
    }
}
The safe SIMD code is:
public static void FillBuff(uint[] buff)
{
    Vector4ui e;
    for (int t = 16; t < buff.Length; t += 4)
    {
        e = new Vector4ui (buff [t-16], buff [t-15], buff [t-14], buff [t-13]) ^
            new Vector4ui (buff [t-14], buff [t-13], buff [t-12], buff [t-11]) ^
            new Vector4ui (buff [t-8], buff [t-7], buff [t-6], buff [t-5]) ^
            new Vector4ui (buff [t-3], buff [t-2], buff [t-1], buff [t-0]);
        // Lane W picked up the stale buff[t] above; cancel it.
        e.W ^= buff[t];
        buff[t] = (e.X << 1) | (e.X >> 31);
        buff[t + 1] = (e.Y << 1) | (e.Y >> 31);
        buff[t + 2] = (e.Z << 1) | (e.Z >> 31);
        // '^' binds tighter than '|' in C#, so each rotate is
        // parenthesised before the two are XORed together.
        buff[t + 3] = ((e.W << 1) | (e.W >> 31)) ^
                      ((e.X << 2) | (e.X >> 30));
    }
}
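
Both SIMD versions rely on the same fix-up for the fourth lane: buff[t + 3] depends on buff[t], which is only produced in the same step. Because a rotate distributes over XOR, and buff[t] is the XOR of the first lane rotated by 1, its contribution to buff[t + 3] is that same XOR rotated by 2. The following is only a sketch for checking that reasoning, not part of the attached implementations: the names ScheduleCheck, Rol, FillScalar, FillByFour and Matches are made up for illustration, plain uint variables stand in for the Vector4ui lanes so it runs without Mono.Simd, and the input words are arbitrary test values.

```csharp
using System;

static class ScheduleCheck
{
    // Rotate a 32-bit value left by n bits (0 < n < 32).
    static uint Rol(uint v, int n)
    {
        return (v << n) | (v >> (32 - n));
    }

    // Reference: one word per iteration, equivalent to the original loop.
    public static void FillScalar(uint[] w)
    {
        for (int t = 16; t < 80; t++)
            w[t] = Rol(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1);
    }

    // Four words per iteration; x, y, z, q play the role of lanes X, Y, Z, W.
    public static void FillByFour(uint[] w)
    {
        for (int t = 16; t < 80; t += 4)
        {
            uint x = w[t - 16] ^ w[t - 14] ^ w[t - 8] ^ w[t - 3];
            uint y = w[t - 15] ^ w[t - 13] ^ w[t - 7] ^ w[t - 2];
            uint z = w[t - 14] ^ w[t - 12] ^ w[t - 6] ^ w[t - 1];
            // The fourth lane would need w[t], which is not known yet,
            // so it is left out of the XOR here.
            uint q = w[t - 13] ^ w[t - 11] ^ w[t - 5];

            w[t] = Rol(x, 1);
            w[t + 1] = Rol(y, 1);
            w[t + 2] = Rol(z, 1);
            // w[t] == Rol(x, 1), so after the outer rotate it contributes Rol(x, 2).
            w[t + 3] = Rol(q, 1) ^ Rol(x, 2);
        }
    }

    // Fills two buffers from the same 16 arbitrary words and compares them.
    public static bool Matches()
    {
        uint[] a = new uint[80];
        uint[] b = new uint[80];
        for (int i = 0; i < 16; i++)
            a[i] = b[i] = (uint)i * 0x9E3779B9u + 1u;
        FillScalar(a);
        FillByFour(b);
        for (int i = 0; i < 80; i++)
            if (a[i] != b[i]) return false;
        return true;
    }

    static void Main()
    {
        Console.WriteLine("four-at-a-time matches scalar: " + Matches());
    }
}
```

A check like this only says the four-at-a-time formulation computes the same words as the scalar loop; it says nothing about speed, which still has to be measured against the real Vector4ui code.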