[Mono-dev] Issues with System.Random
Adrian Willenbücher
AWillenbuecher at gmx.de
Tue Mar 16 12:28:17 EDT 2010
Andreas Nahr wrote:
> I won't comment on the algorithm itself (keep in mind that the existing one
> already was replaced once with a "better" one which failed miserably in real
> world apps, so had to be reverted).
I tested a sequence of 68 million 32-bit values for randomness using the Diehard test suite. Of course this is only a
heuristic indicator that the sequence has good random characteristics, not a proof. However, the current implementation
fails some of those tests, i.e. there are cases where it exhibits poor random characteristics.
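For reference, the input file for Diehard can be produced along these lines. This is only a sketch; the seed and the
file name are arbitrary, and the raw bytes are dumped via NextBytes():

using System;
using System.IO;

class DiehardDump
{
    static void Main()
    {
        var rng = new Random(123456789);    // fixed seed so the run is reproducible
        var buffer = new byte[4096];
        long total = 68000000L * 4;         // 68 million 32-bit values, as raw bytes
        using (var output = File.Create("random.bin"))
        {
            for (long written = 0; written < total; written += buffer.Length)
            {
                rng.NextBytes(buffer);
                output.Write(buffer, 0, buffer.Length);
            }
        }
    }
}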
> Also a newsgroup as source doesn't sound reliable at all.
Indeed, but the algorithm is based on the paper "Xorshift RNGs" by George Marsaglia, published in Journal of Statistical
Software, 2003; http://www.jstatsoft.org/v08/i14/paper .
> But your patch adds errors for exceptions (which were mostly correct
> before).
What errors are you referring to? As far as I can see, all exceptions mandated by the specification of System.Random are
thrown (and tested).
By the way, I forgot to wrap the exception messages in Locale.GetText ().
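For illustration, the mandated exceptions can be pinned down with a test along these lines (the test name is made up):

[Test]
public void NextThrowsOnInvalidArguments()
{
    var rng = new Random();
    // maxValue must be >= 0
    Assert.Throws<ArgumentOutOfRangeException>(() => rng.Next(-1));
    // minValue must be <= maxValue
    Assert.Throws<ArgumentOutOfRangeException>(() => rng.Next(5, 1));
    // the buffer must not be null
    Assert.Throws<ArgumentNullException>(() => rng.NextBytes(null));
}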
> And the unit-tests are no "unit-tests". They don't test the
> implementation against the specification, they test the implementation
> against the implementation which is useless.
I disagree. The expected values were not generated by my implementation, but by this reference implementation of the
algorithm (exact 32-bit unsigned arithmetic, i.e. what "unsigned long" gives you when compiled for a 32-bit machine):

#include <stdint.h>

/* fixed default seeds */
static uint32_t x = 123456789, y = 362436069, z = 521288629,
                w = 88675123, v = 886756453;

uint32_t xorshift(void)
{
    uint32_t t = x ^ (x >> 7);              /* xorshift the oldest word */
    x = y; y = z; z = w; w = v;             /* rotate the state */
    v = (v ^ (v << 6)) ^ (t ^ (t << 13));
    return (y + y + 1) * v;                 /* combine with an odd factor */
}
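For convenience, the same generator can also live inside the test assembly as a direct C# transliteration (uint
arithmetic wraps exactly like 32-bit unsigned arithmetic in C; the class and method names are of course made up):

static class XorshiftReference
{
    static uint x = 123456789, y = 362436069, z = 521288629,
                w = 88675123, v = 886756453;

    // One step of the reference generator; bit-for-bit identical to the C version above.
    public static uint Next()
    {
        uint t = x ^ (x >> 7);
        x = y; y = z; z = w; w = v;
        v = (v ^ (v << 6)) ^ (t ^ (t << 13));
        return (y + y + 1) * v;
    }
}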
You could insist on calculating the expected values by hand, using binary vectors and 160x160 matrices over GF(2), but
it is easy to prove that each xorshift corresponds to multiplication by a matrix of the form I + L^a or I + R^a (the
theory behind this RNG), so comparing against the output of the reference function is sufficient to show that the class
correctly implements the algorithm.
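If anyone wants to convince themselves of that correspondence, it can be checked mechanically. A sketch over single
32-bit words (each matrix row is stored as one uint, and a GF(2) dot product is just the parity of a bitwise AND):

using System;

class XorshiftMatrixCheck
{
    // Rows of the 32x32 matrix I + L^a over GF(2): applying it to a word y
    // must give exactly y ^ (y << a).
    static uint[] IdentityPlusLeftShift(int a)
    {
        var rows = new uint[32];
        for(var i = 0; i < 32; ++i)
        {
            rows[i] = 1u << i;              // identity part: output bit i sees input bit i
            if(i >= a)
                rows[i] |= 1u << (i - a);   // L^a part: output bit i also sees input bit i-a
        }
        return rows;
    }

    // Matrix-vector product over GF(2): bit i of the result is the parity
    // of the bitwise AND of row i with the vector.
    static uint Multiply(uint[] rows, uint v)
    {
        uint result = 0;
        for(var i = 0; i < 32; ++i)
        {
            var dot = rows[i] & v;
            dot ^= dot >> 16; dot ^= dot >> 8; dot ^= dot >> 4;
            dot ^= dot >> 2;  dot ^= dot >> 1;
            result |= (dot & 1u) << i;
        }
        return result;
    }

    static void Main()
    {
        var m = IdentityPlusLeftShift(13);
        var rng = new Random(42);
        for(var n = 0; n < 1000; ++n)
        {
            var y = (uint) rng.Next() ^ ((uint) rng.Next() << 16);
            if(Multiply(m, y) != (y ^ (y << 13)))
                throw new Exception("xorshift does not match the matrix product");
        }
        Console.WriteLine("y ^ (y << 13) == (I + L^13) * y for 1000 random words");
    }
}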
As for showing that the algorithm itself fulfills the specification: you can prove that the output is uniformly
distributed, and you can run statistical tests to convince yourself (to a certain degree) that the algorithm behaves
randomly. There is nothing more you can do, except run more statistical test suites on longer output sequences.
The old unit tests are actually rather poor: they don't test for a uniform distribution; the only test for the expected
value of the random variable (NextDouble ()) is commented out because it is unreliable; and no statistical properties
are tested at all. None of that can be done well in unit tests, so the best approach I can think of is to test the
algorithm extensively, and then verify that the implementation matches the algorithm.
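That said, a unit test could at least do a coarse uniformity check, e.g. a chi-squared test over a few buckets with a
generous threshold. A sketch (the test name, bucket count, sample size, and critical value are all arbitrary choices):

[Test]
public void NextIsRoughlyUniform()
{
    const int buckets = 16;
    const int samples = 160000;
    var counts = new int[buckets];
    var rng = new Random(123456789);    // fixed seed keeps the test deterministic
    for(var i = 0; i < samples; ++i)
        ++counts[rng.Next(buckets)];

    // chi-squared statistic against the uniform expectation
    var expected = (double) samples / buckets;
    var chi2 = 0.0;
    foreach(var c in counts)
        chi2 += (c - expected) * (c - expected) / expected;

    // 15 degrees of freedom; 37.7 is roughly the 0.1% critical value.
    // With the fixed seed the statistic is deterministic, so the test is not flaky.
    Assert.That(chi2, Is.LessThan(37.7));
}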
If you know a better approach, let me know :-)
> And moreover you removed ALL
> Random() constructor tests which most likely are the only of relevance to
> real-world applications.
Yes, I forgot this one. However, there's not much you can test for (except that it doesn't throw an exception): the
state is private, so it can't be checked directly; the value of Environment.TickCount might change between reading it
and calling new Random(); and the constructor is so simple that it hardly does anything. So how about:
[Test]
public void Constructor1()
{
    var rng = new Random();
    for(var i = 0; i < 100; ++i)
        Assert.That(rng.Next(), Is.GreaterThanOrEqualTo(0));
}
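One could also exercise the other overloads through the default constructor; for instance (test name invented):

[Test]
public void Constructor1Bounds()
{
    var rng = new Random();
    for(var i = 0; i < 100; ++i)
    {
        Assert.That(rng.Next(10), Is.InRange(0, 9));
        Assert.That(rng.NextDouble(), Is.InRange(0.0, 1.0));
    }
}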
Best regards,
Adrian