Monday, 15 February 2010

random - Avoiding Exact Repeats for Mersenne Twister in Python -



random - Avoiding Exact Repeats for Mersenne Twister in Python -

as many people know, python uses mersenne twister (mt) algorithm handle random numbers. however, despite having long period (~2^19937), well-known can't reach every random permutation when shuffle sequence greater 2080 elements (since !2081 > 2^19937). dealing permutations , statistical properties of import me, i'm trying figure out best way mix or re-seed python generator additional source of randomness avoid repetition.

currently, concept utilize scheme random number generator (systemrandom) add together external source of randomness mt generator. there 2 ways can think of this:

xor systemrandom random number mt random number use systemrandom reseed mt

the first approach used frequency hardware random number generators, cut down bias tendencies. however, it's wildly inefficient. on windows xp machine, systemrandom 50 times slower standard python random function. that's huge performance nail when of function involves shuffling. given that, reseeding mt systemrandom should more efficient.

however, there 2 issues approach also. firstly, reseeding mt during operation might disrupt statistical properties. i'm shouldn't issue if mt runs long enough, each run of mt values should well-formed (regardless of starting point). indicate sizable period between mt reseeding preferred. secondly, there question of efficient way trigger reseeding. simplest way handle counter. however, more efficient ways might possible.

so then, there 3 questions on point:

has read effect reseeding mt random value after every n samples alter desirable statistical properties? does know more efficient way incrementing counter trigger reseed? finally, if knows improve way approach problem, i'm ears.

reseeding not help you. jump somewhere else (very very) long mt sequence. are-you sure shuffling info gives biased result? because never have plenty hours in universe lifetime generate possible sequences. if know sequences can perchance never generated, doesn't mean generated sequences biased. guess best bet utilize shuffle command it.

if @ numpy.random.shuffle source code (line 4376), here algorithm used (i simplified clarity):

i = len(x) - 1 while > 0: j = randint(0, i) x[i], x[j] = x[j], x[i] = - 1

in other words, origin end, swaps value random value taken randomly before in array, until values swapped. final state not depends on random generator, on initial state of array. means in theory, should able visit permutations, if perform plenty shuffles.

python random seeding mersenne-twister

No comments:

Post a Comment