Friday, 15 August 2014

algorithm - Transform a set of large integers into a set of small ones -

How do I recode a set of strictly increasing (or strictly decreasing) positive integers P, so as to decrease the number of positive integers that can occur between the integers in our set?

Why I want this: I want to randomly sample P, but 1.) P is too big to enumerate, and 2.) members of P are related in a nonrandom way, but in a way that is too complicated to sample by. However, I know a member of P when I see it. I know P[0] and P[n], but can't entertain the thought of enumerating all of P or understanding exactly how members of P are related. Likewise, the number of possible integers occurring between P[0] and P[n] is many times greater than the size of P, making the chance of randomly drawing a member of P very small.

Example: Let P[0] = 2101010101 and P[n] = 505050505. Now, maybe we're only interested in integers between P[0] and P[n] that have a specific quality (e.g. all integers in P[x] sum to Q or less, each member of P has 7 or less as its largest integer). So, not all positive integers P[n] <= x <= P[0] belong to P. The P I'm interested in is discussed in the comments below.
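For concreteness, a membership test for a set like this might look like the sketch below. The threshold q=25 and the digit cap of 7 are illustrative assumptions, not the actual definition of P (which is only given in the comments):

```python
# Hypothetical membership test for a set P like the one described.
# The defaults q=25 and max_digit=7 are illustrative guesses, not the
# OP's actual definition of P.
def in_p(x, q=25, max_digit=7):
    """Say "yup, x belongs to P" if x's decimal digits sum to at most q
    and no single digit exceeds max_digit."""
    digits = [int(c) for c in str(x)]
    return sum(digits) <= q and max(digits) <= max_digit

print(in_p(2101010101))  # True: digits sum to 7, largest digit is 2
print(in_p(999))         # False: digits sum to 27
```

Whatever the real predicate is, it is exactly this "yup, that belongs to P" check that rejection sampling needs.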

What I've tried: If P is a strictly decreasing set and we know P[0] and P[n], then we can treat each member as if it were subtracted from P[0]. Doing so decreases each number, perhaps greatly, and keeps each member as a unique integer. For the P I'm interested in (below), one can treat each decreased value of P as being divided by a common denominator (9, 11, 99), which decreases the number of possible integers between members of P. I've found that, used in conjunction, these approaches decrease the set of all P[0] <= x <= P[n] by a few orders of magnitude, but the chance of randomly drawing a member of P from all positive integers P[n] <= x <= P[0] is still small.
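The shift-and-divide recoding described above can be sketched generically, using the gcd of the shifted values in place of the specific denominators (9, 11, 99):

```python
from functools import reduce
from math import gcd

def recode(p):
    """Shift a set of integers down by its minimum, then divide out the
    gcd of the shifted values. Both steps are reversible, so membership
    checks can be run in the much smaller recoded range."""
    base = min(p)
    shifted = [x - base for x in p]
    d = reduce(gcd, shifted) or 1   # guard: a single-element set gives gcd 0
    return [x // d for x in shifted], base, d

def decode(coded, base, d):
    """Invert recode()."""
    return [x * d + base for x in coded]

coded, base, d = recode([990, 297, 198, 99])
print(coded, base, d)  # [9, 2, 1, 0] 99 99
```

As the question notes, this shrinks the search range by a few orders of magnitude at best; it cannot exploit any deeper structure in P.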

Note: As should be clear, we have to know something about P. If we don't, that basically means we have no clue what we're looking for. When we randomly sample integers between P[0] and P[n] (recoded or not), we need to be able to say "Yup, that belongs to P.", if it indeed does.

A good answer would greatly increase the practical application of a computing algorithm I have developed. An example of the kind of P I'm interested in is given in comment 2. I am adamant about giving due credit.

While the original question is asking about a very generic scenario concerning integer encodings, I would suggest that it is unlikely there exists an approach that works in complete generality. For example, if the P[i] are more or less random (from an information-theoretic standpoint), I would be surprised if anything here would work.

So, instead, let us turn our attention to the OP's actual problem of generating partitions of an integer N containing exactly K parts. When encoding combinatorial objects as integers, it behooves us to preserve as much of the combinatorial structure as possible. For this, we turn to the classic text Combinatorial Algorithms by Nijenhuis and Wilf, specifically Chapter 13. In fact, in this chapter, they demonstrate a framework to enumerate and sample from a number of combinatorial families -- including partitions of N where the largest part is equal to K. Using the well-known duality between partitions with K parts and partitions whose largest part is K (take the transpose of the Ferrers diagram), we find that we only need to make a change to the decoding process.
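The duality is cheap to compute directly. A minimal sketch of the Ferrers-diagram transpose (the conjugate partition):

```python
def conjugate(partition):
    """Transpose the Ferrers diagram of a partition (given as a
    nonincreasing list of parts). A partition with exactly k parts maps
    to a partition whose largest part is k, and vice versa."""
    if not partition:
        return []
    # Row i of the transpose counts the parts taller than i.
    return [sum(1 for part in partition if part > i)
            for i in range(partition[0])]

# A partition of 20 with 5 parts <-> a partition of 20 with largest part 5
print(conjugate([7, 6, 5, 1, 1]))             # [5, 3, 3, 3, 3, 2, 1]
print(conjugate(conjugate([7, 6, 5, 1, 1])))  # back to [7, 6, 5, 1, 1]
```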

Anyways, here's the source code:

import sys
import random
import time

if len(sys.argv) < 4:
    sys.stderr.write("usage: {0} n k iter\n".format(sys.argv[0]))
    sys.stderr.write("\tn = number to be partitioned\n")
    sys.stderr.write("\tk = number of parts\n")
    sys.stderr.write("\titer = number of iterations (if iter=0, enumerate all partitions)\n")
    quit()

n = int(sys.argv[1])
k = int(sys.argv[2])
iters = int(sys.argv[3])

if n < k:
    sys.stderr.write("error: n<k ({0}<{1})\n".format(n, k))
    quit()

# b[m][j] = number of partitions of m with largest part equal to j
b = [[0 for j in range(k+1)] for i in range(n+1)]

def calc_b(n, k):
    for j in xrange(1, k+1):
        for m in xrange(j, n+1):
            if j == 1:
                b[m][j] = 1
            elif m - j > 0:
                b[m][j] = b[m-1][j-1] + b[m-j][j]
            else:
                b[m][j] = b[m-1][j-1]

def generate(n, k, r=None):
    path = []
    append = path.append

    # invalid input
    if n < k or n == 0 or k == 0:
        return []

    # pick a random number between 1 and b[n][k] if r is not specified
    if r is None:
        r = random.randrange(1, b[n][k]+1)

    # build the path from r
    while r > 0:
        if n == 1 and k == 1:
            append('n')
            r = 0                  # terminate the loop
        elif r <= b[n-k][k] and b[n-k][k] > 0:   # east/west move
            append('e')
            n = n-k
        else:                                    # north/south move
            append('n')
            r -= b[n-k][k]
            n = n-1
            k = k-1

    # decode the path into a partition
    partition = []
    l = 0
    d = 0
    append = partition.append
    for i in reversed(path):
        if i == 'n':
            if d > 0:              # apply accumulated east moves all at once
                for j in xrange(l):
                    partition[j] += d
                d = 0              # reset east moves
            append(1)              # apply north move
            l += 1
        else:
            d += 1                 # accumulate east moves
    if d > 0:                      # apply any remaining east moves
        for j in xrange(l):
            partition[j] += d
    return partition

t = time.clock()
sys.stderr.write("generating b table... ")
calc_b(n, k)
sys.stderr.write("done ({0} seconds)\n".format(time.clock()-t))

bmax = b[n][k]
bits = 0
sys.stderr.write("b[{0}][{1}]: {2}\t".format(n, k, bmax))
while bmax > 1:
    bmax //= 2
    bits += 1
sys.stderr.write("bits: {0}\n".format(bits))

if iters == 0:
    # enumerate all partitions
    for i in xrange(1, b[n][k]+1):
        print i, "\t", generate(n, k, i)
else:
    # generate random partitions
    t = time.clock()
    for i in xrange(1, iters+1):
        q = generate(n, k)
        print q
        if i % 1000 == 0:
            sys.stderr.write("{0} written ({1:.3f} seconds)\r".format(i, time.clock()-t))
    sys.stderr.write("{0} written ({1:.3f} seconds total) ({2:.3f} iterations per second)\n".format(
        i, time.clock()-t, float(i)/(time.clock()-t) if time.clock()-t else 0))

And here are some examples of performance (on a MacBook Pro 8.3, 2GHz i7, 4 GB RAM, Mac OSX 10.6.3, Python 2.6.1):

mhum$ python part.py 20 5 10
generating b table... done (6.7e-05 seconds)
b[20][5]: 84	bits: 6
[7, 6, 5, 1, 1]
[6, 6, 5, 2, 1]
[5, 5, 4, 3, 3]
[7, 4, 3, 3, 3]
[7, 5, 5, 2, 1]
[8, 6, 4, 1, 1]
[5, 4, 4, 4, 3]
[6, 5, 4, 3, 2]
[8, 6, 4, 1, 1]
[10, 4, 2, 2, 2]
10 written (0.000 seconds total) (37174.721 iterations per second)

mhum$ python part.py 20 5 1000000 > /dev/null
generating b table... done (5.9e-05 seconds)
b[20][5]: 84	bits: 6
100000 written (2.013 seconds total) (49665.478 iterations per second)

mhum$ python part.py 200 25 100000 > /dev/null
generating b table... done (0.002296 seconds)
b[200][25]: 147151784574	bits: 37
100000 written (8.342 seconds total) (11987.843 iterations per second)

mhum$ python part.py 3000 200 100000 > /dev/null
generating b table... done (0.313318 seconds)
b[3000][200]: 3297770929953648704695235165404132029244952980206369173	bits: 181
100000 written (59.448 seconds total) (1682.135 iterations per second)

mhum$ python part.py 5000 2000 100000 > /dev/null
generating b table... done (4.829086 seconds)
b[5000][2000]: 496025142797537184410324290349759736884515893324969819660	bits: 188
100000 written (255.328 seconds total) (391.653 iterations per second)

mhum$ python part-final2.py 20 3 0
generating b table... done (0.0 seconds)
b[20][3]: 33	bits: 5
1 	[7, 7, 6]
2 	[8, 6, 6]
3 	[8, 7, 5]
4 	[9, 6, 5]
5 	[10, 5, 5]
6 	[8, 8, 4]
7 	[9, 7, 4]
8 	[10, 6, 4]
9 	[11, 5, 4]
10 	[12, 4, 4]
11 	[9, 8, 3]
12 	[10, 7, 3]
13 	[11, 6, 3]
14 	[12, 5, 3]
15 	[13, 4, 3]
16 	[14, 3, 3]
17 	[9, 9, 2]
18 	[10, 8, 2]
19 	[11, 7, 2]
20 	[12, 6, 2]
21 	[13, 5, 2]
22 	[14, 4, 2]
23 	[15, 3, 2]
24 	[16, 2, 2]
25 	[10, 9, 1]
26 	[11, 8, 1]
27 	[12, 7, 1]
28 	[13, 6, 1]
29 	[14, 5, 1]
30 	[15, 4, 1]
31 	[16, 3, 1]
32 	[17, 2, 1]
33 	[18, 1, 1]

I'll leave it to the OP to verify that this code indeed generates partitions according to the desired (uniform) distribution.
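As a starting point for that verification, here is a sketch (Python 3, independent of the script above) that checks the b-table recurrence against a brute-force partition count for small inputs:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def b(m, j):
    """The recurrence from the script: number of partitions of m whose
    largest part equals j (b[m][1] = 1; b[m][j] = b[m-1][j-1] + b[m-j][j])."""
    if j < 1 or m < j:
        return 0
    if j == 1:
        return 1
    return b(m - 1, j - 1) + (b(m - j, j) if m - j > 0 else 0)

def brute_count(n, k):
    """Count partitions of n with largest part exactly k by fixing the
    first part to k and enumerating the remainder recursively."""
    def parts(remaining, max_part):
        if remaining == 0:
            return 1
        return sum(parts(remaining - p, p)
                   for p in range(min(remaining, max_part), 0, -1))
    return parts(n - k, k) if n >= k else 0

print(b(20, 5), brute_count(20, 5))  # 84 84 (matches b[20][5] above)
print(b(20, 3))                      # 33 (matches the enumeration above)
```

If uniformity itself is in doubt, one can also run the enumeration mode, sample many times, and compare the empirical frequency of each index against 1/b[n][k].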

EDIT: Added an example of the enumeration functionality.

algorithm combinatorics sampling random-sample number-theory
