Sunday, 15 September 2013

Using Ruby to replace numeric data using simple hashmap -



Using Ruby to replace numeric data using simple hashmap -

i'm trying come simple way using ruby scramble (or mask) numeric data, in order create dummy info set live data. want maintain info close original format possible (that is, preserve non-numeric characters). numbers in info correspond individual identification numbers, (sometimes) keys used in relational database. so, if numeric string occurs more once, want map consistently same (ideally unique) value. 1 time info has been scrambled, don't need able reverse scrambling.

i've created scramble function takes string , generates simple hash map numbers new values (the function maps numeric digits , leaves else is). added security, each time function called, key regenerated. thus, same phrase produce 2 different results each time function called.

module hashmodule def self.scramble(str) numhash ={} 0.upto(9) |i| numhash[i.to_s]=rand(10).to_s end output= string.new(str) output.gsub!(/\d/) do|d| d.replace numhash[d] end puts "input: " + str puts "hash key: " + numhash.to_s puts "output: " + output end end hashmodule.scramble("56609-8 no pct 001") hashmodule.scramble("56609-8 no pct 001")

this produces next output:

input: 56609-8 no pct 001 hash key: {"0"=>"9", "1"=>"4", "2"=>"8", "3"=>"9", "4"=>"4", "5"=>"8", "6"=>"4", "7"=>"0", "8"=>"2", "9"=>"1"} output: 84491-2 no pct 994 input: 56609-8 no pct 001 hash key: {"0"=>"2", "1"=>"0", "2"=>"9", "3"=>"8", "4"=>"4", "5"=>"5", "6"=>"7", "7"=>"4", "8"=>"2", "9"=>"0"} output: 57720-2 no pct 220

given info set:

pto no pc r5632893423 ip r566788882-001 no pct amb pto no amb/call ip a566788882 1655543aachm ip 56664320000000 00566333-1

i first extract numbers array. utilize scramble function created create replacement hash map, e.g.

{"5632893423"=>"5467106076", "566788882"=>"888299995", "001"=>"225", "1655543"=>"2466605", "56664320000000"=>"70007629999999", "00566333"=>"00699999", "1"=>"3"}

[incidentally, in example, haven't found way insist hash values unique, relevant in event string beingness mapped corresponds unique id in relationship database, described above.]

i utilize gsub on original string , replace hash keys scrambled value. code have works, i'm curious larn how can create more concise. realize regenerating key each time function called, create work. (otherwise, create 1 key replace digits).

does have suggestions how can accomplish way? (i'm new ruby, suggestions improving code received).

input = <<eos pto no pc r5632893423 ip r566788882-001 no pct amb pto no amb/call ip a566788882 1655543aachm ip 56664320000000 00566333-1 eos module hashmodule def self.scramble(str) numhash ={} 0.upto(9) |i| numhash[i.to_s]=rand(10).to_s end output= string.new(str) output.gsub!(/\d/) do|d| d.replace numhash[d] end homecoming output end end # extract unique non-null numbers input file numbers = input.split(/[^\d]/).uniq.reject{ |e| e.empty? } # create hash maps each number scrambled value # using function defined above mapper ={} numbers.map(&:to_s).each {|x| mapper[x]=hashmodule.scramble(x)} # create regexp find numbers in input file re = regexp.new(mapper.keys.map { |x| regexp.escape(x) }.join('|')) # replace numbers scrambled values puts input.gsub(re, mapper)

the above code produces next output:

pto no pc r7834913043 ip r799922223-772 no pct amb pto no amb/call ip a799922223 6955509aachm ip 13330271111111 66166777-6

maybe this:

module hashmodule scramblekey = hash[(0..9).map(&:to_s).zip((0..9).to_a.shuffle)] def self.scramble(str); str.gsub(/\d/){scramblekey[$&]} end end puts hashmodule.scramble(input)

which gives:

pto no pc r6907580170 ip r699455557-223 no pct amb pto no amb/call ip a699455557 3966610aachm ip 69991072222222 22699000-3

ruby hashmap scramble

No comments:

Post a Comment