Monday, 15 February 2010

Python - extracting columns from text files




I want to extract information from a file named data.txt based on the contents of a file named list.txt. I need to extract $11 from data.txt when $1 and $2 of list.txt are both present in data.txt. $2 of list.txt and $4 of data.txt are the same field.

Contents of list.txt:

    2aas p0877
    asds k9876
    651a kl098

Contents of data.txt:

    2aas f dnk_ectha q9xt6 12-208 192.0 250.0 198.0 104.00 78.80 99.0 108.0 97 5
    asds g dnk_drome k9876 12-209 192.0 250.0 197.0 100.00 78.80 87.0 100.0 97 6
    1ot3 h dnk_drome q9bt6 11-208 142.0 256.0 194.0 106.00 78.80 97.0 100.0 97 5
    651a h dnk_ectha kl098 10-208 192.0 259.0 197.0 100.00 78.80 98.0 100.0 99 5
    2aas h pyp_drome p0877 12-208 192.0 250.0 130.0 102.00 78.80 67.0 103.0 97 9

Desired output:

    2aas p0877 67.0
    asds k9876 87.0
    651a kl098 98.0

I'm assuming data.txt contains a list of info that you wish to "query" using the entries from list.txt.

Here's a quick and dirty approach using Python:

    # create the info dict using data.txt
    with open("data.txt") as f:
        # create a generator of entries using the non-empty lines in the file
        entries = (line.split() for line in f if line.strip())
        # create a dict using ($1, $4) as the key and $11 as the value
        info = dict(((d[0], d[3]), d[10]) for d in entries)

    # for each entry in list.txt, print out the matching info
    with open("list.txt") as f:
        entries = (tuple(line.split()) for line in f if line.strip())
        for e in entries:
            if e in info:
                print("%s %s %s" % (e[0], e[1], info[e]))

Running this in the same directory as the files gives:

    [me@home]$ python extract.py
    2aas p0877 67.0
    asds k9876 87.0
    651a kl098 98.0
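The quick approach assumes every non-empty row of data.txt has at least 11 columns. If that might not hold, the same lookup can be wrapped in two small functions that skip short rows. This is only a sketch: the names `build_index` and `lookup` and the made-up truncated row are illustrative, and it assumes whitespace-separated columns as in the sample data.

```python
def build_index(data_lines, key_cols=(0, 3), value_col=10):
    """Map ($1, $4) -> $11 for each row, skipping blank or short rows."""
    needed = max(max(key_cols), value_col)  # highest column index we read
    index = {}
    for line in data_lines:
        fields = line.split()
        if len(fields) <= needed:
            continue  # blank line or too few columns: ignore it
        key = tuple(fields[i] for i in key_cols)
        index[key] = fields[value_col]  # a later duplicate key overwrites an earlier one
    return index


def lookup(index, list_lines):
    """Return ($1, $2, value) for each list entry that is present in the index."""
    results = []
    for line in list_lines:
        key = tuple(line.split())
        if key in index:
            results.append(key + (index[key],))
    return results


# Two rows from the sample data.txt plus one made-up truncated row.
data = [
    "2aas f dnk_ectha q9xt6 12-208 192.0 250.0 198.0 104.00 78.80 99.0 108.0 97 5",
    "truncated row with too few columns",
    "2aas h pyp_drome p0877 12-208 192.0 250.0 130.0 102.00 78.80 67.0 103.0 97 9",
]
wanted = ["2aas p0877", "asds k9876"]

for row in lookup(build_index(data), wanted):
    print(" ".join(row))  # prints: 2aas p0877 67.0
```

Both functions accept any iterable of lines, so open file objects for data.txt and list.txt can be passed in place of the in-memory lists.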

Or, an awk solution:

    [me@home]$ awk 'FILENAME==ARGV[1] {pair[$1" "$4] = $11; next} ($1" "$2 in pair) {printf("%s\t%s\t%s\n", $1, $2, pair[$1" "$2])}' data.txt list.txt
    2aas	p0877	67.0
    asds	k9876	87.0
    651a	kl098	98.0

