Tuesday, 15 February 2011

regex - Replacing non-ASCII characters or specific ASCII characters with a space in a file




I want to replace non-ASCII characters, or specific ASCII characters, with a space in a file using shell scripting, sed, or Perl.

First, I want to replace non-ASCII characters with a space in the file. I know I can do that using the command below:

perl -pi -e 's/[[:^ascii:]]/ /g'

There are also ASCII characters that the downstream system cannot accept, and I want to replace those characters with a space. For example, the ASCII character with value 0x19 (EM, End of Medium) is not accepted downstream, so I want to replace it with a space.

I also know a range of ASCII characters that the downstream system has problems with, and I want to replace each of them with a space.

Can someone help me accomplish this?

Note: the Perl version on our system is 5.8.4. I want to do this on a Solaris 10 machine.

Thanks.

You can add them all to the character class in the regex. For example, to replace non-ASCII characters, plus \031 (octal for 0x19), plus (say) characters in the range a-e, you can write:

perl -pi -e 's/[[:^ascii:]\031a-e]/ /g'

Edited to add:

For your new requirement:

I have to replace non-ASCII characters (decimal 128 and above), with the exception of decimal 145-148 and decimal 150-151, with a space.

You can write:

perl -pi -e 's/[^[:ascii:]\x91-\x94\x96\x97]/ /g; s/\031/ /g;'

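(Note the change from [:^ascii:], "non-ASCII characters", to [:ascii:], "ASCII characters", and the change from [...], "any of the characters ...", to [^...], "any character other than ...". The hex values \x91-\x94, \x96, and \x97 are decimal 145-148, 150, and 151.)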

regex perl unix sed solaris
