Java & Regex: Matching a substring that is not preceded by specific characters
I know this is one of those questions that has been asked and answered hundreds of times over, but I'm having a hard time adapting the other solutions to my needs.
In my Java application I have a method for censoring bad words in chat messages. It works for most of my words, but there is one particular (and popular) curse word that I can't seem to get rid of. The word is "faen" (which is modern slang for "satan" in the language in question).
I am using the pattern "fa+e+n" to match multiple a's and e's, and that works; however, in this language the word for "that couch" or "that sofa" is "sofaen". I've tried a lot of different approaches, using variations of [^so] and (?!=so), but so far I haven't been able to find a way to match the one and not the other.
The real goal here is to be able to match the bad words regardless of the number of vowels, and regardless of any non-letters in between the components of the word.
Here are a few examples of what I'm trying to do:

"string containing faen" should match
"string containing sofaen" should not match
"non-letter-censored string f-a@a-e.n" should match
"non-letter-censored string sof-a@a-e.n" should not match
Any tips to set me off in the right direction on this?
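You want something like

\bf[^\s]*a[^\s]*e[^\s]*n\b

Note that this is the plain regular expression; if you want to use it in a Java string literal, you need to double the backslashes:

\\bf[^\\s]*a[^\\s]*e[^\\s]*n\\b

The leading \b requires a word boundary before the "f", so "sofaen" is skipped, and [^\s]* allows any run of non-whitespace characters between the letters.

Note that this isn't perfect (it will also match any other word that starts with "f", ends with "n" and has an "a" followed later by an "e" in between), but it does handle the situations that you have suggested.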
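For comparison, here is a minimal Java sketch of a slightly stricter variant of the same idea. It is not the code from the question or the answer: the class name CensorSketch and the helpers forWord and censor are made up for illustration, and it assumes each bad word consists of plain letters only. Instead of \b it uses a negative lookbehind (?<!\p{L}) so the word may not be preceded by a letter, lets every letter repeat, and only allows non-letter, non-whitespace characters between the letters.

```java
import java.util.regex.Pattern;

// Hypothetical sketch, not the original poster's censoring method.
final class CensorSketch {

    // Characters allowed between letters: anything that is neither a letter nor whitespace.
    private static final String SEP = "[^\\p{L}\\s]*";

    // Builds a pattern for one bad word. Assumes the word contains letters only,
    // so no regex escaping is done here.
    static Pattern forWord(String word) {
        StringBuilder sb = new StringBuilder("(?<!\\p{L})"); // not preceded by a letter
        int last = word.length() - 1;
        for (int i = 0; i < last; i++) {
            // one or more copies of the letter, each optionally followed by separators
            sb.append("(?:").append(word.charAt(i)).append(SEP).append(")+");
        }
        sb.append(word.charAt(last)).append('+'); // final letter, repeated
        sb.append("(?!\\p{L})");                  // not followed by a letter
        return Pattern.compile(sb.toString(),
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
    }

    // Replaces every occurrence of the bad word with a fixed mask.
    static String censor(String message, String word) {
        return forWord(word).matcher(message).replaceAll("****");
    }

    public static void main(String[] args) {
        Pattern p = forWord("faen");
        System.out.println(p.matcher("string containing faen").find());                  // true
        System.out.println(p.matcher("string containing sofaen").find());                // false
        System.out.println(p.matcher("non-letter-censored string f-a@a-e.n").find());    // true
        System.out.println(p.matcher("non-letter-censored string sof-a@a-e.n").find());  // false
        System.out.println(censor("string containing faen", "faen"));
    }
}
```

The lookbehind is what separates "faen" from "sofaen": the "f" in "sofaen" is preceded by the letter "o", so the match is rejected there. The trade-off is that compound words ending in the curse word are skipped as well, so this is only a starting point.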
java regex