Thursday, 15 March 2012

algorithm - Finding text duplicates - easy to implement -




I am looking for nice, easy-to-implement algorithms to find duplicate texts in a CMS. I already save each text in a column with the whitespace removed and all characters lowercased, so I can find duplicates that differ only in the amount of spaces or in letter case, but that is not enough.
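For reference, the normalization described above could look roughly like the sketch below. The function name `normalize` is only illustrative and is not taken from the original post.

```python
def normalize(text):
    """Drop all whitespace and lowercase the rest, as described in the question."""
    # str.split() with no arguments splits on any run of whitespace,
    # so joining the pieces removes spaces, tabs and newlines alike.
    return "".join(text.split()).lower()


# Texts that differ only in spacing or letter case get the same key.
print(normalize("Hello   World\n") == normalize("hello world"))
```

Storing this key in an indexed column makes the exact-duplicate check a simple equality lookup, but a single changed character still produces a different key.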

How can I handle the situation where two texts differ by only a few characters and I want them to be recognized as duplicates?

The simple solution to this problem is to use a Soundex check. Convert each word to its Soundex equivalent, eliminate the little words, and if the resulting records are the same, you have a match. It is crude, but effective.
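A minimal sketch of that idea, assuming the standard American Soundex rules and assuming that "little words" means words shorter than four letters. The names `soundex`, `soundex_key` and `min_len` are made up for this example.

```python
import re

# Letter-to-digit table of American Soundex.
_CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for ch in letters:
        _CODES[ch] = str(digit)


def soundex(word):
    """Return the 4-character Soundex code of a word ('' if it has no letters)."""
    letters = [c for c in word.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (result + "000")[:4]


def soundex_key(text, min_len=4):
    """Fingerprint a text: Soundex of every word with at least min_len letters."""
    words = re.findall(r"[a-z]+", text.lower())
    return " ".join(soundex(w) for w in words if len(w) >= min_len)


a = "The quick brown fox jumped over the lazy dog"
b = "The quik browne fox jumpd over a lazy dog"
print(soundex_key(a) == soundex_key(b))
```

Soundex was designed for English surnames, so this is likely to work best on English text. It tolerates misspellings inside a word but not inserted or missing long words, and unrelated words can share a code, so matches are probably worth confirming with a stricter comparison.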

algorithm text duplicates
