c# - Near perfect hash for guid tostring as dictionary key -
what im trying solve: using guid string key dictionary(string, someobject) , want perfect hashing on key...
not sure if im missing something... when run next test dictionary constructor passing in size allocation +- 10 collisions each run. when pass in iequalitycomparer calling gethashcode on string have test passing good! multiple runs using x = 10 iterations in cases , y upto million! thought dictionary adjusting hashing function when dealing strings? don't have reflector on machine :( cant check tonight... if comment out alternating dictionary initialisations youll see... test runs relatively quick on i7.
[testmethod] public void nearperfecthashingforguidstrings() { int y = 100000; int collisions = 0; //dictionary<string, string> list = new dictionary<string, string>(y, new guidstringhashing()); dictionary<string, string> list = new dictionary<string, string>(y); (int x = 0; x < 5; x++) { enumerable.range(1, y).tolist().foreach((h) => { list[guid.newguid().tostring()] = h.tostring(); }); var hashduplicates = list.keys.groupby(h => h.gethashcode()) .where(group => group.count() > 1) .select(group => group.key).tolist(); hashduplicates.tolist().foreach(v => debug.writeline( x + "--- " + v)); collisions += hashduplicates.count(); list.clear(); } assert.areequal(0, collisions); } public class guidstringhashing : iequalitycomparer<string> { public bool equals(string x, string y) { homecoming gethashcode(x) == gethashcode(y); } public int gethashcode(string obj) { homecoming obj.gethashcode(); } }
your test broken.
because equality comparer incorrectly reports 2 different guids happen have same hash equal, dictionary never stores collisions in first place.
due pigeonhole principle, fundamentally impossible create 32-bit perfect hash more 232 items.
c# .net
No comments:
Post a Comment