Sorting a 20GB file with one string per line -
in question 11.5 of gayle laakman's book, cracking technical interview,
"imagine have 20gb file 1 string per line. explain how sort file"
my initial reaction solution proposed - splitting file smaller chunks (megabytes) reading in x mb's of data, sorting it, , writing disk. , @ end, merge files.
i decided not pursue approach because final merge involve holding on info in main memory - , we're assuming that's not possible. if that's case, how solution hold?
my other approach based on assumption have near unlimited disk space, or @ to the lowest degree plenty hold 2x info have. can read in x mb's of info , generate hash keys them - each key corresponding line in file. we'll go on doing until values have been hashed. have write values of file original file.
let me know think.
http://en.wikipedia.org/wiki/external_sorting gives more detailed explanation on how external sorting works. addresses concern of having bring entire 20gb memory explaining how perform final merge of n sorted chunks reading in chunks of sorted chunks opposed reading in sorted chunks @ same time.
sorting
No comments:
Post a Comment