java - How does HDFS MapReduce actually work in fully distributed mode?
I'm a bit confused about how HDFS MapReduce works in fully distributed mode.
Suppose I'm running a word count program, and I supply the paths of 'hdfs-site' & 'core-site'.
How are things then carried out?
Is the program distributed to each node, or what happens?
Yes, the program is distributed, but it would be wrong to say it is distributed to every node. Rather, Hadoop checks the data you are working with, splits it into smaller parts (under the constraints of your configuration), and moves your code to the nodes in HDFS that hold these parts (I assume you have a DataNode and a TaskTracker running on those nodes). First the map part is executed on these nodes, which produces intermediate data. That data is stored on the nodes, and as the mapping finishes, the second part of the job, the reduce phase, starts on the nodes.
The reducers are started on nodes (again, you configure how many of them there are), fetch the data from the mappers, aggregate it, and send the output to HDFS.
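To make the flow concrete, here is a minimal single-process sketch of the map → shuffle → reduce sequence described above, using word count. This is plain Java with no Hadoop dependencies; the class and method names (`WordCountSim`, `map`, `shuffle`, `reduce`, `run`) are illustrative and not part of the Hadoop API. In a real cluster, each call to `map` would run as a map task on the node holding that split, and the shuffle would move data between machines.

```java
import java.util.*;

// A minimal, single-process simulation of the MapReduce flow:
// split the input, map each split to (word, 1) pairs, shuffle by key, then reduce.
// (Names are illustrative, not the Hadoop API.)
public class WordCountSim {

    // Map phase: one input "split" -> list of (word, 1) pairs
    static List<Map.Entry<String, Integer>> map(String split) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : split.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle: group all mapper output by key, as the framework does between phases
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the values for one key
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int v : values) sum += v;
        return sum;
    }

    // Drive the whole "job": map each split, shuffle, then reduce per key
    public static Map<String, Integer> run(String... splits) {
        List<Map.Entry<String, Integer>> mapped = new ArrayList<>();
        for (String split : splits) {
            mapped.addAll(map(split));                      // one map task per split
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : shuffle(mapped).entrySet()) {
            counts.put(e.getKey(), reduce(e.getValue()));   // one reduce call per key
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(run("the quick brown fox", "the lazy dog"));
    }
}
```

The key point the sketch illustrates: mappers never talk to each other, and reducers only see mapper output grouped by key, which is what lets Hadoop place the map tasks on whichever nodes already hold the data splits.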
java hadoop mapreduce hdfs