Friday, 15 July 2011

hadoop - MapReduce using Hive on a Cassandra cluster




Hi, I am using the DataStax Enterprise Hadoop and Cassandra integration. I have configured 3 Cassandra nodes and 2 analytics nodes (on which Hive will run).

I am confused about one thing: if data is not present on the Hive (analytics) nodes but only on the Cassandra nodes, will it be skipped during MapReduce, or will MapReduce pull the data from the Cassandra nodes and then run? Please help.

So I have 4 machines (replication factor 3):

machine 1) Cassandra node | token value = 0 | data owned: 25%
machine 2) Cassandra node | token value = 2^127*.5 | data owned: 33%
machine 3) analytics node | token value = 2^127*.25 | data owned: 33%
machine 4) analytics node | token value = 2^127*.75 | data owned: 8%

Shouldn't each node be owning 25%? Also, I think the data is being replicated to all nodes, not to 3 nodes.
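The 25% expectation can be checked with a little arithmetic. The sketch below is only illustrative: it assumes the cluster uses RandomPartitioner (token space 0 to 2^127), treats all four machines as one ring, and counts primary ranges only, ignoring replicas. The function name `primary_ownership` is made up for this example and is not part of any Cassandra or DSE tool.

```python
from fractions import Fraction

# Assumption: RandomPartitioner, whose tokens range over 0 .. 2^127.
RING_SIZE = 2**127


def primary_ownership(tokens):
    """Return {token: share of the ring} counting primary ranges only.

    Each node owns the range from the previous token (exclusive)
    up to its own token (inclusive), wrapping around the ring.
    """
    ordered = sorted(tokens)
    shares = {}
    for i, tok in enumerate(ordered):
        prev = ordered[i - 1]  # index -1 wraps to the last token when i == 0
        span = (tok - prev) % RING_SIZE
        if span == 0:  # a single node owns the whole ring
            span = RING_SIZE
        shares[tok] = Fraction(span, RING_SIZE)
    return shares


# Token values as listed in the question.
tokens = {
    "machine 1": 0,
    "machine 2": RING_SIZE // 2,
    "machine 3": RING_SIZE // 4,
    "machine 4": 3 * RING_SIZE // 4,
}

shares = primary_ownership(tokens.values())
for name, tok in tokens.items():
    print(name, float(shares[tok]))
```

Under those assumptions every machine comes out at 0.25, so evenly spaced tokens on a single ring do give 25% each. The sketch does not reproduce how the ownership figures quoted above were calculated.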

DSE will make sure that a full copy of the dataset is replicated to whichever set of nodes you designate as analytics, so it's a non-issue. If enough analytics nodes fail, it may have to go to a non-analytics node to fetch data, but you'd be better advised to bring the analytics nodes back online.

hadoop cassandra hive datastax-enterprise
