Thursday, 15 August 2013

User-defined functions - storing a file in an already occupied location in Pig



It seems Pig prevents us from reusing an output directory. In that case, I want to write a Pig UDF that takes a filename as a parameter, opens the file within the UDF, and appends the contents to the existing file at that location. Is this possible?

Thanks in advance.

It may be possible, but I don't know that it's advisable. Why not just have a new output directory? For example, if you want all your results in /path/to/results, store the output of the first run in /path/to/results/001, the next run in /path/to/results/002, and so on. This way you can easily identify bad data from failed jobs, and if you want all of it together, you can run hadoop fs -cat /path/to/results/*/*.
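The numbered-directory scheme can be sketched as a small helper that picks the next run directory. This is only an illustration, not part of Pig: the function name next_run_dir and its parameters are hypothetical, and it assumes you already have the names of the existing children of the results directory (for example from a directory listing).

```python
import re


def next_run_dir(base, existing_names, width=3):
    """Return the path of the next numbered run directory under `base`.

    `existing_names` is the list of child names already present under
    `base` (hypothetical input; obtain it however suits your setup).
    """
    # Keep only purely numeric names such as "001"; ignore anything else.
    numbers = [int(name) for name in existing_names if re.fullmatch(r"\d+", name)]
    # Start at 1 when no numbered runs exist yet.
    next_number = max(numbers, default=0) + 1
    # Zero-pad so the directories sort in run order.
    return f"{base.rstrip('/')}/{next_number:0{width}d}"
```

The resulting path could then be handed to the script as a parameter, for example pig -param output=/path/to/results/003 script.pig, with the script storing into '$output'.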

If you don't want to append but instead want to replace the existing contents, you can use Pig's rmf shell command:

%declare output /path/to/results
rmf $output
STORE results INTO '$output';

user-defined-functions apache-pig
