User-defined functions: storing a file in an already occupied location in Pig
It seems Pig prevents reusing an output directory. In that case, I want to write a Pig UDF that takes a filename as a parameter, opens the file within the UDF, and appends the contents to the existing file at that location. Is this possible?
Thanks in advance.
It may be possible, but I don't know that it's advisable. Why not have a new output directory for each run? For example, if you want your results in /path/to/results, store the output of the first run in /path/to/results/001, the next run in /path/to/results/002, and so on. This way you can identify bad data from failed jobs, and if you want all of it together, you can just do hdfs dfs -cat /path/to/results/*/*.
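As a rough sketch of the per-run directory idea, the run number can be passed in as a Pig parameter. The parameter name run_id, the input path, and the tab delimiter are assumptions made for illustration, not something from the question:

```pig
-- Fall back to 001 when no run_id parameter is supplied (hypothetical parameter name).
%default run_id 001

-- Hypothetical input; replace with whatever produces your results relation.
results = LOAD '/path/to/input' USING PigStorage('\t');

-- Each run writes to its own subdirectory, so no existing output is touched.
STORE results INTO '/path/to/results/$run_id';
```

You would then start each run with something like pig -param run_id=002 myscript.pig, where myscript.pig is a placeholder file name. Picking the next number is left to whatever launches the job.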
If you don't want to append but instead want to replace the existing contents, you can use Pig's rmf shell command, which removes the path and does not fail if it doesn't exist:

    %declare output /path/to/results
    rmf $output
    store results into '$output';
user-defined-functions apache-pig