Storing a file in an already occupied location in Pig
It seems Pig prevents reusing an output directory. In that case, I want to write a Pig UDF that takes a filename as a parameter, opens the file within the UDF, and appends the contents to the existing file at that location. Is this possible?
Thanks in advance.
It may be possible, but I don't know that it's advisable. Why not just use a new output directory? For example, if you want all your results in /path/to/results, store the output of the first run in /path/to/results/001, the next run in /path/to/results/002, and so on. This way you can easily identify bad data from failed jobs, and if you want all of it together, you can just run hadoop fs -cat /path/to/results/*/*.
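As a rough sketch of the per-run directory idea, the run number can be passed in through Pig's parameter substitution. The parameter name run_id, the input path, and the schema below are placeholders chosen for illustration, not part of the original answer:

```pig
-- run_id is a hypothetical parameter; 001 is used when none is supplied
%default run_id 001

-- placeholder input, only here so the script has something to store
results = LOAD '/path/to/input' AS (line:chararray);

-- each run writes to its own subdirectory, so nothing is overwritten
STORE results INTO '/path/to/results/$run_id';
```

Assuming the script is saved as script.pig, the next run could be started with pig -param run_id=002 script.pig. Choosing the next free number is left to whatever launches the job.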
If you don't want to append but instead want to replace the existing contents, you can use Pig's rmf shell command:
    %declare output /path/to/results
    rmf $output
    store results into '$output';