  1. CDAP
  2. CDAP-7483

DynamicPartitioner does not remove files upon failure

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.5.1, 3.5.0, 3.4.0
    • Fix Version/s: 4.0.0
    • Component/s: Datasets, MapReduce
    • Labels:
    • Release Notes:
      Fixes an issue where a MapReduce job using DynamicPartitioner would leave behind output files if it failed.

      Description

      A MapReduce job removes its output files if it fails. However, this cleanup does not happen when the job uses DynamicPartitioner.
      More investigation is needed to determine exactly which files remain in the PartitionedFileSet's directory.

      Likely, DynamicPartitioningOutputCommitter needs to override the abortJob method to clean up the files that were written by the job.
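
The proposed cleanup can be illustrated with a minimal, self-contained sketch. It does not use the CDAP or Hadoop APIs: the class AbortCleanupSketch, its write method, and the local-filesystem layout are hypothetical stand-ins, and only the name abortJob mirrors the method mentioned above. The assumption is that the committer can record which files and partition directories the job itself created, so that an abort removes exactly those and leaves existing partitions alone.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical stand-in for a committer that writes to partitions chosen at
 * runtime. Not the actual DynamicPartitioningOutputCommitter.
 */
public class AbortCleanupSketch {
  private final Path baseDir;
  // Files written by this job, in write order.
  private final List<Path> writtenFiles = new ArrayList<>();
  // Partition directories that did not exist before this job created them.
  private final List<Path> createdDirs = new ArrayList<>();

  public AbortCleanupSketch(Path baseDir) {
    this.baseDir = baseDir;
  }

  /**
   * Simulates a task writing one output file into a partition.
   * Assumes a single-level partition key (no nested directories).
   */
  public Path write(String partitionKey, String fileName, String content) throws IOException {
    Path dir = baseDir.resolve(partitionKey);
    if (!Files.exists(dir)) {
      Files.createDirectory(dir);
      createdDirs.add(dir);
    }
    Path file = dir.resolve(fileName);
    Files.write(file, content.getBytes(StandardCharsets.UTF_8));
    writtenFiles.add(file);
    return file;
  }

  /**
   * Analogous to abortJob: delete the files this job wrote, then the
   * partition directories it created. Pre-existing content is not touched.
   */
  public void abortJob() throws IOException {
    for (Path file : writtenFiles) {
      Files.deleteIfExists(file);
    }
    for (int i = createdDirs.size() - 1; i >= 0; i--) {
      Files.deleteIfExists(createdDirs.get(i));
    }
    writtenFiles.clear();
    createdDirs.clear();
  }
}
```

In a real fix, the tracking would have to survive across tasks (for example by deriving the written paths from the job's output location) rather than living in one in-memory list; the sketch only shows the delete-what-you-wrote shape of the abort path.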

      Relevant code:
      https://github.com/caskdata/cdap/blob/release/3.4/cdap-data-fabric/src/main/java/co/cask/cdap/data2/dataset2/lib/file/FileSetDataset.java#L263-L285

              People

              • Assignee: Andreas Neumann (andreas)
              • Reporter: Ali Anwar (ali.anwar)