CDAP / CDAP-1227

Partitioned file set should add new partitions in the output committer

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.8.0, 2.7.1
    • Fix Version/s: 3.0.0
    • Component/s: App Fabric, Datasets

      Description

      Currently, when a Map/Reduce program writes to a TPFS (time-partitioned file set), the dataset delegates writing to the underlying output format. That output format writes the correct file in the correct format, but it does not register the file as a new partition of the dataset. As a result, the MR program itself has to call addPartition() in its onFinish() method.

      The correct way to do this is to implement a wrapper output format that delegates to the underlying output format but overrides the output committer so that it adds the partition when the job commits. This requires injecting the dataset framework into the output committer, and will therefore involve changes in the Map/Reduce framework as well as in the TPFS dataset.
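
      A minimal sketch of the delegating-committer idea described above. It deliberately does not use the Hadoop or CDAP APIs: Committer, PartitionRegistry, PartitionAddingCommitter and run() are hypothetical stand-ins introduced only for illustration, and the partition time and path are made-up values. The point is the ordering: the wrapped committer makes the files visible first, the partition is registered only after that succeeds, and nothing is registered when the job is aborted.

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionCommitSketch {

    /** Hypothetical stand-in for the job-level hooks of an output committer. */
    interface Committer {
        void commitJob() throws Exception;
        void abortJob() throws Exception;
    }

    /** Hypothetical stand-in for the dataset's partition bookkeeping. */
    interface PartitionRegistry {
        void addPartition(long time, String path);
    }

    /** Wraps another committer and registers the partition after a successful commit. */
    static final class PartitionAddingCommitter implements Committer {
        private final Committer delegate;
        private final PartitionRegistry registry;
        private final long partitionTime;
        private final String outputPath;

        PartitionAddingCommitter(Committer delegate, PartitionRegistry registry,
                                 long partitionTime, String outputPath) {
            this.delegate = delegate;
            this.registry = registry;
            this.partitionTime = partitionTime;
            this.outputPath = outputPath;
        }

        @Override
        public void commitJob() throws Exception {
            // Let the underlying committer finalize the files first.
            delegate.commitJob();
            // Only then register the output as a new partition.
            registry.addPartition(partitionTime, outputPath);
        }

        @Override
        public void abortJob() throws Exception {
            // On failure, only clean up; no partition is registered.
            delegate.abortJob();
        }
    }

    /** Runs the commit or abort path and returns the events in the order they occurred. */
    static List<String> run(boolean jobSucceeded) throws Exception {
        List<String> events = new ArrayList<>();
        Committer fileCommitter = new Committer() {
            @Override public void commitJob() { events.add("files committed"); }
            @Override public void abortJob() { events.add("files cleaned up"); }
        };
        PartitionRegistry registry =
            (time, path) -> events.add("partition added: " + path);
        Committer committer =
            new PartitionAddingCommitter(fileCommitter, registry, 0L, "out/part-0");
        if (jobSucceeded) {
            committer.commitJob();
        } else {
            committer.abortJob();
        }
        return events;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(true));   // [files committed, partition added: out/part-0]
        System.out.println(run(false));  // [files cleaned up]
    }
}
```

      With this arrangement the MR program no longer needs its own addPartition() call in onFinish(). In the real fix the registry would be the dataset itself, which is why the dataset framework has to be available inside the committer.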


              People

              • Assignee: Andreas Neumann (andreas)
              • Reporter: Andreas Neumann (andreas)
