  1. CDAP
  2. CDAP-7497

MapReduce's beforeSubmit writes are committed even in some failure cases

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.0.0, 3.6.0, 3.5.1
    • Fix Version/s: None
    • Component/s: Datasets, MapReduce
    • Labels: None

      Description

      Consider a MapReduce program that writes to datasets in its beforeSubmit method.
      In MapReduceRuntimeService, beforeSubmit executes in its own transaction, which commits before the MapReduce job is even submitted.
      If an exception occurs after that commit but before the Hadoop framework executes the job, the job never runs, yet the data modified by beforeSubmit remains in place.
      This seems wrong: a MapReduce program's writes should not be committed if the program fails.

      TL;DR: dataset writes made in a MapReduce's beforeSubmit can be persisted and visible to others even if a failure prevents the job from running.
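
The ordering described above can be illustrated with a small, self-contained sketch. This is not CDAP code: ToyDataset, runAsReported and submitJob are hypothetical names invented for illustration, and the toy dataset only mimics "buffer writes, then commit" semantics. It is meant to show why a rollback after the failure has nothing left to undo, assuming the beforeSubmit transaction has already committed.

```java
import java.util.HashMap;
import java.util.Map;

class BeforeSubmitSketch {

    /** Hypothetical stand-in for a transactional dataset: writes are buffered until commit. */
    static final class ToyDataset {
        private final Map<String, String> committed = new HashMap<>();
        private final Map<String, String> pending = new HashMap<>();

        void write(String key, String value) { pending.put(key, value); }

        void commit() {
            committed.putAll(pending);
            pending.clear();
        }

        void rollback() { pending.clear(); }

        /** What other programs would see: only committed data. */
        String readCommitted(String key) { return committed.get(key); }
    }

    /** Hypothetical job submission that can fail before the job ever runs. */
    static void submitJob(boolean fail) {
        if (fail) {
            throw new IllegalStateException("failure before job submission");
        }
    }

    /**
     * Mirrors the ordering described in this issue (an assumption-laden model,
     * not the actual MapReduceRuntimeService logic). Returns true if the job ran.
     */
    static boolean runAsReported(ToyDataset dataset, boolean failBeforeSubmit) {
        // beforeSubmit performs its writes in its own transaction...
        dataset.write("status", "prepared");
        // ...and that transaction commits before the job is submitted.
        dataset.commit();
        try {
            submitJob(failBeforeSubmit);
            return true;
        } catch (IllegalStateException e) {
            // The job never ran, but the earlier commit is already final:
            // this rollback only clears pending writes, and there are none.
            dataset.rollback();
            return false;
        }
    }

    public static void main(String[] args) {
        ToyDataset dataset = new ToyDataset();
        boolean ran = runAsReported(dataset, true);
        System.out.println("job ran: " + ran);
        System.out.println("status visible to others: " + dataset.readCommitted("status"));
    }
}
```

In this model, when failBeforeSubmit is true the job does not run, yet readCommitted("status") still returns the value written by the beforeSubmit step, which is the behavior this issue reports.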

              People

              • Assignee: Nitin Motgi
              • Reporter: Ali Anwar