CDAP-7476: Refactor MapReduceRuntimeService to configure inputs and outputs during initialize()

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.0.0
    • Fix Version/s: 4.0.0
    • Component/s: Datasets, MapReduce
    • Labels:
      None
    • Release Notes:
      Improves how MapReduce configures its inputs, such that failures surface immediately.

      Description

      Currently, when the program's initialize() method calls addInput() or addOutput(), we only remember what the input or output is; after initialize() returns, we call the corresponding dataset to determine the configuration for the inputs and outputs. But if initialize() uses explicit transactions, this second step happens in a different transaction than the one in which addInput() or addOutput() was invoked. That can lead to unexpected behavior.

      So far, this could not be done much differently, because we also allowed setInput/OutputDataset() inside configure(), which does not have a transaction. These methods are removed by CDAP-7475, allowing for this refactoring in 4.0.
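
The difference between resolving an input after initialize() returns and resolving it at the moment addInput() is called can be illustrated with a toy model. This is only a sketch under stated assumptions: the class InputConfigSketch, the currentTx counter, and the methods describeInput(), deferred(), and eager() are all hypothetical stand-ins invented for illustration, not CDAP APIs, and the real MapReduceRuntimeService is considerably more involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class InputConfigSketch {
    // Hypothetical stand-in for "the current transaction"; 0 means none is active.
    static int currentTx = 0;

    // Hypothetical dataset call: the returned configuration depends on
    // which transaction is active when the dataset is consulted.
    static String describeInput(String dataset) {
        if (currentTx == 0) {
            throw new IllegalStateException("no transaction for " + dataset);
        }
        return dataset + "@tx" + currentTx;
    }

    // Deferred model: addInput() only remembers the name; the dataset is
    // consulted later, in a different transaction than the one in initialize().
    static List<String> deferred(List<String> names) {
        currentTx = 1;                                  // explicit transaction inside initialize()
        List<String> remembered = new ArrayList<>(names);
        currentTx = 2;                                  // separate transaction after initialize() returns
        List<String> configs = new ArrayList<>();
        for (String name : remembered) {
            configs.add(describeInput(name));
        }
        currentTx = 0;
        return configs;
    }

    // Eager model: the dataset is consulted at the moment addInput() is
    // called, inside the same transaction, so any failure surfaces there too.
    static List<String> eager(List<String> names) {
        currentTx = 1;                                  // explicit transaction inside initialize()
        List<String> configs = new ArrayList<>();
        for (String name : names) {
            configs.add(describeInput(name));
        }
        currentTx = 0;
        return configs;
    }

    public static void main(String[] args) {
        System.out.println(deferred(Arrays.asList("purchases"))); // [purchases@tx2]
        System.out.println(eager(Arrays.asList("purchases")));    // [purchases@tx1]
    }
}
```

In the deferred model the configuration is computed under transaction 2 although the input was added under transaction 1; in the eager model both happen under transaction 1, which is the behavior this issue asks for.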

              People

              • Assignee: Andreas Neumann (andreas)
              • Reporter: Andreas Neumann (andreas)