CDAP-3980

Multiple input datasets for MapReduce

    Details

    • Release Notes:
      MapReduce supports multiple configured inputs.

      Description

      A common use case for MapReduce is performing a join. When the data to be joined is large, the mappers can emit the join key as the map output key, so that all records sharing a key arrive at the same reducer, which then performs the join.

      A use case like this requires a way to use multiple datasets as input to a single MapReduce job. Currently, only one dataset can be used as input, because the context offers only setInput() methods.
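      The reduce-side join described above, fed by two separate inputs, can be sketched in plain Java. This is only an illustrative in-memory model, not CDAP or Hadoop code: the class name, the "customers" and "purchases" inputs, and their comma-separated record layout are all assumptions made for the example. Each input gets its own map function that tags its output with the source it came from, and the reducer uses that tag to pair up records sharing a join key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, in-memory model of a reduce-side join over two inputs.
public class ReduceSideJoinSketch {

    // A map-output value tagged with the input it originated from.
    static final class Tagged {
        final String source;
        final String value;

        Tagged(String source, String value) {
            this.source = source;
            this.value = value;
        }
    }

    // Shared map logic: split "joinKey,payload" and emit the payload under the join key.
    static void map(String source, List<String> lines, Map<String, List<Tagged>> shuffle) {
        for (String line : lines) {
            String[] fields = line.split(",", 2);
            shuffle.computeIfAbsent(fields[0], k -> new ArrayList<>())
                   .add(new Tagged(source, fields[1]));
        }
    }

    // Reduce phase: for each join key, pair every customer name with every item (inner join).
    static List<String> reduce(Map<String, List<Tagged>> shuffle) {
        List<String> output = new ArrayList<>();
        for (Map.Entry<String, List<Tagged>> entry : shuffle.entrySet()) {
            List<String> names = new ArrayList<>();
            List<String> items = new ArrayList<>();
            for (Tagged tagged : entry.getValue()) {
                (tagged.source.equals("customers") ? names : items).add(tagged.value);
            }
            for (String name : names) {
                for (String item : items) {
                    output.add(entry.getKey() + "," + name + "," + item);
                }
            }
        }
        return output;
    }

    // Runs both "mappers" over their own input, then the reducer over the grouped output.
    static List<String> join(List<String> customers, List<String> purchases) {
        // TreeMap stands in for the shuffle: it groups values by key and sorts the keys.
        Map<String, List<Tagged>> shuffle = new TreeMap<>();
        map("customers", customers, shuffle);   // lines of "customerId,name"
        map("purchases", purchases, shuffle);   // lines of "customerId,item"
        return reduce(shuffle);
    }

    public static void main(String[] args) {
        List<String> customers = Arrays.asList("1,alice", "2,bob");
        List<String> purchases = Arrays.asList("1,book", "1,pen", "3,lamp");
        for (String row : join(customers, purchases)) {
            System.out.println(row);
        }
    }
}
```

      Because this models an inner join, keys that appear in only one of the two inputs (customer 2 and purchase 3 in the sample data) produce no output rows. In a real job the two map functions would be separate mappers, each bound to its own input dataset, which is exactly what a single-input API cannot express.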

              People

              • Assignee: Ali Anwar (ali.anwar)
              • Reporter: Albert Shau (ashau)
