  1. CDAP
  2. CDAP-12570

MapReduce programs emit too many metrics


    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.3.0
    • Fix Version/s: 4.3.1
    • Component/s: MapReduce, Metrics
    • Labels:
    • Release Notes:
      Added an option to enable or disable emitting program metrics, and an option to include or skip task-level information in the metrics context. These options can be scoped at the program and program-type level, similar to setting system resources with scoping.
    • Rank:



      • Every MapReduce task emits metrics to TMS. Each metric uses tags that include the task id, which becomes part of the row key in the metrics table. If a job has thousands of tasks, that translates into tens of thousands of writes in every iteration of the metrics processor.
      • The MapReduce driver (MRRuntimeService) periodically fetches the task report and the job counters and emits metrics for all tasks. This can translate into tens of thousands of metrics in a single message to TMS, which in turn can exceed the message size limit (10MB by default).
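
To make the scale concrete, the following sketch counts distinct tag contexts with and without the task id. The tag names (`program`, `task_type`, `task_id`), the job size, and the number of metrics per task are illustrative assumptions, not CDAP's actual metrics context.

```python
from itertools import product


def tag_contexts(num_tasks_per_type, include_task_id):
    """Return the set of distinct tag contexts for a hypothetical job.

    Each context stands in for one row-key prefix in the metrics table.
    """
    contexts = set()
    for task_type, index in product(("m", "r"), range(num_tasks_per_type)):
        tags = {"program": "MyJob", "task_type": task_type}
        if include_task_id:
            # The task id makes every task's context unique.
            tags["task_id"] = f"{task_type}_{index:06d}"
        contexts.add(tuple(sorted(tags.items())))
    return contexts


METRICS_PER_TASK = 5  # assumed number of metrics each task emits

with_ids = tag_contexts(2000, include_task_id=True)
without_ids = tag_contexts(2000, include_task_id=False)

# Distinct (context, metric) pairs written per processing iteration.
print(len(with_ids) * METRICS_PER_TASK)
print(len(without_ids) * METRICS_PER_TASK)
```

Under these assumptions, dropping the task id collapses thousands of contexts into one per task type.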

      We should:

      • evaluate whether any of these metrics are redundant and remove them if so
      • evaluate whether any of these metrics are unneeded and remove them if so
      • provide a way (by config/preference) to disable task-level metrics and record only by task type instead, to avoid the many writes to the metrics table
      • provide a way (by config/preference) to limit metric emission from the driver
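
A minimal sketch of the third proposal, assuming a boolean flag that controls whether the task id is kept in the tags. The function name `build_emissions`, the flag `include_task_id`, and the counter tuple layout are hypothetical and do not reflect CDAP's actual API.

```python
from collections import defaultdict


def build_emissions(task_counters, include_task_id):
    """Group counter values by tag context before emitting.

    task_counters: iterable of (task_type, task_id, metric_name, value).
    With include_task_id=False, values are summed per task type, so the
    number of emitted metrics no longer grows with the number of tasks.
    """
    totals = defaultdict(int)
    for task_type, task_id, name, value in task_counters:
        tags = [("task_type", task_type)]
        if include_task_id:
            tags.append(("task_id", task_id))
        totals[(tuple(tags), name)] += value
    return dict(totals)


counters = [
    ("m", "m_000000", "records.in", 10),
    ("m", "m_000001", "records.in", 15),
    ("r", "r_000000", "records.out", 25),
]

per_task = build_emissions(counters, include_task_id=True)
per_type = build_emissions(counters, include_task_id=False)
print(len(per_task), len(per_type))
```

Summing is only appropriate for counter-style metrics; gauges would need a different aggregation, which this sketch does not cover.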





              • Assignee:
                shankar Shankar Selvam
                andreas Andreas Neumann


                • Created: