Uploaded image for project: 'CDAP'
  1. CDAP
  2. CDAP-12421

Scheduler is logging excessively when it fails to launch a program

    Details

    • Type: Bug
    • Status: IN PROGRESS
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 4.3.0
    • Fix Version/s: None
    • Component/s: Scheduler
    • Labels:
      None
    • Rank:
      1|i006tr:

      Description

      I have a Pipeline triggering another pipeline. When I start the upstream, for some reason, the scheduler is unable to trigger the downstream. When this happens, the SDK logs are filled up with the following exception:

      2017-08-21 15:45:28,614 - WARN  [constraint-checker-task:c.c.c.s.ConstraintCheckerService$ConstraintCheckerThread@275] - Failed to run program program:default.p2.-SNAPSHOT.workflow.DataPipelineWorkflow in schedule p2.default.p1_v4.default. Skip running this program.
      org.apache.tephra.TransactionFailureException: Exception raised from TxRunnable.run() co.cask.cdap.data2.transaction.Transactions$4@edfaadb
      	at co.cask.cdap.data2.transaction.Transactions$CacheBasedTransactional.finishExecute(Transactions.java:237) ~[na:na]
      	at co.cask.cdap.data2.transaction.Transactions$CacheBasedTransactional.execute(Transactions.java:223) ~[na:na]
      	at co.cask.cdap.data2.transaction.Transactions$5.executeInternal(Transactions.java:295) ~[na:na]
      	at co.cask.cdap.data2.transaction.Transactions$5.execute(Transactions.java:282) ~[na:na]
      	at co.cask.cdap.data2.transaction.Transactions.execute(Transactions.java:259) ~[na:na]
      	at co.cask.cdap.scheduler.ConstraintCheckerService$ConstraintCheckerThread.runReadyJobs(ConstraintCheckerService.java:264) [na:na]
      	at co.cask.cdap.scheduler.ConstraintCheckerService$ConstraintCheckerThread.checkJobQueue(ConstraintCheckerService.java:185) [na:na]
      	at co.cask.cdap.scheduler.ConstraintCheckerService$ConstraintCheckerThread.run(ConstraintCheckerService.java:158) [na:na]
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_72]
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_72]
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_72]
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_72]
      	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_72]
      Caused by: java.lang.NullPointerException: null
      	at java.util.HashMap.putMapEntries(HashMap.java:500) ~[na:1.8.0_72]
      	at java.util.HashMap.<init>(HashMap.java:489) ~[na:1.8.0_72]
      	at co.cask.cdap.api.schedule.ProgramStatusTriggerInfo.<init>(ProgramStatusTriggerInfo.java:61) ~[na:na]
      	at co.cask.cdap.internal.app.runtime.schedule.trigger.ProgramStatusTrigger$2.apply(ProgramStatusTrigger.java:85) ~[na:na]
      	at co.cask.cdap.internal.app.runtime.schedule.trigger.ProgramStatusTrigger$2.apply(ProgramStatusTrigger.java:76) ~[na:na]
      	at co.cask.cdap.internal.app.runtime.schedule.trigger.ProgramStatusTrigger.getTriggerSatisfiedResult(ProgramStatusTrigger.java:124) ~[na:na]
      	at co.cask.cdap.internal.app.runtime.schedule.trigger.ProgramStatusTrigger.getTriggerInfosAddArgumentOverrides(ProgramStatusTrigger.java:89) ~[na:na]
      	at co.cask.cdap.internal.app.runtime.schedule.ScheduleTaskRunner.getTriggeringScheduleInfo(ScheduleTaskRunner.java:107) ~[na:na]
      	at co.cask.cdap.internal.app.runtime.schedule.ScheduleTaskRunner.launch(ScheduleTaskRunner.java:96) ~[na:na]
      	at co.cask.cdap.scheduler.ConstraintCheckerService$ConstraintCheckerThread.runReadyJob(ConstraintCheckerService.java:303) [na:na]
      	at co.cask.cdap.scheduler.ConstraintCheckerService$ConstraintCheckerThread.access$600(ConstraintCheckerService.java:136) [na:na]
      	at co.cask.cdap.scheduler.ConstraintCheckerService$ConstraintCheckerThread$2.call(ConstraintCheckerService.java:267) ~[na:na]
      	at co.cask.cdap.scheduler.ConstraintCheckerService$ConstraintCheckerThread$2.call(ConstraintCheckerService.java:264) ~[na:na]
      	at co.cask.cdap.data2.transaction.Transactions$4.run(Transactions.java:262) ~[na:na]
      	at co.cask.cdap.data2.transaction.Transactions$CacheBasedTransactional.finishExecute(Transactions.java:235) ~[na:na]
      	... 12 common frames omitted
      2017-08-21 15:45:28,621 - WARN  [constraint-checker-task:c.c.c.s.ConstraintCheckerService$ConstraintCheckerThread@275] - Failed to run program program:default.p2.-SNAPSHOT.workflow.DataPipelineWorkflow in schedule p2.default.p1_v4.default. Skip running this program.
      org.apache.tephra.TransactionFailureException: Exception raised from TxRunnable.run() co.cask.cdap.data2.transaction.Transactions$4@7e045e2
      	at co.cask.cdap.data2.transaction.Transactions$CacheBasedTransactional.finishExecute(Transactions.java:237) ~[na:na]
      	at co.cask.cdap.data2.transaction.Transactions$CacheBasedTransactional.execute(Transactions.java:223) ~[na:na]
      	at co.cask.cdap.data2.transaction.Transactions$5.executeInternal(Transactions.java:295) ~[na:na]
      	at co.cask.cdap.data2.transaction.Transactions$5.execute(Transactions.java:282) ~[na:na]
      	at co.cask.cdap.data2.transaction.Transactions.execute(Transactions.java:259) ~[na:na]
      	at co.cask.cdap.scheduler.ConstraintCheckerService$ConstraintCheckerThread.runReadyJobs(ConstraintCheckerService.java:264) [na:na]
      	at co.cask.cdap.scheduler.ConstraintCheckerService$ConstraintCheckerThread.checkJobQueue(ConstraintCheckerService.java:185) [na:na]
      	at co.cask.cdap.scheduler.ConstraintCheckerService$ConstraintCheckerThread.run(ConstraintCheckerService.java:158) [na:na]
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_72]
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_72]
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_72]
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_72]
      	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_72]
      Caused by: java.lang.NullPointerException: null
      	at java.util.HashMap.putMapEntries(HashMap.java:500) ~[na:1.8.0_72]
      	at java.util.HashMap.<init>(HashMap.java:489) ~[na:1.8.0_72]
      	at co.cask.cdap.api.schedule.ProgramStatusTriggerInfo.<init>(ProgramStatusTriggerInfo.java:61) ~[na:na]
      	at co.cask.cdap.internal.app.runtime.schedule.trigger.ProgramStatusTrigger$2.apply(ProgramStatusTrigger.java:85) ~[na:na]
      	at co.cask.cdap.internal.app.runtime.schedule.trigger.ProgramStatusTrigger$2.apply(ProgramStatusTrigger.java:76) ~[na:na]
      	at co.cask.cdap.internal.app.runtime.schedule.trigger.ProgramStatusTrigger.getTriggerSatisfiedResult(ProgramStatusTrigger.java:124) ~[na:na]
      	at co.cask.cdap.internal.app.runtime.schedule.trigger.ProgramStatusTrigger.getTriggerInfosAddArgumentOverrides(ProgramStatusTrigger.java:89) ~[na:na]
      	at co.cask.cdap.internal.app.runtime.schedule.ScheduleTaskRunner.getTriggeringScheduleInfo(ScheduleTaskRunner.java:107) ~[na:na]
      	at co.cask.cdap.internal.app.runtime.schedule.ScheduleTaskRunner.launch(ScheduleTaskRunner.java:96) ~[na:na]
      	at co.cask.cdap.scheduler.ConstraintCheckerService$ConstraintCheckerThread.runReadyJob(ConstraintCheckerService.java:303) [na:na]
      	at co.cask.cdap.scheduler.ConstraintCheckerService$ConstraintCheckerThread.access$600(ConstraintCheckerService.java:136) [na:na]
      	at co.cask.cdap.scheduler.ConstraintCheckerService$ConstraintCheckerThread$2.call(ConstraintCheckerService.java:267) ~[na:na]
      	at co.cask.cdap.scheduler.ConstraintCheckerService$ConstraintCheckerThread$2.call(ConstraintCheckerService.java:264) ~[na:na]
      	at co.cask.cdap.data2.transaction.Transactions$4.run(Transactions.java:262) ~[na:na]
      	at co.cask.cdap.data2.transaction.Transactions$CacheBasedTransactional.finishExecute(Transactions.java:235) ~[na:na]
      	... 12 common frames omitted
      

      .

      When this happens, it is continuously emitting these logs as WARN. We should reduce verbosity - log exception stacktrace as DEBUG once every few mins?

        Attachments

          Activity

            People

            • Assignee:
              bhooshan Bhooshan Mogal
              Reporter:
              bhooshan Bhooshan Mogal
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated: