CDAP-13785

Error is returned when trying to stop a 'STARTING' run

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Pipelines
    • Labels:
      None
    • Rank:
      1|i00enz:

      Description

      After https://issues.cask.co/browse/CDAP-13620 we allow stopping a program run in 'PENDING' and 'STARTING' states. I got this error when trying to stop a 'STARTING' run:

      2018-07-17 17:08:41,916 - WARN  [runtime-monitor-3:c.c.c.i.a.r.m.RuntimeMonitor@162] - Failed to fetch monitoring data from program run program_run:default.simple_pipeline_v1.-SNAPSHOT.workflow.DataPipelineWorkflow.3f429431-8a1e-11e8-95cc-acde48001122. Will be retried in next iteration.
      co.cask.cdap.common.ServiceUnavailableException: Service 'runtime.monitor' is not available. Please wait until it is up and running.
      	at co.cask.cdap.internal.app.runtime.monitor.RuntimeMonitorClient.fetchMessages(RuntimeMonitorClient.java:104) ~[na:na]
      	at co.cask.cdap.internal.app.runtime.monitor.RuntimeMonitor.runTask(RuntimeMonitor.java:180) ~[na:na]
      	at co.cask.cdap.common.service.AbstractRetryableScheduledService.runOneIteration(AbstractRetryableScheduledService.java:143) ~[na:na]
      	at com.google.common.util.concurrent.AbstractScheduledService$1$1.run(AbstractScheduledService.java:170) [com.google.guava.guava-13.0.1.jar:na]
      	at com.google.common.util.concurrent.AbstractScheduledService$CustomScheduler$ReschedulableCallable.call(AbstractScheduledService.java:355) [com.google.guava.guava-13.0.1.jar:na]
      	at com.google.common.util.concurrent.AbstractScheduledService$CustomScheduler$ReschedulableCallable.call(AbstractScheduledService.java:321) [com.google.guava.guava-13.0.1.jar:na]
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_151]
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_151]
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.8.0_151]
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_151]
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_151]
      	at java.lang.Thread.run(Thread.java:748) [na:1.8.0_151]
      Caused by: java.net.ConnectException: Connection refused (Connection refused)
      	at java.net.PlainSocketImpl.socketConnect(Native Method) ~[na:1.8.0_151]
      	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[na:1.8.0_151]
      	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[na:1.8.0_151]
      	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[na:1.8.0_151]
      	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[na:1.8.0_151]
      	at java.net.Socket.connect(Socket.java:589) ~[na:1.8.0_151]
      	at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:673) ~[na:1.8.0_151]
      	at sun.net.NetworkClient.doConnect(NetworkClient.java:175) ~[na:1.8.0_151]
      	at sun.net.www.http.HttpClient.openServer(HttpClient.java:463) ~[na:1.8.0_151]
      	at sun.net.www.http.HttpClient.openServer(HttpClient.java:558) ~[na:1.8.0_151]
      	at sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:264) ~[na:1.8.0_151]
      	at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:367) ~[na:1.8.0_151]
      	at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:191) ~[na:1.8.0_151]
      	at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1156) ~[na:1.8.0_151]
      	at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1050) ~[na:1.8.0_151]
      	at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:177) ~[na:1.8.0_151]
      	at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1334) ~[na:1.8.0_151]
      	at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1309) ~[na:1.8.0_151]
      	at sun.net.www.protocol.https.HttpsURLConnectionImpl.getOutputStream(HttpsURLConnectionImpl.java:259) ~[na:1.8.0_151]
      	at co.cask.cdap.internal.app.runtime.monitor.RuntimeMonitorClient.fetchMessages(RuntimeMonitorClient.java:94) ~[na:na]
      	... 11 common frames omitted
      

      The backend returns the message "Service 'runtime.monitor' is not available. Please wait until it is up and running." for the `/stop` endpoint. Interestingly, the run still transitioned to the 'RUNNING' state for a few seconds before changing to 'FAILED'.
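
As an illustration of the failure mode only, and not a proposed fix from this ticket, the sketch below shows how a caller might tolerate a stop request that is rejected while the runtime monitor is not yet reachable, by retrying a bounded number of times. It assumes the unavailability is transient while the run is still 'STARTING', which this report does not confirm. Every name here (`StopRetrySketch`, `StopAction`, `stopWithRetry`) is hypothetical, and the nested `ServiceUnavailableException` is a local stand-in, not the CDAP class from the stack trace.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch; none of these types are part of CDAP. */
final class StopRetrySketch {

  /** Local stand-in for the exception seen in the log (assumption, not the CDAP class). */
  static final class ServiceUnavailableException extends RuntimeException {
    ServiceUnavailableException(String service) {
      super("Service '" + service + "' is not available. Please wait until it is up and running.");
    }
  }

  /** Hypothetical abstraction over whatever actually delivers the stop request. */
  interface StopAction {
    void stop();
  }

  /**
   * Calls the stop action up to maxAttempts times, sleeping delayMillis between
   * failed attempts. Returns the number of attempts used, or rethrows the last
   * ServiceUnavailableException if every attempt fails.
   */
  static int stopWithRetry(StopAction action, int maxAttempts, long delayMillis) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    ServiceUnavailableException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        action.stop();
        return attempt;
      } catch (ServiceUnavailableException e) {
        last = e;
        if (attempt < maxAttempts) {
          try {
            Thread.sleep(delayMillis);
          } catch (InterruptedException ie) {
            // Restore the interrupt flag and give up instead of swallowing it.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting to retry stop", ie);
          }
        }
      }
    }
    throw last;
  }

  public static void main(String[] args) {
    AtomicInteger calls = new AtomicInteger();
    // Simulated run whose monitor only becomes reachable on the third call.
    StopAction flaky = () -> {
      if (calls.incrementAndGet() < 3) {
        throw new ServiceUnavailableException("runtime.monitor");
      }
    };
    int attempts = stopWithRetry(flaky, 5, 10L);
    System.out.println("stopped after " + attempts + " attempts");
  }
}
```

Since the reported run went on to 'RUNNING' and then 'FAILED', a retry on the caller's side may only mask the underlying state-handling problem; the sketch is meant to make the timing issue concrete, not to settle where the fix belongs.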

            People

            • Assignee:
              bhooshan Bhooshan Mogal
              Reporter:
              tbach Tony Bach
            • Votes:
              0
              Watchers:
              1
