CDAP / CDAP-8628

Handle Exception during the stop of log processor pipeline and resource balance service properly

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.1.0
    • Component/s: Log, Metrics
    • Labels:
      None
    • Release Notes:
      Fixed an issue in the log saver and the metrics processor where, if an exception was thrown while the number of instances was being changed, a container JVM process could be left running without performing any work.

      Description

      Currently, when an exception occurs while stopping the log processor service, the exception is thrown to the caller.

      The resource balancer catches it and sets its completion state to the exception.

      However, this does not cause the container to stop. The container keeps running, and the Kafka partitions assigned to it are not processed. This has to be fixed.

      2017-02-19 10:47:54,719 - WARN  [Endure-Service-KafkaLogProcessorPipeline:c.c.c.c.s.RetryOnStartFailureService@92] - Stop requested for service KafkaLogProcessorPipeline during start.
      2017-02-19 10:47:55,309 - INFO  [LogPipeline-cdap:o.a.t.d.AbstractClientProvider@110] - Service discovered at logdist19028-1002.dev.continuuity.net:45611
      2017-02-19 10:47:55,310 - INFO  [LogPipeline-cdap:o.a.t.d.AbstractClientProvider@118] - Attempting to connect to tx service at logdist19028-1002.dev.continuuity.net:45611 with timeout 30000 ms.
      2017-02-19 10:47:55,330 - INFO  [LogPipeline-cdap:o.a.t.d.AbstractClientProvider@132] - Connected to tx service at logdist19028-1002.dev.continuuity.net:45611
      2017-02-19 10:47:56,412 - ERROR [resource-coordinator-client:c.c.c.c.r.ResourceBalancerService@190] - Failed to change partitions, service: log.framework.
      com.google.common.util.concurrent.UncheckedExecutionException: java.lang.Exception: Service failed to start.
              at com.google.common.util.concurrent.Futures.wrapAndThrowUnchecked(Futures.java:1015) ~[com.google.guava.guava-13.0.1.jar:na]
              at com.google.common.util.concurrent.Futures.getUnchecked(Futures.java:1001) ~[com.google.guava.guava-13.0.1.jar:na]
              at com.google.common.util.concurrent.AbstractService.stopAndWait(AbstractService.java:225) ~[com.google.guava.guava-13.0.1.jar:na]
              at com.google.common.util.concurrent.AbstractIdleService.stopAndWait(AbstractIdleService.java:122) ~[com.google.guava.guava-13.0.1.jar:na]
              at co.cask.cdap.common.resource.ResourceBalancerService$2.onChange(ResourceBalancerService.java:181) ~[na:na]
              at co.cask.cdap.common.zookeeper.coordination.ResourceHandler.onChange(ResourceHandler.java:54) [na:na]
              at co.cask.cdap.common.zookeeper.coordination.ResourceCoordinatorClient$AssignmentChangeListenerCaller$1.run(ResourceCoordinatorClient.java:377) [na:na]
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_75]
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75]
              at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
      Caused by: java.lang.Exception: Service failed to start.
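
      The failure mode described above can be illustrated with a small, self-contained sketch. It is not CDAP code and does not show the fix that went into 4.1.0, which this ticket does not describe. All class and method names below (StopFailureSketch, FailingPipeline, Container, onPartitionChange) are hypothetical, and it uses the JDK's CompletableFuture (Java 8) instead of the Guava service classes seen in the stack trace. It assumes the handler only records the failure on a completion future, and shows that the container keeps running unless something reacts to that failed completion.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration only; none of these types exist in CDAP.
class StopFailureSketch {

  /** Stand-in for a pipeline whose stop() fails, as in the log above. */
  static final class FailingPipeline {
    void stop() throws Exception {
      throw new Exception("Service failed to start.");
    }
  }

  /** Stand-in for the container: it stays "running" until terminate() is called. */
  static final class Container {
    final AtomicBoolean running = new AtomicBoolean(true);
    final CompletableFuture<Void> completion = new CompletableFuture<>();

    void terminate() {
      running.set(false);
    }
  }

  /** Mimics the partition-change handler: a failed stop only marks the completion as failed. */
  static void onPartitionChange(FailingPipeline pipeline, Container container) {
    try {
      pipeline.stop();
    } catch (Exception e) {
      container.completion.completeExceptionally(e);
    }
  }

  /** Returns whether the container is still running after the failed stop. */
  static boolean runningAfterFailedStop(boolean terminateOnFailure) {
    Container container = new Container();
    if (terminateOnFailure) {
      // One possible remedy (an assumption, not the actual fix):
      // shut the container down when the completion fails.
      container.completion.whenComplete((result, failure) -> {
        if (failure != null) {
          container.terminate();
        }
      });
    }
    onPartitionChange(new FailingPipeline(), container);
    return container.running.get();
  }

  public static void main(String[] args) {
    System.out.println("no reaction to failure, still running: " + runningAfterFailedStop(false));
    System.out.println("terminate on failure, still running: " + runningAfterFailedStop(true));
  }
}
```

      With no listener on the completion, the sketch reports that the container is still running after the failed stop, which corresponds to the idle container described in this issue. With the listener attached, the container is terminated, so a replacement could presumably be started and take over the Kafka partitions.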
      

              People

              • Assignee:
                Terence Yim (terence)
              • Reporter:
                Shankar Selvam (shankar)