CDAP-4045

Containers fail with timeout connecting to zookeeper from KafkaServer

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.3.0
    • Component/s: CDAP
    • Labels:
      None
    • Release Notes:
      A new property, master.collect.containers.log, has been added to cdap-site.xml. It determines whether container logs are streamed back to the cdap-master process log. (Streaming them back has always been the default behavior.) For MapR installations, this property must be turned off (set to false).
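
The release note can be sketched as a cdap-site.xml entry. This is a minimal example, assuming cdap-site.xml uses the usual Hadoop-style property layout; only the property name and the false value for MapR come from the release note.

```xml
<!-- Sketch of a cdap-site.xml entry; the surrounding <configuration> element is assumed. -->
<configuration>
  <property>
    <!-- Property named in the release note for this issue. -->
    <name>master.collect.containers.log</name>
    <!-- false stops container logs from being streamed back to the cdap-master process log (required on MapR). -->
    <value>false</value>
  </property>
</configuration>
```

On installations other than MapR the property can be left unset, since streaming container logs back is described as the default.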

      Description

      I noticed this when running CDAP against MapR, but it seems like it could happen in any cluster with a secure ZooKeeper.

      When CDAP starts a container, it can intermittently fail with the exception below. The failure is not tied to a particular container or node: the same container launched on the same node will intermittently exhibit this behavior. It seems related to the Kafka client connecting to a SASL-enabled ZooKeeper, which is the default in MapR distributions.

      18:01:25.714 [EmbeddedKafkaServer STARTING] WARN  o.a.t.i.kafka.EmbeddedKafkaServer - Timeout when connecting to ZooKeeper from KafkaServer. Attempt number 0.
      org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 3000
      	at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:880) ~[com.101tec.zkclient-0.3.jar:0.3]
      	at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:98) ~[com.101tec.zkclient-0.3.jar:0.3]
      	at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:84) ~[com.101tec.zkclient-0.3.jar:0.3]
      	at kafka.common.KafkaZookeeperClient$.getZookeeperClient(Unknown Source) ~[org.apache.kafka.kafka_2.10-0.8.0.jar:0.8.0]
      	at kafka.server.KafkaZooKeeper.startup(Unknown Source) ~[org.apache.kafka.kafka_2.10-0.8.0.jar:0.8.0]
      	at kafka.server.KafkaServer.startup(Unknown Source) ~[org.apache.kafka.kafka_2.10-0.8.0.jar:0.8.0]
      	at org.apache.twill.internal.kafka.EmbeddedKafkaServer.startUp(EmbeddedKafkaServer.java:58) ~[org.apache.twill.twill-core-0.6.0-incubating.jar:0.6.0-incubating]
      	at com.google.common.util.concurrent.AbstractIdleService$1$1.run(AbstractIdleService.java:43) [com.google.guava.guava-13.0.1.jar:na]
      	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
      

              People

              • Assignee: Terence Yim (terence)
              • Reporter: Derek Wood (derek)