      Description

      The Pubsub cloud integration test is failing because the pipeline never starts: it gets stuck in the starting state. When I SSH into the Dataproc master node, I see the following in program.log for the run:

      2020-04-20 19:57:09,632 - ERROR [ STARTING:i.c.c.c.l.c.UncaughtExceptionHandler@34] - Uncaught error in thread Thread[ STARTING,5,main], java.lang.NoClassDefFoundError: org/apache/zookeeper/server/ZooKeeperServer
      2020-04-20 19:57:09,650 - ERROR [ STARTING:i.c.c.c.l.c.UncaughtExceptionHandler@35] - Stacktrace for uncaught error in thread Thread[ STARTING,5,main]
      java.lang.NoClassDefFoundError: org/apache/zookeeper/server/ZooKeeperServer
              at org.apache.twill.internal.zookeeper.InMemoryZKServer$1.startUp(InMemoryZKServer.java:50) ~[org.apache.twill.twill-zookeeper-0.13.0.jar:0.13.0]
              at com.google.common.util.concurrent.AbstractIdleService$1$1.run(AbstractIdleService.java:43) ~[com.google.guava.guava-13.0.1.jar:na]
              at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_242]
      Caused by: java.lang.ClassNotFoundException: org.apache.zookeeper.server.ZooKeeperServer
              at java.net.URLClassLoader.findClass(URLClassLoader.java:382) ~[na:1.8.0_242]
              at io.cdap.cdap.common.lang.InterceptableClassLoader.findClass(InterceptableClassLoader.java:44) ~[na:na]
              at java.lang.ClassLoader.loadClass(ClassLoader.java:419) ~[na:1.8.0_242]
              at java.lang.ClassLoader.loadClass(ClassLoader.java:352) ~[na:1.8.0_242]
              ... 3 common frames omitted
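
One way to confirm that the class is really absent from the classpath is a small probe run with the same classpath as the failing process. The sketch below is illustrative only: `ClasspathCheck` and `isLoadable` are hypothetical names, not part of CDAP or Twill, and it assumes the probe is launched with the same classpath as the remote job.

```java
public final class ClasspathCheck {

  private ClasspathCheck() {
  }

  /**
   * Returns true if the named class can be located by this class's loader.
   * The class is not initialized, so no static initializers are run.
   */
  public static boolean isLoadable(String className) {
    try {
      Class.forName(className, false, ClasspathCheck.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      // LinkageError covers NoClassDefFoundError raised for a missing dependency.
      return false;
    }
  }

  public static void main(String[] args) {
    // Defaults to the class named in the stack trace above.
    String name = args.length > 0 ? args[0] : "org.apache.zookeeper.server.ZooKeeperServer";
    System.out.println(name + " loadable: " + isLoadable(name));
  }
}
```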
      

      This error never makes it to the CDAP logs. In addition, the remote process is stuck forever shutting down. Taking a stack dump, I see a single thread:

      "main" #1 prio=5 os_prio=0 tid=0x00007f361c00d000 nid=0x162d waiting on condition [0x00007f3624ebe000]
         java.lang.Thread.State: WAITING (parking)
              at sun.misc.Unsafe.park(Native Method)
              - parking to wait for  <0x00000000dd018fb0> (a com.google.common.util.concurrent.AbstractFuture$Sync)
              at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
              at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
              at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:280)
              at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
              at com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:132)
              at com.google.common.util.concurrent.Futures.getUnchecked(Futures.java:999)
              at com.google.common.util.concurrent.AbstractService.stopAndWait(AbstractService.java:225)
              at com.google.common.util.concurrent.AbstractIdleService.stopAndWait(AbstractIdleService.java:122)
              at org.apache.twill.internal.zookeeper.InMemoryZKServer.stopAndWait(InMemoryZKServer.java:152)
              at io.cdap.cdap.internal.app.runtime.distributed.remote.RemoteExecutionJobMain.destroy(RemoteExecutionJobMain.java:178)
              at io.cdap.cdap.internal.app.runtime.distributed.remote.RemoteExecutionJobMain.doMain(RemoteExecutionJobMain.java:101)
              at io.cdap.cdap.internal.app.runtime.distributed.remote.RemoteExecutionJobMain.main(RemoteExecutionJobMain.java:73)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:498)
              at io.cdap.cdap.internal.app.runtime.distributed.remote.RemoteLauncher.main(RemoteLauncher.java:73)
      

      which looks related.
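
The stack dump shows `main` parked in `stopAndWait` with no timeout, so a service whose startup thread died can block shutdown indefinitely. A possible mitigation, sketched here under assumptions, is to bound the wait. `BoundedShutdown` and `stopWithTimeout` are hypothetical names; this is not the fix applied in CDAP, only an illustration of the idea using the JDK alone.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public final class BoundedShutdown {

  private BoundedShutdown() {
  }

  /**
   * Runs stopAction on a daemon thread and waits at most the given timeout.
   * Returns true if the action finished (normally or by throwing) in time,
   * false if it timed out or the caller was interrupted.
   */
  public static boolean stopWithTimeout(Runnable stopAction, long timeout, TimeUnit unit) {
    // A daemon thread cannot keep the JVM alive if the stop action never returns.
    ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
      Thread t = new Thread(r, "bounded-shutdown");
      t.setDaemon(true);
      return t;
    });
    Future<?> future = executor.submit(stopAction);
    try {
      future.get(timeout, unit);
      return true;
    } catch (TimeoutException e) {
      future.cancel(true);
      return false;
    } catch (ExecutionException e) {
      // The stop action threw, but it did terminate.
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      executor.shutdownNow();
    }
  }

  public static void main(String[] args) {
    // Simulates a stop call that never completes, like the parked thread above.
    Runnable hangs = () -> {
      try {
        new CountDownLatch(1).await();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    };
    System.out.println("hanging stop finished in time: "
        + stopWithTimeout(hangs, 200, TimeUnit.MILLISECONDS));
    System.out.println("quick stop finished in time: "
        + stopWithTimeout(() -> { }, 1, TimeUnit.SECONDS));
  }
}
```

In the failing process, the stop action would presumably wrap the `InMemoryZKServer.stopAndWait` call seen in the stack dump; a `false` result could then be logged before letting the process exit.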

            People

            • Assignee:
              vinisha Vinisha Shah
              Reporter:
              ashau Albert Shau