  1. CDAP
  2. CDAP-14276

KafkaOffsetResolverTest is flaky

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.1.0
    • Fix Version/s: 5.1.0
    • Component/s: Log, Test

      Description

      The test fails to start Kafka because the address is already in use. EmbeddedKafkaServer has retry logic, but every attempt uses the same port and consequently fails again.

      2018-09-06 17:03:33,822 - WARN  [EmbeddedKafkaServer STARTING:o.a.t.i.k.EmbeddedKafkaServer@76] - Kafka failed to bind to port 34141. Attempt number 0.
      java.net.BindException: Address already in use
      	at sun.nio.ch.Net.bind0(Native Method) ~[na:1.8.0_101]
      	at sun.nio.ch.Net.bind(Net.java:433) ~[na:1.8.0_101]
      	at sun.nio.ch.Net.bind(Net.java:425) ~[na:1.8.0_101]
      	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223) ~[na:1.8.0_101]
      	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) ~[na:1.8.0_101]
      	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67) ~[na:1.8.0_101]
      	at kafka.network.Acceptor.openServerSocket(SocketServer.scala:256) ~[kafka_2.10-0.8.2.2.jar:na]
      	at kafka.network.Acceptor.<init>(SocketServer.scala:205) ~[kafka_2.10-0.8.2.2.jar:na]
      	at kafka.network.SocketServer.startup(SocketServer.scala:86) ~[kafka_2.10-0.8.2.2.jar:na]
      	at kafka.server.KafkaServer.startup(KafkaServer.scala:99) ~[kafka_2.10-0.8.2.2.jar:na]
      	at org.apache.twill.internal.kafka.EmbeddedKafkaServer.startUp(EmbeddedKafkaServer.java:65) ~[twill-core-0.13.0.jar:0.13.0]
      	at com.google.common.util.concurrent.AbstractIdleService$1$1.run(AbstractIdleService.java:43) [guava-13.0.1.jar:na]
      	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
      2018-09-06 17:03:34,038 - WARN  [EmbeddedKafkaServer STARTING:o.a.t.i.k.EmbeddedKafkaServer@76] - Kafka failed to bind to port 34141. Attempt number 1.
      java.net.BindException: Address already in use
      	at sun.nio.ch.Net.bind0(Native Method) ~[na:1.8.0_101]
      	at sun.nio.ch.Net.bind(Net.java:433) ~[na:1.8.0_101]
      	at sun.nio.ch.Net.bind(Net.java:425) ~[na:1.8.0_101]
      	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223) ~[na:1.8.0_101]
      	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) ~[na:1.8.0_101]
      	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67) ~[na:1.8.0_101]
      	at kafka.network.Acceptor.openServerSocket(SocketServer.scala:256) ~[kafka_2.10-0.8.2.2.jar:na]
      	at kafka.network.Acceptor.<init>(SocketServer.scala:205) ~[kafka_2.10-0.8.2.2.jar:na]
      	at kafka.network.SocketServer.startup(SocketServer.scala:86) ~[kafka_2.10-0.8.2.2.jar:na]
      	at kafka.server.KafkaServer.startup(KafkaServer.scala:99) ~[kafka_2.10-0.8.2.2.jar:na]
      	at org.apache.twill.internal.kafka.EmbeddedKafkaServer.startUp(EmbeddedKafkaServer.java:65) ~[twill-core-0.13.0.jar:0.13.0]
      	at com.google.common.util.concurrent.AbstractIdleService$1$1.run(AbstractIdleService.java:43) [guava-13.0.1.jar:na]
      	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
      2018-09-06 17:03:34,288 - WARN  [EmbeddedKafkaServer STARTING:o.a.t.i.k.EmbeddedKafkaServer@76] - Kafka failed to bind to port 34141. Attempt number 2.
      java.net.BindException: Address already in use
      	at sun.nio.ch.Net.bind0(Native Method) ~[na:1.8.0_101]
      	at sun.nio.ch.Net.bind(Net.java:433) ~[na:1.8.0_101]
      	at sun.nio.ch.Net.bind(Net.java:425) ~[na:1.8.0_101]
      	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223) ~[na:1.8.0_101]
      	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) ~[na:1.8.0_101]
      	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67) ~[na:1.8.0_101]
      	at kafka.network.Acceptor.openServerSocket(SocketServer.scala:256) ~[kafka_2.10-0.8.2.2.jar:na]
      	at kafka.network.Acceptor.<init>(SocketServer.scala:205) ~[kafka_2.10-0.8.2.2.jar:na]
      	at kafka.network.SocketServer.startup(SocketServer.scala:86) ~[kafka_2.10-0.8.2.2.jar:na]
      	at kafka.server.KafkaServer.startup(KafkaServer.scala:99) ~[kafka_2.10-0.8.2.2.jar:na]
      	at org.apache.twill.internal.kafka.EmbeddedKafkaServer.startUp(EmbeddedKafkaServer.java:65) ~[twill-core-0.13.0.jar:0.13.0]
      	at com.google.common.util.concurrent.AbstractIdleService$1$1.run(AbstractIdleService.java:43) [guava-13.0.1.jar:na]
      	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
      2018-09-06 17:03:34,331 - WARN  [EmbeddedKafkaServer STARTING:o.a.t.i.k.EmbeddedKafkaServer@76] - Kafka failed to bind to port 34141. Attempt number 3.
      java.net.BindException: Address already in use
      	at sun.nio.ch.Net.bind0(Native Method) ~[na:1.8.0_101]
      	at sun.nio.ch.Net.bind(Net.java:433) ~[na:1.8.0_101]
      	at sun.nio.ch.Net.bind(Net.java:425) ~[na:1.8.0_101]
      	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223) ~[na:1.8.0_101]
      	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) ~[na:1.8.0_101]
      	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67) ~[na:1.8.0_101]
      	at kafka.network.Acceptor.openServerSocket(SocketServer.scala:256) ~[kafka_2.10-0.8.2.2.jar:na]
      	at kafka.network.Acceptor.<init>(SocketServer.scala:205) ~[kafka_2.10-0.8.2.2.jar:na]
      	at kafka.network.SocketServer.startup(SocketServer.scala:86) ~[kafka_2.10-0.8.2.2.jar:na]
      	at kafka.server.KafkaServer.startup(KafkaServer.scala:99) ~[kafka_2.10-0.8.2.2.jar:na]
      	at org.apache.twill.internal.kafka.EmbeddedKafkaServer.startUp(EmbeddedKafkaServer.java:65) ~[twill-core-0.13.0.jar:0.13.0]
      	at com.google.common.util.concurrent.AbstractIdleService$1$1.run(AbstractIdleService.java:43) [guava-13.0.1.jar:na]
      	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
      2018-09-06 17:03:34,595 - WARN  [EmbeddedKafkaServer STARTING:o.a.t.i.k.EmbeddedKafkaServer@76] - Kafka failed to bind to port 34141. Attempt number 4.
      java.net.BindException: Address already in use
      	at sun.nio.ch.Net.bind0(Native Method) ~[na:1.8.0_101]
      	at sun.nio.ch.Net.bind(Net.java:433) ~[na:1.8.0_101]
      	at sun.nio.ch.Net.bind(Net.java:425) ~[na:1.8.0_101]
      	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223) ~[na:1.8.0_101]
      	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) ~[na:1.8.0_101]
      	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67) ~[na:1.8.0_101]
      	at kafka.network.Acceptor.openServerSocket(SocketServer.scala:256) ~[kafka_2.10-0.8.2.2.jar:na]
      	at kafka.network.Acceptor.<init>(SocketServer.scala:205) ~[kafka_2.10-0.8.2.2.jar:na]
      	at kafka.network.SocketServer.startup(SocketServer.scala:86) ~[kafka_2.10-0.8.2.2.jar:na]
      	at kafka.server.KafkaServer.startup(KafkaServer.scala:99) ~[kafka_2.10-0.8.2.2.jar:na]
      	at org.apache.twill.internal.kafka.EmbeddedKafkaServer.startUp(EmbeddedKafkaServer.java:65) ~[twill-core-0.13.0.jar:0.13.0]
      	at com.google.common.util.concurrent.AbstractIdleService$1$1.run(AbstractIdleService.java:43) [guava-13.0.1.jar:na]
      	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
      2018-09-06 17:03:34,626 - ERROR [EmbeddedKafkaServer STARTING:o.a.z.s.NIOServerCnxnFactory$1@44] - Thread Thread[EmbeddedKafkaServer STARTING,5,main] died 

      The problem is that EmbeddedKafkaServer receives a specific port number in its configuration from KafkaTester. That is why it does not try another, random port when it retries. However, if it is configured with port 0, there is no way to query it for the random port it ends up using. This class lives in Twill, so an easy fix would require a change in Twill.
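
The port behaviour described above can be reproduced with plain java.net sockets. The following is only a minimal sketch and does not use any Twill or Kafka API: the class name PortBindingSketch and the helper bindFails are made up for illustration. It shows that a retry against an occupied port fails every time, while a socket bound to port 0 can report the port the OS picked. The missing piece in this issue is an equivalent accessor on EmbeddedKafkaServer.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.ServerSocket;

// Illustration only: plain sockets, not the Twill EmbeddedKafkaServer.
public class PortBindingSketch {

  /** Returns true if binding to the given loopback port fails with a BindException. */
  public static boolean bindFails(int port) throws IOException {
    try (ServerSocket socket = new ServerSocket(port, 50, InetAddress.getLoopbackAddress())) {
      return false;
    } catch (BindException e) {
      return true;
    }
  }

  public static void main(String[] args) throws IOException {
    // Port 0 asks the OS for any free port; the socket can report which one it got.
    try (ServerSocket first = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
      int port = first.getLocalPort();
      System.out.println("OS assigned port " + port);

      // Retrying the same, still occupied port fails on every attempt.
      for (int attempt = 0; attempt < 3; attempt++) {
        System.out.println("Attempt " + attempt + " on port " + port
            + " fails: " + bindFails(port));
      }
    }
  }
}
```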

      Another option is to disable the retry in EmbeddedKafkaServer and add a getRandomPort/retry loop in the caller, KafkaTester. That does not seem to be the best solution, as it reinvents an existing feature.
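
For reference, a caller-side loop of the kind mentioned above could look roughly like the sketch below. It is not the actual KafkaTester code and not the fix that was applied: RandomPortRetrySketch, ServerStarter, getRandomPort and startWithRetry are hypothetical names, and ServerStarter merely stands in for "start an embedded Kafka server on this port". The sketch assumes that a failed start surfaces as a java.net.BindException.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;

// Hypothetical sketch of a caller-side retry loop; not the real KafkaTester.
public class RandomPortRetrySketch {

  /** Stand-in for starting a server (for example an embedded Kafka) on a given port. */
  public interface ServerStarter {
    void start(int port) throws IOException;
  }

  /** Asks the OS for a currently free port. Another process may still grab it before reuse. */
  public static int getRandomPort() throws IOException {
    try (ServerSocket socket = new ServerSocket(0)) {
      return socket.getLocalPort();
    }
  }

  /** Starts the server, picking a fresh port on every attempt. Returns the port that worked. */
  public static int startWithRetry(ServerStarter starter, int maxAttempts) throws IOException {
    BindException last = null;
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      int port = getRandomPort();
      try {
        starter.start(port);
        return port;
      } catch (BindException e) {
        // The port was taken in the meantime; choose a different one next time.
        last = e;
      }
    }
    throw new IOException("Could not bind after " + maxAttempts + " attempts", last);
  }
}
```

Because the caller chooses the port, it also knows which port the server ended up on, which sidesteps the port 0 limitation at the cost of duplicating the retry logic.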

            People

            • Assignee: Terence Yim (terence)
            • Reporter: Andreas Neumann (andreas)