  1. CDAP
  2. CDAP-16131

Pipeline logs are lost with Kafka source

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: App Fabric, Pipelines
    • Labels: None

      Description

      The Kafka 0.10 source fails when running on Spark 2.3.0 because the spark-streaming-kafka-0-10 2.2.0 library it uses is incompatible with that Spark version. The driver fails with the following error:

      2019-11-13 15:34:33,404 - ERROR [Driver:o.a.s.d.y.ApplicationMaster@91] - User class threw exception: java.lang.AbstractMethodError
      java.lang.AbstractMethodError: null
      	at org.apache.spark.internal.Logging$class.initializeLogIfNecessary(Logging.scala:99) ~[spark-core_2.11-2.3.0.cloudera4.jar:2.3.0.cloudera4]
      	at org.apache.spark.streaming.kafka010.KafkaUtils$.initializeLogIfNecessary(KafkaUtils.scala:39) ~[spark-streaming-kafka-0-10_2.11-2.2.0.jar:2.2.0]
      	at org.apache.spark.internal.Logging$class.log(Logging.scala:46) ~[spark-core_2.11-2.3.0.cloudera4.jar:2.3.0.cloudera4]
      	at org.apache.spark.streaming.kafka010.KafkaUtils$.log(KafkaUtils.scala:39) ~[spark-streaming-kafka-0-10_2.11-2.2.0.jar:2.2.0]
      	at org.apache.spark.internal.Logging$class.logWarning(Logging.scala:66) ~[spark-core_2.11-2.3.0.cloudera4.jar:2.3.0.cloudera4]
      	at org.apache.spark.streaming.kafka010.KafkaUtils$.logWarning(KafkaUtils.scala:39) ~[spark-streaming-kafka-0-10_2.11-2.2.0.jar:2.2.0]
      	at org.apache.spark.streaming.kafka010.KafkaUtils$.fixKafkaParams(KafkaUtils.scala:201) ~[spark-streaming-kafka-0-10_2.11-2.2.0.jar:2.2.0]
      	at org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.<init>(DirectKafkaInputDStream.scala:63) ~[spark-streaming-kafka-0-10_2.11-2.2.0.jar:2.2.0]
      	at org.apache.spark.streaming.kafka010.KafkaUtils$.createDirectStream(KafkaUtils.scala:147) ~[spark-streaming-kafka-0-10_2.11-2.2.0.jar:2.2.0]
      	at org.apache.spark.streaming.kafka010.KafkaUtils$.createDirectStream(KafkaUtils.scala:124) ~[spark-streaming-kafka-0-10_2.11-2.2.0.jar:2.2.0]
      	at org.apache.spark.streaming.kafka010.KafkaUtils$.createDirectStream(KafkaUtils.scala:168) ~[spark-streaming-kafka-0-10_2.11-2.2.0.jar:2.2.0]
      	at org.apache.spark.streaming.kafka010.KafkaUtils.createDirectStream(KafkaUtils.scala) ~[spark-streaming-kafka-0-10_2.11-2.2.0.jar:2.2.0]
      

      However, CDAP does not collect these logs, for reasons not yet known; they are available only through YARN.
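
The version mismatch behind the AbstractMethodError can be read directly off the jar names attached to each stack frame. The following is a minimal sketch for spotting it automatically, not part of CDAP. It assumes the frames carry a `~[artifact_scalaVersion-version.jar:version]` suffix as in the trace above; the helper names `jar_versions` and `spark_minor_versions` are hypothetical.

```python
import re

# Matches the "~[artifact_scalaVersion-version.jar:" suffix on a stack frame.
# The naming pattern is an assumption based on the frames in this report.
FRAME_JAR = re.compile(
    r"~\[(?P<artifact>[\w.-]+?)_(?P<scala>\d+\.\d+)-(?P<version>[\w.]+)\.jar:"
)


def jar_versions(stack_trace):
    """Map each artifact name to the set of versions seen in the trace."""
    versions = {}
    for match in FRAME_JAR.finditer(stack_trace):
        versions.setdefault(match.group("artifact"), set()).add(match.group("version"))
    return versions


def spark_minor_versions(versions):
    """Collect the major.minor prefix of every Spark artifact version."""
    return {
        ".".join(version.split(".")[:2])
        for artifact, found in versions.items()
        if artifact.startswith("spark-")
        for version in found
    }


# Two frames copied from the stack trace in this report.
trace = """
at org.apache.spark.internal.Logging$class.initializeLogIfNecessary(Logging.scala:99) ~[spark-core_2.11-2.3.0.cloudera4.jar:2.3.0.cloudera4]
at org.apache.spark.streaming.kafka010.KafkaUtils$.initializeLogIfNecessary(KafkaUtils.scala:39) ~[spark-streaming-kafka-0-10_2.11-2.2.0.jar:2.2.0]
"""

found = jar_versions(trace)
minors = spark_minor_versions(found)
if len(minors) > 1:
    # More than one Spark major.minor version means mixed jars on the classpath.
    print("Mixed Spark versions on the classpath:", sorted(minors))
```

For the two frames above, the check reports Spark 2.2 and 2.3 side by side, which matches the incompatibility described in this issue.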

            People

            • Assignee: Trishka (trishka)
            • Reporter: Albert Shau (ashau)
            • Votes: 0
            • Watchers: 1
