CDAP-11630

NullPointerException in debug logs for Spark jobs when run in integration tests


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: None
    • Fix Version/s: 4.3.0
    • Component/s: Spark
    • Labels: None

      Description

      When a Spark job is executed from the integration tests against CDAP Standalone, the following debug messages are printed. I also tried running the same PageRankSpark job in the CDAP Standalone SDK (outside the integration tests), but did not see these debug logs. Even though the Spark job runs successfully, it would be good to figure out why we are seeing these messages:

      2017-05-25 00:44:56,016 - DEBUG [dag-scheduler-event-loop:o.a.s.Logging$class@83] - Failed to use InputSplit#getLocationInfo.
      java.lang.NullPointerException: null
      	at scala.collection.mutable.ArrayOps$ofRef$.length$extension(ArrayOps.scala:114) ~[scala-library-2.10.4.jar:na]
      	at scala.collection.mutable.ArrayOps$ofRef.length(ArrayOps.scala:114) ~[scala-library-2.10.4.jar:na]
      	at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:32) ~[scala-library-2.10.4.jar:na]
      	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108) ~[scala-library-2.10.4.jar:na]
      	at org.apache.spark.rdd.HadoopRDD$.convertSplitLocationInfo(HadoopRDD.scala:412) ~[spark-core_2.10-1.6.1.jar:1.6.1]
      	at org.apache.spark.rdd.NewHadoopRDD.getPreferredLocations(NewHadoopRDD.scala:240) ~[spark-core_2.10-1.6.1.jar:1.6.1]
      	at org.apache.spark.rdd.RDD$$anonfun$preferredLocations$2.apply(RDD.scala:257) [spark-core_2.10-1.6.1.jar:1.6.1]
      	at org.apache.spark.rdd.RDD$$anonfun$preferredLocations$2.apply(RDD.scala:257) [spark-core_2.10-1.6.1.jar:1.6.1]
      	at scala.Option.getOrElse(Option.scala:120) [scala-library-2.10.4.jar:na]
      	at org.apache.spark.rdd.RDD.preferredLocations(RDD.scala:256) [spark-core_2.10-1.6.1.jar:1.6.1]
      	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal(DAGScheduler.scala:1545) [spark-core_2.10-1.6.1.jar:1.6.1]
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal$2$$anonfun$apply$1.apply$mcVI$sp(DAGScheduler.scala:1556) [spark-core_2.10-1.6.1.jar:1.6.1]
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal$2$$anonfun$apply$1.apply(DAGScheduler.scala:1555) [spark-core_2.10-1.6.1.jar:1.6.1]
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal$2$$anonfun$apply$1.apply(DAGScheduler.scala:1555) [spark-core_2.10-1.6.1.jar:1.6.1]
      	at scala.collection.immutable.List.foreach(List.scala:318) [scala-library-2.10.4.jar:na]
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal$2.apply(DAGScheduler.scala:1555) [spark-core_2.10-1.6.1.jar:1.6.1]
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal$2.apply(DAGScheduler.scala:1553) [spark-core_2.10-1.6.1.jar:1.6.1]
      	at scala.collection.immutable.List.foreach(List.scala:318) [scala-library-2.10.4.jar:na]
      	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal(DAGScheduler.scala:1553) [spark-core_2.10-1.6.1.jar:1.6.1]
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal$2$$anonfun$apply$1.apply$mcVI$sp(DAGScheduler.scala:1556) [spark-core_2.10-1.6.1.jar:1.6.1]
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal$2$$anonfun$apply$1.apply(DAGScheduler.scala:1555) [spark-core_2.10-1.6.1.jar:1.6.1]
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal$2$$anonfun$apply$1.apply(DAGScheduler.scala:1555) [spark-core_2.10-1.6.1.jar:1.6.1]
      	at scala.collection.immutable.List.foreach(List.scala:318) [scala-library-2.10.4.jar:na]
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal$2.apply(DAGScheduler.scala:1555) [spark-core_2.10-1.6.1.jar:1.6.1]
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal$2.apply(DAGScheduler.scala:1553) [spark-core_2.10-1.6.1.jar:1.6.1]
      	at scala.collection.immutable.List.foreach(List.scala:318) [scala-library-2.10.4.jar:na]
      	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal(DAGScheduler.scala:1553) [spark-core_2.10-1.6.1.jar:1.6.1]
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal$2$$anonfun$apply$1.apply$mcVI$sp(DAGScheduler.scala:1556) [spark-core_2.10-1.6.1.jar:1.6.1]
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal$2$$anonfun$apply$1.apply(DAGScheduler.scala:1555) [spark-core_2.10-1.6.1.jar:1.6.1]
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal$2$$anonfun$apply$1.apply(DAGScheduler.scala:1555) [spark-core_2.10-1.6.1.jar:1.6.1]
      	at scala.collection.immutable.List.foreach(List.scala:318) [scala-library-2.10.4.jar:na]
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal$2.apply(DAGScheduler.scala:1555) [spark-core_2.10-1.6.1.jar:1.6.1]
      	at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal$2.apply(DAGScheduler.scala:1553) [spark-core_2.10-1.6.1.jar:1.6.1]
      	at scala.collection.immutable.List.foreach(List.scala:318) [scala-library-2.10.4.jar:na]
      	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$getPreferredLocsInternal(DAGScheduler.scala:1553) [spark-core_2.10-1.6.1.jar:1.6.1]
      

      Build: https://builds.cask.co/download/CDAP-CIT50-JOB1/build_logs/CDAP-CIT50-JOB1-2.log
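
      A likely explanation (an assumption based on the stack trace, not verified against this build): in Spark 1.6, NewHadoopRDD.getPreferredLocations() invokes InputSplit#getLocationInfo() and hands the result to HadoopRDD.convertSplitLocationInfo(). Hadoop's base org.apache.hadoop.mapreduce.InputSplit returns null from getLocationInfo() by default, so any split class in the test run that does not override it would produce exactly this NullPointerException, which Spark catches and logs only at DEBUG; that matches the job still succeeding. The class below is a hypothetical illustration of such a split, not code from CDAP or Spark:

      // Minimal sketch of the suspected trigger: a split that never overrides
      // getLocationInfo(), so the inherited implementation returns null and
      // HadoopRDD.convertSplitLocationInfo() throws the NPE seen above.
      import java.io.IOException;

      import org.apache.hadoop.mapreduce.InputSplit;

      public class LocationlessSplit extends InputSplit {

        @Override
        public long getLength() throws IOException, InterruptedException {
          return 0L;
        }

        @Override
        public String[] getLocations() throws IOException, InterruptedException {
          // Preferred hosts are still reported through the older getLocations() API.
          return new String[] { "localhost" };
        }

        // getLocationInfo() is intentionally not overridden, so the default
        // implementation returns null -- the value convertSplitLocationInfo trips over.
      }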


            People

            • Assignee: Vinisha Shah (vinisha)
            • Reporter: Vinisha Shah (vinisha)
            • Votes: 0
            • Watchers: 1
