CDAP-9314

Schema class is not serializable by Kryo

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.1.1
    • Fix Version/s: 4.1.1
    • Component/s: None
    • Labels: None
    • Release Notes:
      Made StructuredRecord serializable by Kryo in Spark programs when the Kryo serializer is used.

      Description

      In the CDH 5.11 prerelease, the test "BatchJoinerTest" fails when running the Spark jobs with the Spark property "spark.serializer" set to "org.apache.spark.serializer.KryoSerializer". It works when the property is set to "org.apache.spark.serializer.JavaSerializer". A sketch of the relevant setting follows, then the failing task's log.
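      The following is a minimal, hypothetical sketch of the failing configuration; how the test itself sets the property (for example through cluster configuration) is not shown here. The property name and values are standard Spark settings:

      import org.apache.spark.SparkConf;

      public class KryoConfSketch {
        public static void main(String[] args) {
          SparkConf conf = new SparkConf()
              .setAppName("BatchJoinerTest")
              // This value triggers the failure below; switching it to
              // org.apache.spark.serializer.JavaSerializer makes the jobs pass.
              .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
          System.out.println(conf.get("spark.serializer"));
        }
      }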

      2017-04-07 22:38:41,431 - WARN  [task-result-getter-1:o.a.s.s.TaskSetManager@70] - Lost task 3.0 in stage 2.0 (TID 27, cdh-prerelease21235-1004.dev.continuuity.net, executor 6): com.esotericsoftware.kryo.KryoException: java.lang.UnsupportedOperationException
      Serialization trace:
      fieldMap (co.cask.cdap.api.data.schema.Schema)
      schema (co.cask.cdap.api.data.format.StructuredRecord)
      	at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:626)
      	at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
      	at com.esotericsoftware.kryo.Kryo.readObjectOrNull(Kryo.java:699)
      	at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:611)
      	at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
      	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
      	at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:228)
      	at org.apache.spark.serializer.DeserializationStream.readKey(Serializer.scala:169)
      	at org.apache.spark.serializer.DeserializationStream$$anon$2.getNext(Serializer.scala:201)
      	at org.apache.spark.serializer.DeserializationStream$$anon$2.getNext(Serializer.scala:198)
      	at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
      	at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
      	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
      	at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:32)
      	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
      	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
      	at org.apache.spark.util.collection.ExternalAppendOnlyMap.insertAll(ExternalAppendOnlyMap.scala:153)
      	at org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$4.apply(CoGroupedRDD.scala:153)
      	at org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$4.apply(CoGroupedRDD.scala:152)
      	at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
      	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
      	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
      	at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
      	at org.apache.spark.rdd.CoGroupedRDD.compute(CoGroupedRDD.scala:152)
      	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
      	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
      	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
      	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
      	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
      	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
      	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
      	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
      	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
      	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
      	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
      	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
      	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
      	at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69)
      	at org.apache.spark.rdd.RDD.iterator(RDD.scala:268)
      	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
      	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
      	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
      	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
      	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
      	at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
      	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
      	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
      	at org.apache.spark.scheduler.Task.run(Task.scala:89)
      	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:745)
      Caused by: java.lang.UnsupportedOperationException
      	at java.util.Collections$UnmodifiableMap.put(Collections.java:1342)
      	at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:135)
      	at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17)
      	at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:648)
      	at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
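
      The "Caused by" frames show the mechanism: Kryo's default FieldSerializer walks Schema's fields, and its MapSerializer rebuilds fieldMap by instantiating the recorded map class and calling put() on it, which java.util.Collections$UnmodifiableMap rejects. Below is a self-contained sketch reproducing the same failure outside CDAP, assuming the Kryo 3.x/4.x API with Objenesis on the classpath (mirroring Spark's instantiator setup); Holder is a hypothetical stand-in for Schema:

      import java.util.Collections;
      import java.util.HashMap;
      import java.util.Map;

      import com.esotericsoftware.kryo.Kryo;
      import com.esotericsoftware.kryo.io.Input;
      import com.esotericsoftware.kryo.io.Output;
      import org.objenesis.strategy.StdInstantiatorStrategy;

      public class UnmodifiableMapRepro {

        // Hypothetical stand-in for Schema: a field holding an unmodifiable map,
        // as fieldMap does in co.cask.cdap.api.data.schema.Schema.
        static class Holder {
          Map<String, Integer> fieldMap;
        }

        public static void main(String[] args) {
          Holder holder = new Holder();
          Map<String, Integer> map = new HashMap<>();
          map.put("a", 1);
          holder.fieldMap = Collections.unmodifiableMap(map);

          Kryo kryo = new Kryo();
          // Mirror Spark's setup: fall back to Objenesis so classes without a
          // no-arg constructor (like Collections$UnmodifiableMap) can be created.
          kryo.setInstantiatorStrategy(
              new Kryo.DefaultInstantiatorStrategy(new StdInstantiatorStrategy()));

          Output output = new Output(4096);
          kryo.writeObject(output, holder);          // writing succeeds
          Input input = new Input(output.toBytes());
          kryo.readObject(input, Holder.class);      // throws UnsupportedOperationException:
                                                     // MapSerializer creates the UnmodifiableMap
                                                     // instance and then calls put() on it
        }
      }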
      

      Previously, all Spark jobs would fail when "spark.serializer" was set to "org.apache.spark.serializer.KryoSerializer". That was fixed in CDAP-8980 (https://issues.cask.co/browse/CDAP-8980) by PR https://github.com/caskdata/cdap/pull/8440.
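
      One way to make a class like Schema safe for Kryo is a custom Serializer that bypasses field-by-field deserialization entirely, for example by round-tripping the schema through its JSON form. The following is only a sketch of that approach, not the actual patch; it assumes Schema.toString() produces the JSON representation that Schema.parseJson() accepts (Kryo 3.x/4.x Serializer API):

      import java.io.IOException;

      import co.cask.cdap.api.data.schema.Schema;
      import com.esotericsoftware.kryo.Kryo;
      import com.esotericsoftware.kryo.Serializer;
      import com.esotericsoftware.kryo.io.Input;
      import com.esotericsoftware.kryo.io.Output;

      // Sketch: serialize Schema as its JSON string instead of field by field,
      // so Kryo never needs to mutate the unmodifiable fieldMap.
      public class SchemaSerializer extends Serializer<Schema> {

        @Override
        public void write(Kryo kryo, Output output, Schema schema) {
          // Assumes Schema.toString() returns the schema's JSON form.
          output.writeString(schema.toString());
        }

        @Override
        public Schema read(Kryo kryo, Input input, Class<Schema> type) {
          try {
            return Schema.parseJson(input.readString());
          } catch (IOException e) {
            throw new RuntimeException("Failed to parse schema JSON", e);
          }
        }
      }

      In a Spark program, such a serializer would be registered (e.g. kryo.register(Schema.class, new SchemaSerializer())) from a KryoRegistrator class named by the standard "spark.kryo.registrator" property.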

              People

              • Assignee: Terence Yim (terence)
              • Reporter: Matt Wuenschel (mattwuenschel)
              • Votes: 0
              • Watchers: 3
