CDAP / CDAP-10269

Cannot read from TPFS Parquet source in Hydrator

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.2.1
    • Component/s: Pipelines
    • Labels: None
    • Release Notes: Fixed the TPFS plugins to read from and write to TPFS filesets in both MapReduce and Spark.

      Description

      While reading from a TPFS Parquet source, the MapReduce job fails with a java.io.EOFException thrown while the ParquetInputSplit is deserialized in the mapper:

      java.io.EOFException: null
      	at java.io.DataInputStream.readInt(DataInputStream.java:392) ~[na:1.7.0_79]
      	at parquet.hadoop.ParquetInputSplit.readArray(ParquetInputSplit.java:240) ~[com.twitter.parquet-hadoop-bundle-1.6.0rc3.jar:na]
      	at parquet.hadoop.ParquetInputSplit.readUTF8(ParquetInputSplit.java:230) ~[com.twitter.parquet-hadoop-bundle-1.6.0rc3.jar:na]
      	at parquet.hadoop.ParquetInputSplit.readFields(ParquetInputSplit.java:197) ~[com.twitter.parquet-hadoop-bundle-1.6.0rc3.jar:na]
      	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:71) ~[org.apache.hadoop.hadoop-common-2.3.0.jar:na]
      	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:42) ~[org.apache.hadoop.hadoop-common-2.3.0.jar:na]
      	at co.cask.cdap.internal.app.runtime.batch.dataset.input.TaggedInputSplit.readFields(TaggedInputSplit.java:154) ~[co.cask.cdap.cdap-app-fabric-3.4.1.jar:na]
      	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:71) ~[org.apache.hadoop.hadoop-common-2.3.0.jar:na]
      	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:42) ~[org.apache.hadoop.hadoop-common-2.3.0.jar:na]
      	at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:371) ~[org.apache.hadoop.hadoop-mapreduce-client-core-2.3.0.jar:na]
      	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:731) ~[org.apache.hadoop.hadoop-mapreduce-client-core-2.3.0.jar:na]
      	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340) ~[org.apache.hadoop.hadoop-mapreduce-client-core-2.3.0.jar:na]
      	at org.apache.hadoop.mapred.LocalJobRunnerWithFix$Job$MapTaskRunnable.run(LocalJobRunnerWithFix.java:243) ~[co.cask.cdap.cdap-app-fabric-3.4.1.jar:na]
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_79]
      	at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_79]
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_79]
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.7.0_79]
      	at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_79]
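
      The trace shows the failure inside Hadoop's WritableSerialization while it deserializes the ParquetInputSplit that CDAP wraps in a TaggedInputSplit: ParquetInputSplit.readFields calls DataInputStream.readInt on the split payload and hits end-of-stream. The sketch below is not CDAP or Parquet code; it uses a hypothetical ToySplit with a similar length-prefixed field to illustrate how a split payload that was never written, or was truncated, surfaces as exactly this kind of EOFException on the mapper side.

        import java.io.ByteArrayInputStream;
        import java.io.ByteArrayOutputStream;
        import java.io.DataInputStream;
        import java.io.DataOutputStream;
        import java.io.EOFException;
        import java.io.IOException;
        import java.nio.charset.StandardCharsets;

        // Hypothetical split with a single length-prefixed UTF-8 field, mimicking the
        // readInt()/readUTF8() pattern that ParquetInputSplit.readFields follows.
        class ToySplit {
            String path;

            void write(DataOutputStream out) throws IOException {
                byte[] bytes = path.getBytes(StandardCharsets.UTF_8);
                out.writeInt(bytes.length);   // length prefix
                out.write(bytes);             // payload
            }

            void readFields(DataInputStream in) throws IOException {
                int len = in.readInt();       // EOFException if the stream is empty or truncated
                byte[] bytes = new byte[len];
                in.readFully(bytes);
                path = new String(bytes, StandardCharsets.UTF_8);
            }
        }

        public class SplitRoundTrip {
            public static void main(String[] args) throws IOException {
                ToySplit split = new ToySplit();
                split.path = "/data/tpfs/parquet/part-0";

                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                split.write(new DataOutputStream(buf));

                // Normal round trip: the mapper reads back exactly what was written.
                ToySplit ok = new ToySplit();
                ok.readFields(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
                System.out.println("round trip ok: " + ok.path);

                // If the split payload was never serialized (for example, an empty byte
                // array reaches the mapper), the reader hits end-of-stream on the very
                // first readInt(), as in the stack trace above.
                ToySplit broken = new ToySplit();
                try {
                    broken.readFields(new DataInputStream(new ByteArrayInputStream(new byte[0])));
                } catch (EOFException e) {
                    System.out.println("EOFException, as in the trace: " + e);
                }
            }
        }

      Per the fix version and release note above, the resolution shipped in 4.2.1 in the TPFS plugins' read and write paths for MapReduce and Spark.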
      
      


    People

    • Assignee: David Shau (dshau)
    • Reporter: Sreevatsan Raman (sree)
    • Votes: 0
    • Watchers: 7

    Dates

    • Created:
    • Updated:
    • Resolved: