CDAP-9352

Cannot use group by with wrangler plugin

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.1.1
    • Component/s: None
    • Labels:
      None

      Description

      Using a group by stage as the next stage after the wrangler plugin causes the pipeline run to fail with the following error:

      2017-04-12 08:07:52,801 - INFO  [MapReduceRunner-phase-1:c.c.c.i.a.r.b.MapReduceRuntimeService@341] - Submitted MapReduce Job: name=phase-1, jobId=job_local467133320_0010, namespaceId=default, applicationId=CreditCardProcessingPipeline_v3, program=phase-1, runid=ca79ac3a-1f91-11e7-8ecb-d6a9da9a910e.
      2017-04-12 08:07:53,305 - ERROR [Thread-264:o.a.h.m.LocalJobRunnerWithFix@562] - Job job_local467133320_0010 failed
      java.lang.Exception: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: Incomplete document
      	at org.apache.hadoop.mapred.LocalJobRunnerWithFix$Job.runTasks(LocalJobRunnerWithFix.java:465) ~[co.cask.cdap.cdap-app-fabric-4.1.0.jar:na]
      	at org.apache.hadoop.mapred.LocalJobRunnerWithFix$Job.run(LocalJobRunnerWithFix.java:524) ~[co.cask.cdap.cdap-app-fabric-4.1.0.jar:na]
      java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: Incomplete document
      	at co.cask.cdap.etl.batch.PipeTransformDetail.process(PipeTransformDetail.java:53) ~[cdap-etl-batch-4.1.0.jar:na]
      	at co.cask.cdap.etl.batch.mapreduce.PipeTransformExecutor.runOneIteration(PipeTransformExecutor.java:42) ~[cdap-etl-batch-4.1.0.jar:na]
      	at co.cask.cdap.etl.batch.mapreduce.TransformRunner.transform(TransformRunner.java:158) ~[cdap-etl-batch-4.1.0.jar:na]
      	at co.cask.cdap.etl.batch.mapreduce.ETLMapReduce$ETLMapper.map(ETLMapReduce.java:358) ~[cdap-etl-batch-4.1.0.jar:na]
      	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145) ~[org.apache.hadoop.hadoop-mapreduce-client-core-2.3.0.jar:na]
      	at co.cask.cdap.internal.app.runtime.batch.MapperWrapper.run(MapperWrapper.java:119) ~[na:na]
      	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) ~[org.apache.hadoop.hadoop-mapreduce-client-core-2.3.0.jar:na]
      	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340) ~[org.apache.hadoop.hadoop-mapreduce-client-core-2.3.0.jar:na]
      	at org.apache.hadoop.mapred.LocalJobRunnerWithFix$Job$MapTaskRunnable.run(LocalJobRunnerWithFix.java:243) ~[co.cask.cdap.cdap-app-fabric-4.1.0.jar:na]
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_79]
      	at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_79]
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_79]
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.7.0_79]
      	at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_79]
      Caused by: java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: Incomplete document
      	at co.cask.cdap.etl.batch.PipeTransformDetail.process(PipeTransformDetail.java:53) ~[cdap-etl-batch-4.1.0.jar:na]
      	at co.cask.cdap.etl.batch.mapreduce.TransformEmitter.emit(TransformEmitter.java:55) ~[cdap-etl-batch-4.1.0.jar:na]
      	at co.cask.cdap.etl.common.TrackedEmitter.emit(TrackedEmitter.java:45) ~[cdap-etl-core-4.1.0.jar:na]
      	at co.cask.hydrator.plugin.batch.source.FileBatchSource.transform(FileBatchSource.java:210) ~[1492009668603-0/:na]
      	at co.cask.hydrator.plugin.batch.source.FileBatchSource.transform(FileBatchSource.java:70) ~[1492009668603-0/:na]
      	at co.cask.cdap.etl.common.TrackedTransform.transform(TrackedTransform.java:61) ~[cdap-etl-core-4.1.0.jar:na]
      	at co.cask.cdap.etl.batch.PipeTransformDetail.process(PipeTransformDetail.java:48) ~[cdap-etl-batch-4.1.0.jar:na]
      	... 13 common frames omitted
      Caused by: java.lang.RuntimeException: java.io.IOException: Incomplete document
      	at co.cask.cdap.etl.batch.PipeTransformDetail.process(PipeTransformDetail.java:5
      

      Attaching pipeline and test data

            People

            • Assignee: Vinisha Shah
            • Reporter: Sreevatsan Raman
            • Votes: 0
            • Watchers: 2
