CDAP-14159

CDAP uses incompatible or old version of Python

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: 6.1.0
    • Component/s: Pipeline Plugins

      Description

      Calling bytearray() inside transform() in the Python plugin raises this error:

      2018-08-18 23:21:25,077 - WARN [LocalJobRunner Map Task Executor #0:c.c.h.p.PDFExtractor@124] - Caught co.cask.cdap.etl.batch.StageFailureException. Continuing since continueOnError is true. Exception: co.cask.cdap.etl.batch.StageFailureException: Failed to execute pipeline stage 'Python' with the error: Could not transform input.
      Traceback (most recent call last):
       File "<script>", line 28, in <module>
       File "<script>", line 19, in transform
      NameError: global name 'bytearray' is not defined
      at org.python.core.Py.NameError(Py.java:246)
       at org.python.core.PyFrame.getglobal(PyFrame.java:265)
       at org.python.pycode._pyx3.transform$1(<script>:26)
       at org.python.pycode._pyx3.call_function(<script>)
       at org.python.core.PyTableCode.call(PyTableCode.java:165)
       at org.python.core.PyBaseCode.call(PyBaseCode.java:166)
       at org.python.core.PyFunction.__call__(PyFunction.java:338)
       at org.python.pycode._pyx3.f$0(<script>:28)
       at org.python.pycode._pyx3.call_function(<script>)
       at org.python.core.PyTableCode.call(PyTableCode.java:165)
       at org.python.core.PyCode.call(PyCode.java:18)
       at org.python.core.Py.runCode(Py.java:1261)
       at co.cask.hydrator.plugin.transform.PythonEvaluator.transform(PythonEvaluator.java:178)
       at co.cask.hydrator.plugin.transform.PythonEvaluator.transform(PythonEvaluator.java:55)
       at co.cask.cdap.etl.common.plugin.WrappedTransform$6.call(WrappedTransform.java:107)
       at co.cask.cdap.etl.common.plugin.WrappedTransform$6.call(WrappedTransform.java:104)
       at co.cask.cdap.etl.common.plugin.Caller$1.call(Caller.java:30)
       at co.cask.cdap.etl.common.plugin.StageLoggingCaller.call(StageLoggingCaller.java:40)
       at co.cask.cdap.etl.common.plugin.WrappedTransform.transform(WrappedTransform.java:104)
       at co.cask.cdap.etl.common.TrackedTransform.transform(TrackedTransform.java:74)
       at co.cask.cdap.etl.batch.UnwrapPipeStage.consumeInput(UnwrapPipeStage.java:44)
       at co.cask.cdap.etl.batch.UnwrapPipeStage.consumeInput(UnwrapPipeStage.java:32)
       at co.cask.cdap.etl.batch.PipeStage.consume(PipeStage.java:44)
       at co.cask.cdap.etl.batch.PipeEmitter.emit(PipeEmitter.java:83)
       at co.cask.cdap.etl.common.TrackedEmitter.emit(TrackedEmitter.java:57)
       at co.cask.cdap.etl.common.plugin.UntimedEmitter.emit(UntimedEmitter.java:64)
       at co.cask.hydrator.plugin.PDFExtractor.transform(PDFExtractor.java:93)
       at co.cask.hydrator.plugin.PDFExtractor.transform(PDFExtractor.java:43)
       at co.cask.cdap.etl.common.plugin.WrappedTransform$6.call(WrappedTransform.java:107)
       at co.cask.cdap.etl.common.plugin.WrappedTransform$6.call(WrappedTransform.java:104)
       at co.cask.cdap.etl.common.plugin.Caller$1.call(Caller.java:30)
       at co.cask.cdap.etl.common.plugin.StageLoggingCaller.call(StageLoggingCaller.java:40)
       at co.cask.cdap.etl.common.plugin.WrappedTransform.transform(WrappedTransform.java:104)
       at co.cask.cdap.etl.common.TrackedTransform.transform(TrackedTransform.java:74)
       at co.cask.cdap.etl.batch.UnwrapPipeStage.consumeInput(UnwrapPipeStage.java:44)
       at co.cask.cdap.etl.batch.UnwrapPipeStage.consumeInput(UnwrapPipeStage.java:32)
       at co.cask.cdap.etl.batch.PipeStage.consume(PipeStage.java:44)
       at co.cask.cdap.etl.batch.PipeEmitter.emit(PipeEmitter.java:83)
       at co.cask.cdap.etl.common.TrackedEmitter.emit(TrackedEmitter.java:57)
       at co.cask.cdap.etl.common.plugin.UntimedEmitter.emit(UntimedEmitter.java:64)
       at co.cask.WholeFileSource.transform(WholeFileSource.java:95)
       at co.cask.cdap.etl.common.plugin.WrappedBatchSource$3.call(WrappedBatchSource.java:77)
       at co.cask.cdap.etl.common.plugin.WrappedBatchSource$3.call(WrappedBatchSource.java:74)
       at co.cask.cdap.etl.common.plugin.Caller$1.call(Caller.java:30)
       at co.cask.cdap.etl.common.plugin.StageLoggingCaller.call(StageLoggingCaller.java:40)
       at co.cask.cdap.etl.common.plugin.WrappedBatchSource.transform(WrappedBatchSource.java:74)
       at co.cask.cdap.etl.common.plugin.WrappedBatchSource.transform(WrappedBatchSource.java:36)
       at co.cask.cdap.etl.common.preview.LimitingTransform.transform(LimitingTransform.java:44)
       at co.cask.cdap.etl.common.TrackedTransform.transform(TrackedTransform.java:74)
       at co.cask.cdap.etl.batch.UnwrapPipeStage.consumeInput(UnwrapPipeStage.java:44)
       at co.cask.cdap.etl.batch.UnwrapPipeStage.consumeInput(UnwrapPipeStage.java:32)
       at co.cask.cdap.etl.batch.PipeStage.consume(PipeStage.java:44)
       at co.cask.cdap.etl.batch.PipeTransformExecutor.runOneIteration(PipeTransformExecutor.java:43)
       at co.cask.cdap.etl.batch.mapreduce.TransformRunner.transform(TransformRunner.java:142)
       at co.cask.cdap.etl.batch.mapreduce.ETLMapReduce$ETLMapper.map(ETLMapReduce.java:458)
       at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
       at co.cask.cdap.internal.app.runtime.batch.MapperWrapper.run(MapperWrapper.java:135)
       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
       at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:270)
       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
       at java.lang.Thread.run(Thread.java:748)
      . Please review your pipeline configuration and check the system logs for more details.

      This seems to happen because CDAP uses an old, embedded version of Python: the org.python.core frames in the stack trace point to Jython rather than the host interpreter. On my machine, both python2 and python3 run bytearray() without issues. Also, is there a way to configure CDAP to use the host Python rather than the embedded one?
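
      As a possible stopgap while the embedded interpreter is in use, the sketch below guards the call and falls back to ord() when bytearray is missing (it became a builtin in Python 2.6). The names interpreter_info, HAVE_BYTEARRAY, and to_byte_values are hypothetical helpers, not part of CDAP, and the fallback assumes the input is a Python 2 style byte string.

```python
import sys


def interpreter_info():
    """Describe the interpreter running this script (platform and version)."""
    # On Jython, sys.platform is expected to start with "java".
    return "%s %s" % (sys.platform, sys.version.split()[0])


# bytearray is a builtin from Python 2.6 on; older interpreters raise NameError.
try:
    bytearray
    HAVE_BYTEARRAY = True
except NameError:
    HAVE_BYTEARRAY = False


def to_byte_values(data):
    """Return the byte values of `data` as a list of ints in 0..255."""
    if HAVE_BYTEARRAY:
        return list(bytearray(data))
    # Assumption: without bytearray we are on an old Python 2 style
    # interpreter, where `data` is a str of raw bytes and ord() works per char.
    return [ord(ch) for ch in data]
```

      Raising an exception with interpreter_info() as its message from inside transform() should be one way to confirm which interpreter runs the script, since Python errors surface in the stage failure message as in the trace above.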

              People

              • Assignee: Poorna Chandra (poorna)
              • Reporter: Dimon (dimon777)