  1. CDAP
  2. CDAP-10202

GroupByAggregate plugin fails to run in Spark pipelines

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.5.1
    • Component/s: Pipelines
    • Labels:
      None
    • Release Notes:
      Fixed a bug that caused the Database Source, Joiner, GroupByAggregate, and Deduplicate plugins to fail on certain versions of Spark.
    • Rank:
      1|hzzl9j:

      Description

      This is reproducible on hdp-2.3.4.7-4. Running a Spark pipeline that includes the GroupByAggregate plugin fails with:

      User class threw exception: org.apache.tephra.TransactionFailureException: Exception raised in transactional execution. Cause: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 4, <hostname>): java.lang.NoClassDefFoundError: javax/ws/rs/BadRequestException
      at java.lang.Class.getDeclaredFields0(Native Method)
      at java.lang.Class.privateGetDeclaredFields(Class.java:2499)
      at java.lang.Class.getDeclaredField(Class.java:1951)
      at co.cask.cdap.internal.lang.Fields.findField(Fields.java:39)
      at co.cask.cdap.internal.app.runtime.plugin.PluginInstantiator.newInstance(PluginInstantiator.java:206)
      at co.cask.cdap.internal.app.runtime.DefaultPluginContext.newPluginInstance(DefaultPluginContext.java:95)
      at co.cask.cdap.internal.app.runtime.AbstractContext.newPluginInstance(AbstractContext.java:322)
      at co.cask.cdap.app.runtime.spark.SparkPluginContext.newPluginInstance(SparkPluginContext.java:68)
      at co.cask.cdap.etl.spark.function.PluginFunctionContext.createPlugin(PluginFunctionContext.java:86)
      at co.cask.cdap.etl.spark.function.AggregatorGroupByFunction.call(AggregatorGroupByFunction.java:43)
      at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$3$1.apply(JavaRDDLike.scala:146)
      at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$3$1.apply(JavaRDDLike.scala:146)
      at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
      at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.insertAll(BypassMergeSortShuffleWriter.java:99)
      at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:73)
      at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
      at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
      at org.apache.spark.scheduler.Task.run(Task.scala:88)
      at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      at java.lang.Thread.run(Thread.java:745)
      Caused by: java.lang.ClassNotFoundException: javax.ws.rs.BadRequestException
      at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
      at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
      at java.security.AccessController.doPrivileged(Native Method)
      at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
      at co.cask.cdap.common.lang.InterceptableClassLoader.findClass(InterceptableClassLoader.java:36)
      at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
      at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
      ... 22 more
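
The trace shows the executor failing to resolve javax.ws.rs.BadRequestException, a class introduced in JAX-RS 2.0, while the plugin's fields are being inspected. The ticket does not state the root cause, so the following is only a diagnostic sketch: a hypothetical helper (ClassProbe is not part of CDAP or Spark) that reports whether a given class loader can resolve a class and, if so, which JAR or directory it came from. Running something like it on an affected node might help show whether an older JAX-RS API JAR is shadowing the expected one, but that is an assumption rather than a confirmed explanation.

```java
import java.net.URL;
import java.security.CodeSource;

/** Hypothetical diagnostic helper; not part of CDAP or Spark. */
public final class ClassProbe {

  private ClassProbe() { }

  /** Returns true if the loader can resolve the class (without initializing it). */
  public static boolean isLoadable(String className, ClassLoader loader) {
    try {
      Class.forName(className, false, loader);
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      // LinkageError covers NoClassDefFoundError, as seen in the trace above.
      return false;
    }
  }

  /** Returns the location the class was loaded from, or a marker if unavailable. */
  public static String locate(String className, ClassLoader loader) {
    try {
      Class<?> cls = Class.forName(className, false, loader);
      CodeSource source = cls.getProtectionDomain().getCodeSource();
      URL location = (source == null) ? null : source.getLocation();
      return (location == null) ? "<bootstrap or unknown>" : location.toString();
    } catch (ClassNotFoundException | LinkageError e) {
      return "<missing: " + e + ">";
    }
  }

  public static void main(String[] args) {
    ClassLoader loader = Thread.currentThread().getContextClassLoader();
    String[] names = {
      "javax.ws.rs.BadRequestException",     // the class missing in the trace
      "javax.ws.rs.WebApplicationException"  // present in older JAX-RS API versions too
    };
    for (String name : names) {
      System.out.println(name + " -> " + locate(name, loader));
    }
  }
}
```

If the second class resolves but the first does not, the class path visible to that loader probably carries a pre-2.0 JAX-RS API; comparing the printed locations between a working and a failing cluster would be one way to check.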
      

            People

            • Assignee:
              ashau Albert Shau
            • Reporter:
              ashau Albert Shau