CDAP / CDAP-11201

Hive batch source cannot be used with non-native storage handlers

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Pipeline Plugins, Pipelines
    • Labels: None

      Description

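      Not sure if we'll ever fix this, but the problem is that Hive tables backed by CDAP datasets use DatasetStorageHandler, which is an internal CDAP storage handler. That class is not on the classpath of the pipeline program, so the Hive batch source fails in prepareRun with a ClassNotFoundException: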

      2016-05-18 02:58:16,550 - ERROR [pcontroller-program:default.ashautest2.workflow.DataPipelineWorkflow-448770c4-1ca4-11e6-a56f-42010a800002:c.c.c.i.a.r.d.AbstractProgramTwillRunnable@337] - Program runner error out.
      java.lang.RuntimeException: java.lang.ClassNotFoundException: co.cask.cdap.hive.datasets.DatasetStorageHandler
      	at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[com.google.guava.guava-13.0.1.jar:na]
      	at co.cask.cdap.internal.app.runtime.workflow.WorkflowDriver.executeAll(WorkflowDriver.java:540) ~[co.cask.cdap.cdap-app-fabric-3.4.1.jar:na]
      	at co.cask.cdap.internal.app.runtime.workflow.WorkflowDriver.run(WorkflowDriver.java:521) ~[co.cask.cdap.cdap-app-fabric-3.4.1.jar:na]
      	at com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:52) ~[com.google.guava.guava-13.0.1.jar:na]
      	at java.lang.Thread.run(Thread.java:745) [na:1.7.0_67]
      Caused by: java.lang.ClassNotFoundException: co.cask.cdap.hive.datasets.DatasetStorageHandler
      	at java.net.URLClassLoader$1.run(URLClassLoader.java:366) ~[na:1.7.0_67]
      	at java.net.URLClassLoader$1.run(URLClassLoader.java:355) ~[na:1.7.0_67]
      	at java.security.AccessController.doPrivileged(Native Method) ~[na:1.7.0_67]
      	at java.net.URLClassLoader.findClass(URLClassLoader.java:354) ~[na:1.7.0_67]
      	at java.lang.ClassLoader.loadClass(ClassLoader.java:425) ~[na:1.7.0_67]
      	at java.lang.ClassLoader.loadClass(ClassLoader.java:358) ~[na:1.7.0_67]
      	at java.lang.Class.forName0(Native Method) ~[na:1.7.0_67]
      	at java.lang.Class.forName(Class.java:270) ~[na:1.7.0_67]
      	at org.apache.hive.hcatalog.common.HCatUtil.getStorageHandler(HCatUtil.java:415) ~[na:na]
      	at org.apache.hive.hcatalog.common.HCatUtil.getStorageHandler(HCatUtil.java:367) ~[na:na]
      	at org.apache.hive.hcatalog.mapreduce.InitializeInput.extractPartInfo(InitializeInput.java:158) ~[na:na]
      	at org.apache.hive.hcatalog.mapreduce.InitializeInput.getInputJobInfo(InitializeInput.java:137) ~[na:na]
      	at org.apache.hive.hcatalog.mapreduce.InitializeInput.setInput(InitializeInput.java:86) ~[na:na]
      	at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:95) ~[na:na]
      	at co.cask.hydrator.plugin.batch.source.HiveBatchSource.prepareRun(HiveBatchSource.java:98) ~[na:na]
      	at co.cask.hydrator.plugin.batch.source.HiveBatchSource.prepareRun(HiveBatchSource.java:53) ~[na:na]
      	at co.cask.cdap.etl.batch.LoggedBatchConfigurable$1.call(LoggedBatchConfigurable.java:44) ~[na:na]
      	at co.cask.cdap.etl.batch.LoggedBatchConfigurable$1.call(LoggedBatchConfigurable.java:41) ~[na:na]
      	at co.cask.cdap.etl.log.LogContext.run(LogContext.java:59) ~[na:na]
      	at co.cask.cdap.etl.batch.LoggedBatchConfigurable.prepareRun(LoggedBatchConfigurable.java:41) ~[na:na]
      	at co.cask.cdap.etl.batch.mapreduce.ETLMapReduce.beforeSubmit(ETLMapReduce.java:170) ~[na:na]
      	at co.cask.cdap.internal.app.runtime.batch.MapReduceRuntimeService$2.call(MapReduceRuntimeService.java:471) ~[co.cask.cdap.cdap-app-fabric-3.4.1.jar:na]
      	at co.cask.cdap.internal.app.runtime.batch.MapReduceRuntimeService$2.call(MapReduceRuntimeService.java:466) ~[co.cask.cdap.cdap-app-fabric-3.4.1.jar:na]
      	at co.cask.cdap.data2.transaction.Transactions.execute(Transactions.java:174) ~[co.cask.cdap.cdap-data-fabric-3.4.1.jar:na]
      	at co.cask.cdap.internal.app.runtime.batch.MapReduceRuntimeService.beforeSubmit(MapReduceRuntimeService.java:466) ~[co.cask.cdap.cdap-app-fabric-3.4.1.jar:na]
      	at co.cask.cdap.internal.app.runtime.batch.MapReduceRuntimeService.startUp(MapReduceRuntimeService.java:204) ~[co.cask.cdap.cdap-app-fabric-3.4.1.jar:na]
      	at com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:47) ~[com.google.guava.guava-13.0.1.jar:na]
      	at co.cask.cdap.internal.app.runtime.batch.MapReduceRuntimeService$1$1.run(MapReduceRuntimeService.java:386) ~[co.cask.cdap.cdap-app-fabric-3.4.1.jar:na]
      	... 1 common frames omitted
      

      The same error would happen if somebody has a Hive table with a custom storage handler whose class is not available to the pipeline.
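
The failing frames in the trace come down to a Class.forName call on the storage handler's class name. A minimal sketch of a check that mirrors that lookup is shown here. StorageHandlerCheck and isLoadable are hypothetical names used only for illustration; they are not part of CDAP, Hive, or the Hive batch source plugin.

```java
/** Hypothetical helper for illustration; not part of CDAP or Hive. */
final class StorageHandlerCheck {

  private StorageHandlerCheck() { }

  /**
   * Returns true if the named class can be resolved by the given class
   * loader. This mirrors the Class.forName lookup seen in the stack trace,
   * but does not initialize the class.
   */
  static boolean isLoadable(String className, ClassLoader loader) {
    try {
      Class.forName(className, false, loader);
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      // The class, or something it depends on, is missing from this loader.
      return false;
    }
  }

  public static void main(String[] args) {
    ClassLoader loader = StorageHandlerCheck.class.getClassLoader();
    // Class name taken from the ClassNotFoundException above.
    String handler = "co.cask.cdap.hive.datasets.DatasetStorageHandler";
    System.out.println(handler + " loadable: " + isLoadable(handler, loader));
  }
}
```

A check like this only reports whether the handler class is visible to a given class loader. It does not make the handler usable, and it says nothing about the transaction problem raised in the comments.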

        Activity

        Poorna Chandra added a comment -

        The main issue here is being able to start and finish transactions in a Hive query in order to access Datasets. Last time we checked, there was no reliable way to do this in Hive. Hive has pre and post hooks for a query, but the hooks are not executed for simple queries like "select * from table".
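
The requirement described in this comment is a bracket around the query: a transaction is started before the query reads the dataset and is finished afterwards. A rough sketch of that bracket, written as a plain-Java model, follows. Tx, RecordingTx, and runInTx are hypothetical names; none of them are CDAP or Hive APIs, and the sketch does not show how either system implements transactions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical model for illustration; none of these types are CDAP or Hive APIs. */
final class TxBracketSketch {

  private TxBracketSketch() { }

  /** Stand-in for whatever starts and finishes a dataset transaction. */
  interface Tx {
    void start();
    void commit();
    void abort();
  }

  /** Runs the query with the transaction bracketed around it. */
  static <T> T runInTx(Tx tx, Supplier<T> query) {
    tx.start();
    try {
      T result = query.get();
      tx.commit();
      return result;
    } catch (RuntimeException e) {
      // Finish the transaction on failure too, then rethrow.
      tx.abort();
      throw e;
    }
  }

  /** Records calls so that their order can be inspected. */
  static final class RecordingTx implements Tx {
    final List<String> events = new ArrayList<>();
    @Override public void start() { events.add("start"); }
    @Override public void commit() { events.add("commit"); }
    @Override public void abort() { events.add("abort"); }
  }
}
```

If the start and finish steps are attached to query hooks, and the hooks do not run for a given query, then neither side of this bracket executes for that query.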


          People

          • Assignee:
            Nitin Motgi
            Reporter:
            Albert Shau
          • Votes:
            0
            Watchers:
            2
