CDAP / CDAP-9981

HBase sink fails when used as one of multiple sinks


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.3.0
    • Fix Version/s: 3.3.1
    • Component/s: Pipelines
    • Labels:
      None
    • Release Notes:
      Fixed HBase Sink to work when used as one of multiple outputs.

      Description

      If you use the HBase sink as one of multiple sinks in a Hydrator pipeline, the pipeline fails with the following exception:

      WARNING: Exception running child : java.lang.NullPointerException
              at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:126)
              at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:87)
              at co.cask.cdap.internal.app.runtime.batch.dataset.MultipleOutputs$MeteredRecordWriter.write(MultipleOutputs.java:272)
              at co.cask.cdap.internal.app.runtime.batch.dataset.MultipleOutputs.write(MultipleOutputs.java:176)
              at co.cask.cdap.internal.app.runtime.batch.BasicMapReduceTaskContext.write(BasicMapReduceTaskContext.java:150)
              at co.cask.cdap.internal.app.runtime.batch.MapReduceLifecycleContext.write(MapReduceLifecycleContext.java:62)
              at co.cask.cdap.etl.batch.mapreduce.ETLMapReduce$MultiOutputSink.write(ETLMapReduce.java:508)
              at co.cask.cdap.etl.batch.mapreduce.ETLMapReduce$ETLMapper.map(ETLMapReduce.java:410)
              at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
              at co.cask.cdap.internal.app.runtime.batch.MapperWrapper.run(MapperWrapper.java:102)
              at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
              at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
              at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
              at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:606)
              at co.cask.cdap.internal.app.runtime.batch.distributed.MapReduceContainerLauncher.launch(MapReduceContainerLauncher.java:91)
              at org.apache.hadoop.mapred.YarnChild.main(Unknown Source)
      

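      Looking at the code, this appears to happen because TableOutputFormat initializes a private variable only when setConf() is called on it. The NullPointerException in TableRecordWriter.write() suggests that the variable was never set, which would mean that MultipleOutputs does not call setConf() on its delegate output formats.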
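The suspected failure mode can be sketched in isolation. The example below is a minimal illustration, not the CDAP or HBase code: `DelegateConfSketch`, `TableLikeOutputFormat`, the `output.table` key, and the `configure` flag are all hypothetical names, and the nested `Configurable` interface is a simplified stand-in for Hadoop's `org.apache.hadoop.conf.Configurable`. It assumes the delegate is created by reflection and shows that skipping `setConf()` leaves the private field null, so the first write fails.

```java
import java.util.HashMap;
import java.util.Map;

public class DelegateConfSketch {

    /** Simplified stand-in for Hadoop's Configurable interface (hypothetical). */
    public interface Configurable {
        void setConf(Map<String, String> conf);
    }

    /** Hypothetical output format that, like the one in the report, sets a private field in setConf(). */
    public static class TableLikeOutputFormat implements Configurable {
        private String tableName; // stays null until setConf() runs

        @Override
        public void setConf(Map<String, String> conf) {
            this.tableName = conf.get("output.table");
        }

        public String write(String row) {
            // concat() dereferences tableName, so this throws
            // NullPointerException when setConf() was never called.
            return tableName.concat(":" + row);
        }
    }

    /**
     * Instantiates a delegate by reflection. When configure is true, the
     * configuration is passed to any instance that implements Configurable.
     */
    public static <T> T newInstance(Class<T> cls, Map<String, String> conf, boolean configure) {
        try {
            T instance = cls.getDeclaredConstructor().newInstance();
            if (configure && instance instanceof Configurable) {
                ((Configurable) instance).setConf(conf);
            }
            return instance;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not instantiate " + cls.getName(), e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("output.table", "users");

        // Delegate that received its configuration: the write succeeds.
        TableLikeOutputFormat configured = newInstance(TableLikeOutputFormat.class, conf, true);
        System.out.println(configured.write("row1"));

        // Delegate created without setConf(): the write fails as in the report.
        TableLikeOutputFormat unconfigured = newInstance(TableLikeOutputFormat.class, conf, false);
        try {
            unconfigured.write("row1");
        } catch (NullPointerException e) {
            System.out.println("NullPointerException: setConf() was never called");
        }
    }
}
```

In Hadoop itself, `ReflectionUtils.newInstance(Class, Configuration)` performs the same `Configurable` check after instantiation, so creating delegates through it (or calling `setConf()` explicitly) is one plausible shape for a fix. Whether the actual fix in 3.3.1 took this form is not stated in this issue.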

            People

            • Assignee:
              ali.anwar Ali Anwar
            • Reporter:
              ashau Albert Shau
