CDAP / CDAP-4112

Programs using HBase directly have classloading issues


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.3.0
    • Component/s: App Fabric, Datasets
    • Labels:
      None
    • Release Notes:
      Fixed a bug that prevented applications from using HBase directly.

      Description

      This is the error scenario:

      Suppose we have a program that uses HBase because it wants to interact with it directly instead of using CDAP's Table interface. HBase is bundled in the application jar, which means hbase-default.xml is also bundled in the application jar. Suppose the application also uses a system dataset that uses a Table, such as a TimePartitionedFileSet.

      If we run a MapReduce program, we'll see an exception like:

      co.cask.cdap.api.data.DatasetInstantiationException: Could not instantiate dataset 'somestuff'
      	at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2258)
      	at com.google.common.cache.LocalCache.get(LocalCache.java:3990)
      	at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3994)
      	at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4878)
      	at com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4884)
      	at co.cask.cdap.internal.app.runtime.batch.MapReduceTaskContextProvider.get(MapReduceTaskContextProvider.java:109)
      	at co.cask.cdap.internal.app.runtime.batch.MapperWrapper.run(MapperWrapper.java:64)
      	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
      	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
      	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:415)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
      	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:606)
      	at co.cask.cdap.internal.app.runtime.batch.distributed.MapReduceContainerLauncher.launch(MapReduceContainerLauncher.java:91)
      	at org.apache.hadoop.mapred.YarnChild.main(Unknown Source)
      Caused by: co.cask.cdap.api.data.DatasetInstantiationException: Could not instantiate dataset 'somestuff'
      	at co.cask.cdap.data2.dataset2.SingleThreadDatasetCache.getDataset(SingleThreadDatasetCache.java:142)
      	at co.cask.cdap.data2.dataset2.DynamicDatasetCache.getDataset(DynamicDatasetCache.java:115)
      	at co.cask.cdap.data2.dataset2.DynamicDatasetCache.getDataset(DynamicDatasetCache.java:98)
      	at co.cask.cdap.data2.dataset2.SingleThreadDatasetCache.<init>(SingleThreadDatasetCache.java:120)
      	at co.cask.cdap.internal.app.runtime.AbstractContext.<init>(AbstractContext.java:109)
      	at co.cask.cdap.internal.app.runtime.batch.BasicMapReduceTaskContext.<init>(BasicMapReduceTaskContext.java:95)
      	at co.cask.cdap.internal.app.runtime.batch.MapReduceTaskContextProvider$1.load(MapReduceTaskContextProvider.java:198)
      	at co.cask.cdap.internal.app.runtime.batch.MapReduceTaskContextProvider$1.load(MapReduceTaskContextProvider.java:176)
      	at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3589)
      	at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2374)
      	at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2337)
      	at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2252)
      	... 19 more
      Caused by: com.google.common.util.concurrent.ExecutionError: java.lang.ExceptionInInitializerError
      	at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2256)
      	at com.google.common.cache.LocalCache.get(LocalCache.java:3990)
      	at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3994)
      	at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4878)
      	at co.cask.cdap.data2.dataset2.SingleThreadDatasetCache.getDataset(SingleThreadDatasetCache.java:136)
      	... 30 more
      Caused by: java.lang.ExceptionInInitializerError
      	at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:197)
      	at co.cask.cdap.data2.util.hbase.HBase98TableUtil.createHTable(HBase98TableUtil.java:57)
      	at co.cask.cdap.data2.dataset2.lib.table.hbase.HBaseTable.<init>(HBaseTable.java:99)
      	at co.cask.cdap.data2.dataset2.lib.table.hbase.HBaseTableDefinition.getDataset(HBaseTableDefinition.java:62)
      	at co.cask.cdap.data2.dataset2.lib.table.hbase.HBaseTableDefinition.getDataset(HBaseTableDefinition.java:36)
      	at co.cask.cdap.api.dataset.lib.IndexedTableDefinition.getDataset(IndexedTableDefinition.java:78)
      	at co.cask.cdap.api.dataset.lib.IndexedTableDefinition.getDataset(IndexedTableDefinition.java:36)
      	at co.cask.cdap.data2.dataset2.lib.partitioned.TimePartitionedFileSetDefinition.getDataset(TimePartitionedFileSetDefinition.java:74)
      	at co.cask.cdap.data2.dataset2.lib.partitioned.TimePartitionedFileSetDefinition.getDataset(TimePartitionedFileSetDefinition.java:44)
      	at co.cask.cdap.data2.datafabric.dataset.DatasetType.getDataset(DatasetType.java:54)
      	at co.cask.cdap.data2.datafabric.dataset.AbstractDatasetProvider.get(AbstractDatasetProvider.java:117)
      	at co.cask.cdap.data2.datafabric.dataset.RemoteDatasetFramework.getDataset(RemoteDatasetFramework.java:255)
      	at co.cask.cdap.data2.metadata.writer.LineageWriterDatasetFramework.getDataset(LineageWriterDatasetFramework.java:192)
      	at co.cask.cdap.data.dataset.SystemDatasetInstantiator.getDataset(SystemDatasetInstantiator.java:86)
      	at co.cask.cdap.data2.dataset2.SingleThreadDatasetCache$1.load(SingleThreadDatasetCache.java:87)
      	at co.cask.cdap.data2.dataset2.SingleThreadDatasetCache$1.load(SingleThreadDatasetCache.java:83)
      	at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3589)
      	at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2374)
      	at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2337)
      	at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2252)
      	... 34 more
      Caused by: java.lang.RuntimeException: hbase-default.xml file seems to be for and old version of HBase (0.98.0-hadoop2), this version is 0.98.0.2.1.15.0-946-hadoop2
      	at org.apache.hadoop.hbase.HBaseConfiguration.checkDefaultsVersion(HBaseConfiguration.java:70)
      	at org.apache.hadoop.hbase.HBaseConfiguration.addHbaseResources(HBaseConfiguration.java:102)
      	at org.apache.hadoop.hbase.HBaseConfiguration.create(HBaseConfiguration.java:113)
      	at org.apache.hadoop.hbase.client.ConnectionManager.<clinit>(ConnectionManager.java:211)
      	... 54 more
      

      HBaseConfiguration is complaining about a version mismatch: one hbase-default.xml comes from the application jar and the other from the cluster.

      The reason is a bit convoluted. In MapperWrapper, the context classloader is one that can see all plugin and program classes. MapperWrapper creates a MapReduceContext that instantiates all datasets, including the TimePartitionedFileSet (which uses a Table internally). Eventually, the dataset framework instantiates the Table that the TPFS uses, which creates an HTable; HTable in turn uses ConnectionManager (an HBase class), which has this static initializer:

        static {
          // We set instances to one more than the value specified for {@link
          // HConstants#ZOOKEEPER_MAX_CLIENT_CNXNS}. By default, the zk default max
          // connections to the ensemble from the one client is 30, so in that case we
          // should run into zk issues before the LRU hit this value of 31.
          MAX_CACHED_CONNECTION_INSTANCES = HBaseConfiguration.create().getInt(
            HConstants.ZOOKEEPER_MAX_CLIENT_CNXNS, HConstants.DEFAULT_ZOOKEPER_MAX_CLIENT_CNXNS) + 1;
          CONNECTION_INSTANCES = new LinkedHashMap<HConnectionKey, HConnectionImplementation>(
              (int) (MAX_CACHED_CONNECTION_INSTANCES / 0.75F) + 1, 0.75F, true) {
            @Override
            protected boolean removeEldestEntry(
                Map.Entry<HConnectionKey, HConnectionImplementation> eldest) {
               return size() > MAX_CACHED_CONNECTION_INSTANCES;
             }
          };
        }
      

      This static block creates a Hadoop Configuration, which resolves classpath resources through the thread context classloader. So when HBaseConfiguration calls:

      
        public static Configuration addHbaseResources(Configuration conf) {
          conf.addResource("hbase-default.xml");
          conf.addResource("hbase-site.xml");
      
          checkDefaultsVersion(conf);
          checkForClusterFreeMemoryLimit(conf);
          return conf;
        }
      

      The "hbase-default.xml" it adds is whichever copy the context classloader sees first: the one bundled in the application jar, not the one from the cluster.
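The shadowing can be reproduced outside CDAP. In this hypothetical standalone sketch (temp directories stand in for the application jar and the cluster classpath; none of this is CDAP code), a name-based resource lookup through a classloader returns the first match on its classpath, which is exactly what Configuration.addResource(String) relies on:

```java
import java.io.InputStream;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

public class ShadowingDemo {

    // Returns the content of whichever "hbase-default.xml" a name-based
    // classpath lookup finds first.
    static String resolve() throws Exception {
        Path appDir = Files.createTempDirectory("appjar");      // stands in for the application jar
        Path clusterDir = Files.createTempDirectory("cluster"); // stands in for the cluster classpath
        Files.write(appDir.resolve("hbase-default.xml"), "app-copy".getBytes());
        Files.write(clusterDir.resolve("hbase-default.xml"), "cluster-copy".getBytes());

        // Program classpath entries come first, as in the MapReduce task.
        try (URLClassLoader cl = new URLClassLoader(
                new URL[] { appDir.toUri().toURL(), clusterDir.toUri().toURL() }, null)) {
            // Configuration.addResource("hbase-default.xml") performs essentially
            // this lookup through the thread context classloader.
            try (InputStream in = cl.getResourceAsStream("hbase-default.xml")) {
                return new String(in.readAllBytes());
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // The application-jar copy shadows the cluster's.
        System.out.println(resolve());
    }
}
```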

      A possible solution would be for the dataset framework to check whether a dataset is a system dataset type. If so, it should switch the context classloader to the system classloader while instantiating the dataset, then switch back afterwards.
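That fix could be sketched as a small helper (the class and method names are illustrative, not actual CDAP APIs): run the instantiation with the context classloader temporarily set to the loader of a system class, so HBase's static initializers resolve hbase-default.xml from the cluster classpath, and restore the program classloader afterwards.

```java
import java.util.concurrent.Callable;

public class SystemDatasetInstantiation {

    // Illustrative helper, not CDAP code: runs the given instantiation logic
    // with the context classloader swapped to the loader of this (system)
    // class, so that HBase's static initializers see the cluster's
    // hbase-default.xml instead of the one in the application jar.
    public static <T> T withSystemClassLoader(Callable<T> instantiator) throws Exception {
        Thread current = Thread.currentThread();
        ClassLoader programLoader = current.getContextClassLoader();
        current.setContextClassLoader(SystemDatasetInstantiation.class.getClassLoader());
        try {
            return instantiator.call();
        } finally {
            // Restore the program classloader even if instantiation fails.
            current.setContextClassLoader(programLoader);
        }
    }
}
```

The try/finally restore matters: without it, a failed dataset instantiation would leave the task thread running with the wrong classloader for the rest of the program.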

              People

              • Assignee: Terence Yim (terence)
              • Reporter: Albert Shau (ashau)
              • Votes: 0
              • Watchers: 2
