CDAP-3660

System Services get HDFS auth errors

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 3.1.0, 3.0.0
    • Fix Version/s: 3.0.6, 3.2.0
    • Component/s: CDAP Services
    • Labels:
      None
    • Release Notes:
      Fixed an issue where the Hadoop filesystem object was instantiated before the Kerberos keytab login was done. This caused CDAP processes to fail after the initial ticket expired.

      Description

      The Log service sees errors like:

      15/09/13 05:03:09 WARN security.UserGroupInformation: PriviledgedActionException as:cdap (auth:KERBEROS) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 7 for cdap) is expired
      15/09/13 05:03:09 WARN ipc.Client: Exception encountered while connecting to the server : org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 7 for cdap) is expired
      15/09/13 05:03:09 WARN security.UserGroupInformation: PriviledgedActionException as:cdap (auth:KERBEROS) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 7 for cdap) is expired
      Exception in thread "HDFSTransactionStateStorage STARTING" java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 7 for cdap) is expired
              at com.google.common.base.Throwables.propagate(Throwables.java:160)
              at com.google.common.util.concurrent.AbstractIdleService$1$1.run(AbstractIdleService.java:47)
              at java.lang.Thread.run(Thread.java:745)
      Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 7 for cdap) is expired
              at org.apache.hadoop.ipc.Client.call(Client.java:1411)
              at org.apache.hadoop.ipc.Client.call(Client.java:1364)
              at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
              at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
              at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:744)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:606)
              at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
              at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
              at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
              at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1925)
              at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1079)
              at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1075)
              at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
              at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1075)
              at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1400)
              at co.cask.tephra.persist.HDFSTransactionStateStorage.startUp(HDFSTransactionStateStorage.java:108)
              at com.google.common.util.concurrent.AbstractIdleService$1$1.run(AbstractIdleService.java:43)
              ... 1 more
      

      Every attempt to restart that container hits the same error. There were also roughly 10,000 restart attempts before the application master died.
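
      The expired-token signature in the log above is regular enough to detect mechanically. The following is a minimal sketch, not part of CDAP or Hadoop, that scans log lines for it. The class name `ExpiredTokenScanner` and the method `findExpiredTokens` are hypothetical names chosen for illustration, and the pattern assumes each log record is on a single, unwrapped line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not a CDAP or Hadoop class.
final class ExpiredTokenScanner {
    // Matches text such as:
    //   token (HDFS_DELEGATION_TOKEN token 7 for cdap) is expired
    // Group 1 is the token sequence number, group 2 is the user.
    private static final Pattern EXPIRED = Pattern.compile(
        "token \\(HDFS_DELEGATION_TOKEN token (\\d+) for ([^)\\s]+)\\)\\s+is expired");

    private ExpiredTokenScanner() {}

    /** Returns one "user:tokenId" entry per matching line, in input order. */
    static List<String> findExpiredTokens(List<String> logLines) {
        List<String> hits = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = EXPIRED.matcher(line);
            if (m.find()) {
                hits.add(m.group(2) + ":" + m.group(1));
            }
        }
        return hits;
    }
}
```

      Calling `ExpiredTokenScanner.findExpiredTokens(lines)` on the service log could be used to confirm that repeated container restarts are all failing on the same delegation token rather than on unrelated errors.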

              People

              • Assignee:
                Poorna Chandra (poorna)
              • Reporter:
                Albert Shau (ashau)