CDAP-7420

With impersonation enabled, YARN applications cannot be stopped after a certain duration

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 3.5.1, 3.5.0
    • Fix Version/s: 3.5.2
    • Component/s: App Fabric, Security
    • Labels:
    • Release Notes:
      Avoid the caching of YarnClient in order to fix a problem that occurred in namespaces with impersonation configured.

      Description

      With impersonation enabled, stopping or killing a program fails if the program was started more than X hours ago, where X is the Kerberos ticket lifetime.
      The reason is that the YarnClient used to launch the YARN application is the same one later used to stop or kill it. That client appears to remain tied to the Kerberos credentials obtained at launch time, so once the ticket expires, its calls fail with the error below.

      2016-10-08 00:06:14,411 - WARN  [ STOPPING:o.a.h.i.Client$Connection$1@680] - Exception encountered while connecting to the server : 
      javax.security.sasl.SaslException: GSS initiate failed
      	at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212) ~[na:1.7.0_75]
      	at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:413) ~[hadoop-common-2.7.1.2.3.4.7-4.jar:na]
      	at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:558) [hadoop-common-2.7.1.2.3.4.7-4.jar:na]
      	at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:373) [hadoop-common-2.7.1.2.3.4.7-4.jar:na]
      	at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:727) ~[hadoop-common-2.7.1.2.3.4.7-4.jar:na]
      	at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:723) ~[hadoop-common-2.7.1.2.3.4.7-4.jar:na]
      	at java.security.AccessController.doPrivileged(Native Method) ~[na:1.7.0_75]
      	at javax.security.auth.Subject.doAs(Subject.java:415) ~[na:1.7.0_75]
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) ~[hadoop-common-2.7.1.2.3.4.7-4.jar:na]
      	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:722) [hadoop-common-2.7.1.2.3.4.7-4.jar:na]
      	at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:373) [hadoop-common-2.7.1.2.3.4.7-4.jar:na]
      	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1493) [hadoop-common-2.7.1.2.3.4.7-4.jar:na]
      	at org.apache.hadoop.ipc.Client.call(Client.java:1397) [hadoop-common-2.7.1.2.3.4.7-4.jar:na]
      	at org.apache.hadoop.ipc.Client.call(Client.java:1358) [hadoop-common-2.7.1.2.3.4.7-4.jar:na]
      	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229) [hadoop-common-2.7.1.2.3.4.7-4.jar:na]
      	at com.sun.proxy.$Proxy48.getApplicationReport(Unknown Source) [na:na]
      	at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:191) [hadoop-yarn-common-2.7.1.2.3.4.7-4.jar:na]
      	at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source) ~[na:na]
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.7.0_75]
      	at java.lang.reflect.Method.invoke(Method.java:606) ~[na:1.7.0_75]
      	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252) [hadoop-common-2.7.1.2.3.4.7-4.jar:na]
      	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) [hadoop-common-2.7.1.2.3.4.7-4.jar:na]
      	at com.sun.proxy.$Proxy49.getApplicationReport(Unknown Source) [na:na]
      	at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:431) [hadoop-yarn-client-2.7.1.2.3.4.7-4.jar:na]
      	at org.apache.twill.internal.yarn.Hadoop21YarnAppClient$ProcessControllerImpl.getReport(Hadoop21YarnAppClient.java:179) [co.cask.cdap.cdap-common-3.5.2-SNAPSHOT.jar:0.7.0-incubating]
      	at org.apache.twill.internal.yarn.Hadoop21YarnAppClient$ProcessControllerImpl.getReport(Hadoop21YarnAppClient.java:167) [co.cask.cdap.cdap-common-3.5.2-SNAPSHOT.jar:0.7.0-incubating]
      	at org.apache.twill.yarn.YarnTwillController.doShutDown(YarnTwillController.java:182) [co.cask.cdap.cdap-app-fabric-3.5.2-SNAPSHOT.jar:na]
      	at org.apache.twill.internal.AbstractZKServiceController.shutDown(AbstractZKServiceController.java:98) [org.apache.twill.twill-core-0.7.0-incubating.jar:0.7.0-incubating]
      	at org.apache.twill.internal.AbstractExecutionServiceController$ServiceDelegate.shutDown(AbstractExecutionServiceController.java:180) [org.apache.twill.twill-core-0.7.0-incubating.jar:0.7.0-incubating]
      	at com.google.common.util.concurrent.AbstractIdleService$1$2.run(AbstractIdleService.java:57) [com.google.guava.guava-13.0.1.jar:na]
      	at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
      Caused by: org.ietf.jgss.GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
      	at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147) ~[na:1.7.0_75]
      	at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:121) ~[na:1.7.0_75]
      	at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187) ~[na:1.7.0_75]
      	at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:223) ~[na:1.7.0_75]
      	at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212) ~[na:1.7.0_75]
      	at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179) ~[na:1.7.0_75]
      	at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193) ~[na:1.7.0_75]
      	... 30 common frames omitted
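
The failure mode described above can be illustrated with a small, self-contained Java model. This is only a sketch under stated assumptions: it does not use the Hadoop, Twill, or CDAP APIs, and `Ticket`, `FakeYarnClient`, the application id, and the 10-hour lifetime are hypothetical stand-ins chosen for illustration. It contrasts reusing the launch-time client with creating a new client when the stop request arrives, which is the general idea behind not caching the client.

```java
import java.time.Duration;
import java.time.Instant;

/** Simplified model of the bug; none of these types are Hadoop or CDAP classes. */
public final class YarnClientCachingSketch {

    /** Hypothetical stand-in for a Kerberos ticket with a fixed lifetime. */
    static final class Ticket {
        private final Instant expiry;

        Ticket(Instant issuedAt, Duration lifetime) {
            this.expiry = issuedAt.plus(lifetime);
        }

        boolean isValidAt(Instant now) {
            return now.isBefore(expiry);
        }
    }

    /** Hypothetical stand-in for a client bound to the ticket it was created with. */
    static final class FakeYarnClient {
        private final Ticket ticket;

        FakeYarnClient(Ticket ticket) {
            this.ticket = ticket;
        }

        void killApplication(String appId, Instant now) {
            if (!ticket.isValidAt(now)) {
                throw new IllegalStateException("No valid credentials to stop " + appId);
            }
        }
    }

    /** Returns true if the stop call succeeds, false if the credentials were rejected. */
    private static boolean tryStop(FakeYarnClient client, Instant now) {
        try {
            client.killApplication("app-1", now);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    /** Cached pattern: the client created at launch time is reused for the stop call. */
    public static boolean stopWithCachedClient(Instant launch, Duration lifetime, Instant stop) {
        FakeYarnClient cached = new FakeYarnClient(new Ticket(launch, lifetime));
        return tryStop(cached, stop);
    }

    /** Uncached pattern: a new client, with a newly issued ticket, is created at stop time. */
    public static boolean stopWithFreshClient(Duration lifetime, Instant stop) {
        FakeYarnClient fresh = new FakeYarnClient(new Ticket(stop, lifetime));
        return tryStop(fresh, stop);
    }

    public static void main(String[] args) {
        Instant launch = Instant.parse("2016-10-07T00:00:00Z");
        Duration lifetime = Duration.ofHours(10); // assumed ticket lifetime, for illustration only
        Instant early = launch.plus(Duration.ofHours(1));
        Instant late = launch.plus(Duration.ofHours(24));

        System.out.println("cached, within lifetime: " + stopWithCachedClient(launch, lifetime, early));
        System.out.println("cached, after lifetime:  " + stopWithCachedClient(launch, lifetime, late));
        System.out.println("fresh, after lifetime:   " + stopWithFreshClient(lifetime, late));
    }
}
```

In this model only the cached client fails once the stop request arrives later than the ticket lifetime; a client created at stop time is unaffected by how long ago the application was launched.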
      

            People

            • Assignee: Ali Anwar (ali.anwar)
            • Reporter: Ali Anwar (ali.anwar)
            • Votes: 0
            • Watchers: 1
