Uploaded image for project: 'CDAP'
  1. CDAP
  2. CDAP-11972

Delegation token error while writing to stream

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.2.0
    • Fix Version/s: None
    • Component/s: Security, Streams
    • Labels:
      None
    • Rank:
      1|i0046v:

      Description

      To reproduce this issue, have a Kerberos-enabled CDAP cluster. Delegation token expiry configuration parameters are using hadoop defaults (~24 hours).
      Write to a stream every ~15 minutes. After ~24 hours, one of the stream writes will fail. The stream write after this will succeed.

      This is the error seen in master logs when the failing stream write occurs:

      2017-06-22 21:00:36,630 - ERROR [streams-worker-thread-9:o.a.h.i.Client@1427] - Failed to write upload content to stream stream:lrtns.intStream
      org.apache.hadoop.ipc.RemoteException: token (HDFS_DELEGATION_TOKEN token 20402 for cdap) is expired
              at org.apache.hadoop.ipc.Client.call(Client.java:1427)
              at org.apache.hadoop.ipc.Client.call(Client.java:1358)
              at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
              at com.sun.proxy.$Proxy9.complete(Unknown Source)
              at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.complete(ClientNamenodeProtocolTranslatorPB.java:462)
              at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:606)
              at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
              at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
              at com.sun.proxy.$Proxy10.complete(Unknown Source)
              at org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:2358)
              at org.apache.hadoop.hdfs.DFSOutputStream.closeImpl(DFSOutputStream.java:2340)
              at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:2304)
              at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
              at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
              at co.cask.cdap.data.stream.StreamDataFileWriter.close(StreamDataFileWriter.java:181)
              at co.cask.cdap.data.file.PartitionedFileWriter.closePartitionWriter(PartitionedFileWriter.java:174)
              at co.cask.cdap.data.stream.TimePartitionedStreamFileWriter.partitionChanged(TimePartitionedStreamFileWriter.java:88)
              at co.cask.cdap.data.stream.TimePartitionedStreamFileWriter.partitionChanged(TimePartitionedStreamFileWriter.java:60)
              at co.cask.cdap.data.file.PartitionedFileWriter.getWriter(PartitionedFileWriter.java:146)
              at co.cask.cdap.data.file.PartitionedFileWriter.appendAll(PartitionedFileWriter.java:92)
              at co.cask.cdap.data.stream.service.ConcurrentStreamWriter$EventQueue$TransformingFileWriter.appendAll(ConcurrentStreamWriter.java:685)
              at co.cask.cdap.data.stream.service.ConcurrentStreamWriter$BatchWriteRequest.write(ConcurrentStreamWriter.java:796)
              at co.cask.cdap.data.stream.service.ConcurrentStreamWriter$EventQueue.tryWrite(ConcurrentStreamWriter.java:557)
              at co.cask.cdap.data.stream.service.ConcurrentStreamWriter.persistUntilCompleted(ConcurrentStreamWriter.java:308)
              at co.cask.cdap.data.stream.service.ConcurrentStreamWriter.enqueue(ConcurrentStreamWriter.java:191)
              at co.cask.cdap.data.stream.service.upload.BufferedContentWriter.close(BufferedContentWriter.java:72)
              at co.cask.cdap.data.stream.service.upload.LengthBasedContentWriter.close(LengthBasedContentWriter.java:101)
              at co.cask.cdap.data.stream.service.upload.TextStreamBodyConsumer.finished(TextStreamBodyConsumer.java:94)
              at co.cask.http.HttpMethodInfo.bodyConsumerFinish(HttpMethodInfo.java:149)
              at co.cask.http.HttpMethodInfo.chunk(HttpMethodInfo.java:108)
              at co.cask.http.HttpDispatcher.messageReceived(HttpDispatcher.java:40)
              at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
              at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
              at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
              at org.jboss.netty.channel.SimpleChannelUpstreamHandler.messageReceived(SimpleChannelUpstreamHandler.java:124)
              at co.cask.cdap.common.http.AuthenticationChannelHandler.messageReceived(AuthenticationChannelHandler.java:64)
              at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
              at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
              at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
              at org.jboss.netty.channel.SimpleChannelUpstreamHandler.messageReceived(SimpleChannelUpstreamHandler.java:124)
              at co.cask.http.RequestRouter.messageReceived(RequestRouter.java:79)
              at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
              at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
              at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
              at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
              at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
              at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
              at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
              at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
              at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
              at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
              at org.jboss.netty.handler.codec.http.HttpContentEncoder.messageReceived(HttpContentEncoder.java:69)
              at org.jboss.netty.channel.SimpleChannelHandler.handleUpstream(SimpleChannelHandler.java:88)
              at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
              at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
              at org.jboss.netty.channel.SimpleChannelUpstreamHandler.messageReceived(SimpleChannelUpstreamHandler.java:124)
              at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
              at co.cask.http.NettyHttpService$2.handleUpstream(NettyHttpService.java:205)
              at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
              at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
              at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
              at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
              at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
              at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
              at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
              at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
              at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
              at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
              at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
              at java.lang.Thread.run(Thread.java:745)
      

      This can be seen in master logs every second (sometimes every other second):

      2017-06-22 21:00:36,943 - WARN  [LeaseRenewer:cdap@10.250.0.6:o.a.h.s.SaslRpcClient@375] - Exception encountered while connecting to the server : 
      org.apache.hadoop.ipc.RemoteException: token (HDFS_DELEGATION_TOKEN token 20402 for cdap) is expired
              at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:375)
              at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:558)
              at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:373)
              at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:727)
              at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:723)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
              at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:722)
              at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:373)
              at org.apache.hadoop.ipc.Client.getConnection(Client.java:1493)
              at org.apache.hadoop.ipc.Client.call(Client.java:1397)
              at org.apache.hadoop.ipc.Client.call(Client.java:1358)
              at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
              at com.sun.proxy.$Proxy9.renewLease(Unknown Source)
              at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.renewLease(ClientNamenodeProtocolTranslatorPB.java:590)
              at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:606)
              at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
              at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
              at com.sun.proxy.$Proxy10.renewLease(Unknown Source)
              at org.apache.hadoop.hdfs.DFSClient.renewLease(DFSClient.java:892)
              at org.apache.hadoop.hdfs.LeaseRenewer.renew(LeaseRenewer.java:417)
              at org.apache.hadoop.hdfs.LeaseRenewer.run(LeaseRenewer.java:442)
              at org.apache.hadoop.hdfs.LeaseRenewer.access$700(LeaseRenewer.java:71)
              at org.apache.hadoop.hdfs.LeaseRenewer$1.run(LeaseRenewer.java:298)
              at java.lang.Thread.run(Thread.java:745)
      2017-06-22 21:00:36,946 - WARN  [LeaseRenewer:cdap@10.250.0.6:o.a.h.i.Client@1427] - Failed to renew lease for [DFSClient_NONMAPREDUCE_552206182_52] for 372 seconds.  Will retry shortly ...
      org.apache.hadoop.ipc.RemoteException: token (HDFS_DELEGATION_TOKEN token 20402 for cdap) is expired
              at org.apache.hadoop.ipc.Client.call(Client.java:1427)
              at org.apache.hadoop.ipc.Client.call(Client.java:1358)
              at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
              at com.sun.proxy.$Proxy9.renewLease(Unknown Source)
              at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.renewLease(ClientNamenodeProtocolTranslatorPB.java:590)
              at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:606)
              at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
              at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
              at com.sun.proxy.$Proxy10.renewLease(Unknown Source)
              at org.apache.hadoop.hdfs.DFSClient.renewLease(DFSClient.java:892)
              at org.apache.hadoop.hdfs.LeaseRenewer.renew(LeaseRenewer.java:417)
              at org.apache.hadoop.hdfs.LeaseRenewer.run(LeaseRenewer.java:442)
              at org.apache.hadoop.hdfs.LeaseRenewer.access$700(LeaseRenewer.java:71)
              at org.apache.hadoop.hdfs.LeaseRenewer$1.run(LeaseRenewer.java:298)
              at java.lang.Thread.run(Thread.java:745)
      

      The first occurrence was at "2017-06-22 20:54:54,309", and after one hour, the file closes on its own (as per the logs below). In this scenario, I did not perform any stream write after the failed stream write.

      2017-06-22 21:54:24,540 - WARN  [LeaseRenewer:cdap@10.250.0.6:o.a.h.i.Client@1427] - Failed to renew lease for DFSClient_NONMAPREDUCE_552206182_52 for 3600 seconds (>= hard-limit =3600 seconds.) Closing all files being written ...
      org.apache.hadoop.ipc.RemoteException: token (HDFS_DELEGATION_TOKEN token 20402 for cdap) can't be found in cache
              at org.apache.hadoop.ipc.Client.call(Client.java:1427)
              at org.apache.hadoop.ipc.Client.call(Client.java:1358)
              at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
              at com.sun.proxy.$Proxy9.renewLease(Unknown Source)
              at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.renewLease(ClientNamenodeProtocolTranslatorPB.java:590)
              at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:606)
              at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
              at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
              at com.sun.proxy.$Proxy10.renewLease(Unknown Source)
              at org.apache.hadoop.hdfs.DFSClient.renewLease(DFSClient.java:892)
              at org.apache.hadoop.hdfs.LeaseRenewer.renew(LeaseRenewer.java:417)
              at org.apache.hadoop.hdfs.LeaseRenewer.run(LeaseRenewer.java:442)
              at org.apache.hadoop.hdfs.LeaseRenewer.access$700(LeaseRenewer.java:71)
              at org.apache.hadoop.hdfs.LeaseRenewer$1.run(LeaseRenewer.java:298)
              at java.lang.Thread.run(Thread.java:745)
      

        Attachments

          Activity

            People

            • Assignee:
              bhooshan Bhooshan Mogal
              Reporter:
              ali.anwar Ali Anwar
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated: