  CDAP / CDAP-13321

After Hive Metastore restart, Explore is in a fail loop due to an expired delegation token

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.3.3
    • Fix Version/s: None
    • Component/s: Explore, Master, Security

      Description

      When the Hive Metastore is restarted, it loses all delegation tokens that were previously acquired, and all clients must obtain new delegation tokens. However, CDAP's explore.service does not acquire a new token. Instead, we see an exception in the logs:

      2018-04-18 16:54:01,637 - WARN  [Heartbeater-2:o.a.h.h.m.HiveMetaStoreClient@492] - MetaStoreClient lost connection. Attempting to reconnect.
      org.apache.hadoop.hive.metastore.api.MetaException: Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: Peer indicated failure: DIGEST-MD5: IO error acquiring password
              at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:199)
              at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:277)
              at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
              at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
              at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
              at java.security.AccessController.doPrivileged(Native Method)
      ...

      Subsequently, no query ever produces results; all queries remain in RUNNING state forever. The Hive client periodically logs that it could not connect to the Metastore, sleeps for 5 seconds, and then retries.

      While that is a Hive issue (queries should actually fail), CDAP should have a way to recognize that its delegation token is no longer valid and request a new one.
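
      A minimal sketch of what such recognition could look like, under stated assumptions: the class `TokenAwareRetry`, the interface `TokenRefresher`, and both method names are hypothetical and are not CDAP or Hive APIs. The only detail taken from this report is the message fragment from the stack trace above. The same SASL message may have other causes, so matching on it is a heuristic, not a definitive test.

```java
import java.util.concurrent.Callable;

/** Hypothetical sketch; none of these names are CDAP or Hive APIs. */
final class TokenAwareRetry {

  /** Hypothetical hook that would obtain a fresh Metastore delegation token. */
  interface TokenRefresher {
    void refresh() throws Exception;
  }

  // Message fragment copied from the stack trace in this report.
  static final String SASL_FAILURE = "DIGEST-MD5: IO error acquiring password";

  /** Returns true if any exception in the cause chain carries the SASL failure message. */
  static boolean looksLikeInvalidToken(Throwable t) {
    int depth = 0;
    while (t != null && depth++ < 32) { // bound the walk in case of a cyclic cause chain
      String msg = t.getMessage();
      if (msg != null && msg.contains(SASL_FAILURE)) {
        return true;
      }
      t = t.getCause();
    }
    return false;
  }

  /** Runs the call; on a token-related failure, refreshes once and retries once. */
  static <T> T callWithRefresh(Callable<T> call, TokenRefresher refresher) throws Exception {
    try {
      return call.call();
    } catch (Exception e) {
      if (!looksLikeInvalidToken(e)) {
        throw e; // unrelated failure: do not mask it
      }
      refresher.refresh();
      return call.call();
    }
  }
}
```

      Retrying only once keeps a persistent failure visible to the caller instead of hiding it in another loop.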

      Currently this requires a restart of CDAP, specifically a full shutdown followed by a start, to make sure the master.services app is restarted with new tokens.

      If this cannot happen automatically, there should at least be a way to surface the problem (for example, show explore.service as not healthy in the UI) and a way to manually trigger renewal of the delegation token.
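
      One way to surface the problem could be a small health tracker fed by the Metastore connection attempts. The sketch below is an assumption for illustration only: `MetastoreHealth` and its threshold are hypothetical, and how CDAP would expose the result in the UI is not specified here.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch; not an existing CDAP class. */
final class MetastoreHealth {
  private final int threshold; // consecutive failures tolerated before reporting unhealthy
  private final AtomicInteger consecutiveFailures = new AtomicInteger();

  MetastoreHealth(int threshold) {
    this.threshold = threshold;
  }

  /** Call after a successful Metastore connection; clears the failure streak. */
  void recordSuccess() {
    consecutiveFailures.set(0);
  }

  /** Call after each failed reconnect attempt. */
  void recordFailure() {
    consecutiveFailures.incrementAndGet();
  }

  /** Healthy while the failure streak is below the threshold. */
  boolean isHealthy() {
    return consecutiveFailures.get() < threshold;
  }
}
```

      With the 5-second retry interval described above, a threshold of a few consecutive failures would flag the service as unhealthy well under a minute after the Metastore restart.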

              People

              • Assignee:
                bhooshan Bhooshan Mogal
                Reporter:
                andreas Andreas Neumann
              • Votes:
                0
                Watchers:
                5
