CDAP / CDAP-12975

rogue containers

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.3.2
    • Fix Version/s: 6.1.0
    • Component/s: CDAP, CDAP Services

      Description

      We have seen rogue containers on clusters where the YARN property `yarn.resourcemanager.recovery.enabled` is set to `true`. This property allows containers to continue running after YARN has stopped or crashed. If YARN does not detect such a container when it starts up again, the container keeps running but is no longer managed by YARN.
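To check whether a cluster is exposed, it may help to confirm the property value in its YARN configuration. The following is a minimal sketch, assuming the standard Hadoop `<configuration>/<property>/<name>/<value>` XML layout; the function name `recovery_enabled` is hypothetical, and the sketch treats a missing property as disabled, which is an assumption about the cluster default.

```python
import xml.etree.ElementTree as ET

RECOVERY_PROPERTY = "yarn.resourcemanager.recovery.enabled"


def recovery_enabled(yarn_site_xml: str) -> bool:
    """Return True if the recovery property is set to 'true' in the given XML text."""
    root = ET.fromstring(yarn_site_xml)
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip().lower()
        if name == RECOVERY_PROPERTY:
            return value == "true"
    # Assumption: a missing property is treated as disabled in this sketch.
    return False
```

Pass it the text of the cluster's YARN site configuration file. Configuration managers can override values elsewhere, so the result is a hint rather than proof.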

      This generally happens with long-running containers. The easiest way to find and remove them is:

      • Stop all running applications
      • Stop CDAP
      • Log in to each NodeManager host and run `ps auxww | grep cdap`
      • Kill any containers listed
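The `ps auxww | grep cdap` step above can be sketched in Python so that the leftover processes come back as PIDs. This is only an illustration: the function names are hypothetical, the parser assumes the usual 11-column `ps aux` layout with the command in the last column, and it only lists processes rather than killing them.

```python
import subprocess


def find_cdap_processes(ps_output: str, needle: str = "cdap"):
    """Return (pid, command) pairs for `ps auxww` lines that contain needle."""
    matches = []
    for line in ps_output.splitlines():
        # Like `grep cdap`, match case-sensitively against the whole line.
        if needle not in line:
            continue
        fields = line.split(None, 10)  # assumed columns: USER PID ... COMMAND
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header or unexpected layout
        matches.append((int(fields[1]), fields[10].strip()))
    return matches


def list_cdap_processes():
    """Run `ps auxww` on the local host and return the matching processes."""
    result = subprocess.run(
        ["ps", "auxww"], capture_output=True, text=True, check=True
    )
    return find_cdap_processes(result.stdout)
```

Run `list_cdap_processes()` on each NodeManager host after applications and CDAP have been stopped, and review the returned command lines before killing anything, since the match is a plain substring test.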

            People

            • Assignee: Poorna Chandra (poorna)
            • Reporter: Matt Wuenschel (mattwuenschel)