  1. CDAP
  2. CDAP-8841

Twill jars not getting cleaned up on program failure

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 6.2.0
    • Fix Version/s: None
    • Component/s: App Fabric

      Description

      We have seen situations where a program run fails because YARN kills its container. In the particular instance I looked at, the Workflow driver container was killed after 10 minutes, but the YARN logs had already disappeared by the time I got to the cluster, so I could not determine the root cause of the failure.

      In any case, when this happens, the Twill jars are not cleaned up correctly, and the leftover jars accumulate under /cdap/twill. This can quickly fill up HDFS if many scheduled workflows fail in this way over a period of time.
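
To gauge how much has piled up, one option is to look at the modification times in a listing of /cdap/twill. The following is a minimal sketch, not CDAP's or Twill's own cleanup logic: it assumes the listing text comes from `hdfs dfs -ls /cdap/twill`, and the function name `stale_entries` and the age threshold are hypothetical. It only reports candidates; modification time is a heuristic, and entries belonging to running programs must not be deleted.

```python
from datetime import datetime, timedelta


def stale_entries(ls_output, now, max_age):
    """Return paths from `hdfs dfs -ls` output older than max_age.

    Hypothetical helper: it only parses text and never touches HDFS.
    """
    stale = []
    for line in ls_output.splitlines():
        # Entry lines have 8 columns:
        # permissions, replication, owner, group, size, date, time, path
        parts = line.split(None, 7)
        if len(parts) != 8:
            continue  # skips the "Found N items" header and blank lines
        try:
            modified = datetime.strptime(parts[5] + " " + parts[6], "%Y-%m-%d %H:%M")
        except ValueError:
            continue  # not a recognizable entry line
        if now - modified > max_age:
            stale.append(parts[7])
    return stale
```

Calling `stale_entries(listing, datetime.now(), timedelta(days=7))` on a captured listing would return the paths last modified more than a week ago, which can then be checked against the set of running programs before any manual cleanup.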

              People

              • Assignee: Terence Yim (terence)
              • Reporter: Albert Shau (ashau)