  1. CDAP
  2. CDAP-13158

Concurrent launch of Spark programs from a Workflow fork may fail intermittently.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.3.3, 4.1.3
    • Fix Version/s: 4.3.4
    • Component/s: CDAP, Spark, Workflow
    • Labels:
    • Release Notes:
      Fixed a race condition that could cause a workflow to fail when multiple Spark programs run concurrently at a Workflow fork.

      Description

      When Spark programs are launched as part of a Workflow fork, multiple threads may concurrently attempt to create the zip archive that is submitted to YARN, which can corrupt the file. YARN then fails to unzip the corrupted archive during resource localization, with the following error:

      2018-03-07 06:30:39,662 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: { hdfs://user/bigdata-app-svc/.sparkStaging/application_1512706358976_369592/__spark_conf__.zip, 1520425835427, ARCHIVE, null } failed: error in opening zip file
      java.util.zip.ZipException: error in opening zip file
              at java.util.zip.ZipFile.open(Native Method)
              at java.util.zip.ZipFile.<init>(ZipFile.java:219)
              at java.util.zip.ZipFile.<init>(ZipFile.java:149)
              at java.util.zip.ZipFile.<init>(ZipFile.java:163)
              at org.apache.hadoop.fs.FileUtil.unZip(FileUtil.java:592)
              at org.apache.hadoop.yarn.util.FSDownload.unpack(FSDownload.java:277)
              at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:362)
              at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
              at java.util.concurrent.FutureTask.run(FutureTask.java:266)
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
              at java.util.concurrent.FutureTask.run(FutureTask.java:266)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
              at java.lang.Thread.run(Thread.java:745)
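
One common way to avoid this kind of corruption is for each writer to build the archive in its own temporary file and then rename it into place, so no two threads ever write to the same file. The sketch below illustrates that idea only; it is not the fix that was applied in CDAP, which this ticket does not describe. The class `SafeArchiveWriter` and the method `writeArchive` are hypothetical names, and the code assumes a local file system where a rename onto an existing file is atomic (as on POSIX systems).

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper for illustration; not part of CDAP or Spark. */
public final class SafeArchiveWriter {

  private SafeArchiveWriter() { }

  /**
   * Writes the given entries (name -> text content) into a zip file at
   * {@code target}. Each call writes to its own temporary file first, so
   * concurrent callers never share an output stream.
   */
  public static Path writeArchive(Path target, Map<String, String> entries)
      throws IOException {
    Path dir = target.toAbsolutePath().getParent();
    Files.createDirectories(dir);

    // Unique temporary file per call, in the same directory as the target
    // so that the final move stays on one file system.
    Path tmp = Files.createTempFile(dir, target.getFileName().toString(), ".tmp");
    boolean moved = false;
    try {
      try (OutputStream os = Files.newOutputStream(tmp);
           ZipOutputStream zip = new ZipOutputStream(os)) {
        for (Map.Entry<String, String> e : entries.entrySet()) {
          zip.putNextEntry(new ZipEntry(e.getKey()));
          zip.write(e.getValue().getBytes(StandardCharsets.UTF_8));
          zip.closeEntry();
        }
      }
      // Assumption: the platform replaces an existing target atomically.
      // With ATOMIC_MOVE, that behaviour is implementation specific.
      Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
      moved = true;
    } finally {
      if (!moved) {
        // Clean up the partial temporary file if anything failed.
        Files.deleteIfExists(tmp);
      }
    }
    return target;
  }
}
```

With this approach, a reader of `target` sees either no file or a complete archive, never a partially written one. Serializing archive creation behind a lock would be an alternative when all writers run in the same JVM.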
      


            People

            • Assignee:
              Terence Yim (terence)
            • Reporter:
              Sagar Kapare (sagar)