CDAP-16055

Spark pipeline lacks error message when there are output directory conflicts

    Details

    • Release Notes:
      Fixed a bug where the failure error message emitted by the Spark driver was not being collected.

      Description

      When I run multiple pipelines with the MapReduce engine, each writing to the same output directory, a subset of them may fail because the others are writing to that directory at the same time. Note that this pipeline writes to a subdirectory partitioned by minute, so more than one of the set can still succeed if they run/complete in different minutes.

      With MapReduce, I can see the expected error message in the logs (see failedMR.txt for the full pipeline logs):

      Caused by: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory gs://test-new-df-folder/tmp/2019-10-21-14-16 already exists
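
      For context, this exception comes from Hadoop's up-front output validation: FileOutputFormat.checkOutputSpecs() fails a job at submission time if the configured output directory already exists. Below is a minimal standalone sketch of that check (the class name is mine; the path is the one from the log line above):

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;
        import org.apache.hadoop.mapred.FileAlreadyExistsException;

        // Sketch of the check FileOutputFormat.checkOutputSpecs() performs:
        // reject the job when the output directory is already present.
        public class OutputDirCheck {
          public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Path taken from the failedMR.txt log line above; resolving a
            // gs:// filesystem needs the GCS connector on the classpath.
            Path outDir = new Path("gs://test-new-df-folder/tmp/2019-10-21-14-16");
            FileSystem fs = outDir.getFileSystem(conf);
            if (fs.exists(outDir)) {
              throw new FileAlreadyExistsException(
                  "Output directory " + outDir + " already exists");
            }
          }
        }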

      However, when I do the same with the Spark execution engine, I do not see such an error message, even though the pipelines still fail. See failedSpark[1-5].txt for the logs of such pipeline runs.

      I did encounter one run where the error message was appropriately logged. See failedSpark6.txt.
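
      For what it's worth, Spark performs the same output validation on the driver by default (controlled by spark.hadoop.validateOutputSpecs), so the FileAlreadyExistsException is raised driver-side; if the driver's failure output is not collected into the pipeline logs, the root cause is silently lost. Below is a bare-bones, hypothetical repro sketch (the class name, app name, and explicit logging are mine, not CDAP internals):

        import java.util.Arrays;

        import org.apache.spark.SparkConf;
        import org.apache.spark.api.java.JavaSparkContext;
        import org.slf4j.Logger;
        import org.slf4j.LoggerFactory;

        public class DriverErrorLoggingRepro {
          private static final Logger LOG =
              LoggerFactory.getLogger(DriverErrorLoggingRepro.class);

          public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("output-conflict-repro");
            try (JavaSparkContext jsc = new JavaSparkContext(conf)) {
              String out = "gs://test-new-df-folder/tmp/2019-10-21-14-16";
              try {
                // With validateOutputSpecs enabled (the default), this fails on
                // the driver with FileAlreadyExistsException when the directory
                // already exists, before any tasks run.
                jsc.parallelize(Arrays.asList("a", "b")).saveAsTextFile(out);
              } catch (Exception e) {
                // Without driver-side logging like this, the cause only shows
                // up if the driver's output is collected into the pipeline logs.
                LOG.error("Spark job failed writing to {}", out, e);
                throw e;
              }
            }
          }
        }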

      I have attached the pipeline as q-cdap-data-pipeline.json.

        Attachments

        1. failedMR.txt
          62 kB
        2. failedSpark1.txt
          109 kB
        3. failedSpark2.txt
          93 kB
        4. failedSpark3.txt
          89 kB
        5. failedSpark5.txt
          94 kB
        6. failedSpark6.txt
          96 kB
        7. q-cdap-data-pipeline.json
          7 kB


            People

            • Assignee:
              terence (Terence Yim)
            • Reporter:
              ali.anwar (Ali Anwar)
            • Votes:
              0
            • Watchers:
              3
